INTRODUCTION

In its early days, crack detection was performed manually. Cracks vary in shape, cover large areas, extend to different lengths, and have uneven widths, so manual detection is time-consuming and laborious and cannot guarantee either speed or precision. In recent years, with the rapid development of computer technology and machine learning (ML), these techniques have been applied in many fields, including road crack detection, where they have produced many automatic detection algorithms. Traditional automatic detection methods include the Canny algorithm based on threshold segmentation¹ and the Otsu method.² However, because road surface characteristics and the pavement environment are complex, and because the traditional Canny and Otsu algorithms have limited generality and robustness, the precision of their detection results is not very high. Later came the path search algorithm based on minimum cost,³ the detection algorithm based on the support vector machine (SVM),⁴ the CrackTree detection algorithm,⁵ and so forth. These algorithms partly solve the problem of low precision, but detection remains time-consuming. Moreover, the SVM-based detection algorithm and the CrackTree algorithm are complicated to design, which prevents them from being put into practical use. In response to these problems, and drawing on recently popular machine learning technology,⁶ automatic crack identification and detection algorithms based on deep learning and neural networks have been developed. Crack detection results can also be used for pavement condition monitoring and for determining road maintenance strategies. Pavement crack detection therefore strongly affects the quality of automated road monitoring and maintenance, and it is necessary to further improve its precision and speed.

At present, research on pavement crack recognition can be roughly divided into two categories. The first is based on digital image processing and relies mainly on hand-crafted features: features such as frequency, edges, HOG, gray level, texture, and entropy are used to design recognition conditions that constrain and complete the recognition. The second is based on deep learning: a Convolutional Neural Network (CNN) is built and learns features from the data automatically, adjusting itself according to certain rules until its output for the input data equals, or is arbitrarily close to, the labels. In this paper, a deep learning convolutional network is established to realize automatic recognition and detection of pavement cracks.

To identify pavement cracks more accurately and improve pavement detection, literature⁷ proposes an attention-based crack detection network (ACNet), which locates cracks more accurately in subjective visual assessment, recovers richer details, and significantly improves the experimental indexes F1 and coincidence rate. However, there is still a gap between the detection speed of that model and actual detection requirements. Literature⁸ finds that cracks with narrow widths are easily interrupted before detection is complete, and that crack edge information may be lost during filtering or pooling. Based on these two findings, the authors proposed a SegNet-based framework, designed the encoding layer as a continuous attention mechanism, and added a convolutional pyramid structure before the decoding layer, so as to reduce breaks in detected cracks and obtain more complete crack edge information. The method was evaluated by Precision, Recall, and the comprehensive evaluation index (F1-measure, α = 1), and all three improved compared with related methods. However, unlike the method in literature,⁹ it does not replace two pooling layers with dilated convolution layers, so its experimental results are not as good as those obtained there. Literature⁹ also introduced the Lovász hinge loss, which made the final crack detection result relatively good.

In addition to the attention-based algorithms above, there are region proposal methods for crack detection, mainly R-CNN,¹⁰ Fast R-CNN,¹¹ Faster R-CNN,¹² and RepPoints.¹³ These methods use texture, color, edge, and other information to determine the potential locations of objects in advance, and then use a CNN to extract features from these locations and classify them. Their advantage is high detection precision, but detection speed drops noticeably, mainly because detection is completed in two stages.

With the continuous development of neural networks, researchers have used regression methods to simplify the detection process and thereby greatly increase detection speed. The regression method reduces two stages to one by using a single end-to-end CNN to directly predict the positions and categories of object bounding boxes at multiple locations. Regression methods and region proposal methods are therefore also called one-stage and two-stage methods, respectively. One-stage methods include You Only Look Once (YOLO),^14–16 Single Shot MultiBox Detector (SSD),^17,18 RetinaNet,¹⁹ and CenterNet.¹⁷ YOLOv3 is one of the most popular one-stage detection methods and achieves a good balance between detection speed and detection precision among one-stage models. It has been used successfully in agriculture, geology, remote sensing, and medicine, and is widely used in transportation, for example to detect traffic signs, traffic flow, potholes, and visible cracks. Recently, the YOLO series has been updated with two new versions, YOLOv4²⁰ and YOLOv5, both of which integrate state-of-the-art methods to become more efficient and better suited to object detection.

The author of Reference 21 improved the prediction precision for small cracks by improving the multi-scale prediction model of the YOLOv3 network. However, because crack distribution is complex, the precision of this kind of crack detection still needs further study. In literature,²² tens of thousands of pavement crack images were collected with industrial cameras, and YOLOv3 was used to detect pavement cracks. The larger the dataset, the higher the detection precision was found to be, so a deeper convolutional network was needed to reduce the parameters of the detection data and improve precision. Literature¹⁹ combines the YOLOv3 algorithm with MobileNets and the convolutional block attention module (CBAM). The improved YOLOv3 not only reduces the complexity and parameter count of the detection network but also improves prediction precision. However, its hardware requirements are high, so the network needs further improvement. In literature,²³ YOLOv4 was applied to the detection of bridge surface cracks, and lightweight networks were used to replace the feature extraction network in YOLOv4, reducing the number of parameters and of backbone layers. The experimental results showed that the precision, recall rate, and F1 value of this method reach 93.96%, 90.12%, and 92%, respectively, higher than those of other advanced methods. The model occupies only 23.4 MB of memory and achieves 140.2 FPS, so it can be deployed on mobile hardware platforms with limited computing power to meet the requirements of practical applications.

This paper discusses and analyzes YOLOv5 and applies it to pavement crack detection. The paper is organized as follows: Section 1 introduces related work such as dataset processing; Section 2 introduces the basic principles of the YOLOv5 algorithm and of the attention mechanism that is added to it; Section 3 demonstrates the crack detection model; and the final section concludes and discusses the article.

MATERIALS AND METHODS

Data set processing

The purpose of this study is to use the latest target detection algorithms to identify cracks automatically, so a large number of crack images is needed. In this paper, the crack pictures in literature^24,25 are assembled into a new dataset of 3022 samples with a pixel density of 300 dpi. The dataset is randomly divided into a training set, a test set, and a validation set in the ratio 7:2:1, giving 2115 training images, 605 test images, and 302 validation images, respectively. The images in the dataset vary in size, but because the input size of a YOLO network is generally set to a multiple of 32, the program automatically resizes each crack image to 640 × 640 when the network is built. Before the crack pictures are input, all photos in the dataset are labeled with the annotation software LabelImg. The categories are HCrack (transverse cracking), 30°Crack, 45°Crack, Lcrack (longitudinal cracking), and HoneyCrack (honeycomb cracking), and the labels are saved in TXT format.
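The random 7:2:1 division described above can be sketched as follows. This is a minimal illustration, not the authors' script: the function name `split_dataset`, the fixed seed, and the placeholder file names are assumptions, and the remainder left by integer division is assigned to the test set so that the counts match those reported in the text.

```python
import random


def split_dataset(items, ratios=(7, 2, 1), seed=0):
    """Shuffle items and split them into train/test/val by integer ratios."""
    items = list(items)
    random.Random(seed).shuffle(items)  # fixed seed only for reproducibility
    total = sum(ratios)
    n_train = len(items) * ratios[0] // total
    n_val = len(items) * ratios[2] // total
    n_test = len(items) - n_train - n_val  # remainder goes to the test set
    train = items[:n_train]
    test = items[n_train:n_train + n_test]
    val = items[n_train + n_test:]
    return train, test, val


# Hypothetical file names standing in for the 3022 crack images.
names = [f"crack_{i:04d}.jpg" for i in range(3022)]
train, test, val = split_dataset(names)
print(len(train), len(test), len(val))  # 2115 605 302
```

With 3022 samples, integer division gives 2115 training and 302 validation images, leaving 605 for testing.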

For annotation, the folder that stores the dataset is created first. Under the folder datasets there are two subfolders, images and labels, each of which contains three folders: train, test, and val. The images folders store the crack pictures to be used, and the labels folders store the corresponding label files and index files. Then, in the Pytorch-GPU environment, the LabelImg annotation software is opened and a rectangular box is drawn around the crack in each photo.
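The folder layout and the TXT label format can be sketched as below. The sketch assumes the usual YOLO TXT convention written by LabelImg (class index followed by the box center, width, and height, all normalized by the image size). The order of the class indices and the helper names `make_layout` and `yolo_label_line` are assumptions for illustration.

```python
import tempfile
from pathlib import Path

# Class order is an assumption; the paper lists the names but not the indices.
CLASSES = ["HCrack", "30°Crack", "45°Crack", "Lcrack", "HoneyCrack"]


def make_layout(root):
    """Create datasets/{images,labels}/{train,test,val} under root."""
    created = []
    for kind in ("images", "labels"):
        for split in ("train", "test", "val"):
            folder = Path(root) / "datasets" / kind / split
            folder.mkdir(parents=True, exist_ok=True)
            created.append(folder)
    return created


def yolo_label_line(cls_name, box, img_w, img_h):
    """Convert a pixel box (x_min, y_min, x_max, y_max) to one YOLO TXT line."""
    x_min, y_min, x_max, y_max = box
    xc = (x_min + x_max) / 2 / img_w   # normalized box center, x
    yc = (y_min + y_max) / 2 / img_h   # normalized box center, y
    w = (x_max - x_min) / img_w        # normalized box width
    h = (y_max - y_min) / img_h        # normalized box height
    return f"{CLASSES.index(cls_name)} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}"


with tempfile.TemporaryDirectory() as tmp:
    folders = make_layout(tmp)
    print(len(folders))  # 6

line = yolo_label_line("Lcrack", (160, 0, 480, 640), 640, 640)
print(line)  # 3 0.500000 0.500000 0.500000 1.000000
```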

Equipment and software

The deep learning road crack custom recognition model based on the YOLOv5 algorithm was built on a machine with an Intel® Xeon® Platinum 8260 CPU, 128 GB of memory, and an NVIDIA GeForce RTX 2080 Ti graphics card (GPU).

The language used is Python, and the environment is configured as Pytorch-GPU. To configure the Python environment, Anaconda is installed and used to manage the different environments. Next, cuDNN and CUDA are installed. Because the environment used in this article is Pytorch-GPU and the graphics card is from the 20 series, CUDA 10.2 is installed. A cmd window is then opened via Win+R and the Pytorch-GPU environment is activated in it; the corresponding Python version is Python 3.8. Finally, Anaconda is opened, the environment is switched, and the VSCode editor is installed from within it.

YOLOv5 network algorithm

The YOLOv5 algorithm has four components: the input, the backbone, the feature pyramid network (FPN), and the YOLO Head. The backbone is the main feature extraction network of YOLOv5; it uses CSP-Darknet53 with an added residual module, as shown in Figure 1. The crack image is input into the YOLOv5 network, and CSP-Darknet53 extracts features from it to obtain the feature layers of the input image.^26–30 The three effective feature layers obtained from the backbone are then fused in the FPN, the enhanced feature extraction network, in order to combine feature information at different scales. Within the FPN, feature extraction continues, and PANet performs feature fusion by upsampling and then by downsampling.^15,31–33 The YOLO Head is the classifier and regressor of YOLOv5, as shown in Figure 2.
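To make the three feature scales concrete, the short sketch below computes the grid size of each effective feature layer for a square input. The strides 8, 16, and 32 are the values commonly used by YOLOv5 and are an assumption here, since the text does not state them. The function name `feature_grid_sizes` is hypothetical.

```python
def feature_grid_sizes(img_size=640, strides=(8, 16, 32)):
    """Return the (rows, cols) grid of each feature layer for a square input.

    The strides are an assumption (typical YOLOv5 values), not taken from
    the paper. The input side must be divisible by the largest stride,
    which is why input sizes are chosen as multiples of 32.
    """
    if img_size % max(strides) != 0:
        raise ValueError(f"img_size must be a multiple of {max(strides)}")
    return [(img_size // s, img_size // s) for s in strides]


print(feature_grid_sizes(640))  # [(80, 80), (40, 40), (20, 20)]
```

The finest grid carries the most spatial detail, which matters for thin cracks, while the coarsest grid covers the largest context.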

[IMAGE OMITTED. SEE PDF]

Compared with the widely used YOLOv3, YOLOv5 is improved in the following areas. The first is data augmentation, in particular Mosaic data augmentation, which stitches four images together into one training image. According to the literature, its main advantage is that it enriches the background of the detected objects, and when Batch Normalization is computed, the data of four pictures are processed at once. The second is multi-positive sample matching. In previous YOLO versions, each ground-truth box corresponds to one positive sample during training, that is, each ground-truth box is predicted by only one prior box. To improve training efficiency, YOLOv5 increases the number of positive samples, so that each ground-truth box can be predicted by multiple prior boxes during training.
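The effect of Mosaic stitching on the labels can be illustrated with a deliberately simplified sketch. It assumes four equally sized tiles placed in a fixed 2 × 2 grid and only shifts the box coordinates. The Mosaic used in YOLOv5 also randomizes the stitching center, rescales and crops the tiles, so this shows the idea and is not that implementation. The name `mosaic_boxes` is hypothetical.

```python
def mosaic_boxes(boxes_per_image, tile=320):
    """Shift the boxes of four tile x tile images onto a 2 x 2 mosaic canvas.

    boxes_per_image: list of four lists of (cls, x1, y1, x2, y2) in pixels,
    ordered top-left, top-right, bottom-left, bottom-right.
    Returns the boxes in the coordinates of the (2*tile) x (2*tile) canvas.
    """
    if len(boxes_per_image) != 4:
        raise ValueError("Mosaic stitching expects exactly four images")
    offsets = [(0, 0), (tile, 0), (0, tile), (tile, tile)]
    merged = []
    for (dx, dy), boxes in zip(offsets, boxes_per_image):
        for cls, x1, y1, x2, y2 in boxes:
            merged.append((cls, x1 + dx, y1 + dy, x2 + dx, y2 + dy))
    return merged


# Hypothetical labels: one box in three of the four tiles.
boxes = [
    [(0, 10, 10, 50, 50)],
    [(1, 10, 10, 50, 50)],
    [],
    [(4, 0, 0, 320, 320)],
]
merged = mosaic_boxes(boxes)
print(merged)
```

After stitching, a single 640 × 640 training image carries the objects and backgrounds of four source images.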

Attention mechanism

The attention mechanism³⁴ in machine vision is modeled on human vision and mainly includes soft attention and hard attention. This work selects SENet and CBAM, two soft attention mechanisms that act on channels and on space. SENet implements channel attention through two operations, Squeeze and Excitation. Squeeze applies global average pooling to each feature map, reducing it to a single real value. The Excitation operation then outputs a 1 × 1 × C map of channel weights, which enhances important features and suppresses unimportant ones. The Convolutional Block Attention Module (CBAM) learns the importance of each feature channel in a way similar to SENet, and it also learns the importance of each spatial location by a similar method. The channel attention and spatial attention principles of CBAM are shown in Figures 3 and 4.^35–37
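For reference, the two mechanisms can be written compactly as follows. The notation is not taken from this paper; it follows the standard formulations of SENet and CBAM. Here $x_c$ is channel $c$ of an $H \times W \times C$ feature map, $\delta$ is ReLU, $\sigma$ is the sigmoid function, $W_1$ and $W_2$ are the weights of two fully connected layers, MLP is a shared two-layer perceptron, and $f^{7\times 7}$ is a convolution with a 7 × 7 kernel.

```latex
% SENet: Squeeze (global average pooling), Excitation, and channel rescaling
z_c = \frac{1}{H W} \sum_{i=1}^{H} \sum_{j=1}^{W} x_c(i, j), \qquad
s = \sigma\bigl(W_2\, \delta(W_1 z)\bigr), \qquad
\tilde{x}_c = s_c \, x_c

% CBAM: channel attention M_c followed by spatial attention M_s
M_c(F) = \sigma\bigl(\mathrm{MLP}(\mathrm{AvgPool}(F)) + \mathrm{MLP}(\mathrm{MaxPool}(F))\bigr), \qquad
F' = M_c(F) \otimes F

M_s(F') = \sigma\bigl(f^{7 \times 7}([\mathrm{AvgPool}(F');\, \mathrm{MaxPool}(F')])\bigr), \qquad
F'' = M_s(F') \otimes F'
```

The vector $s$ is the 1 × 1 × C set of channel weights produced by Excitation, and $\otimes$ denotes element-wise multiplication with broadcasting.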

[IMAGE OMITTED. SEE PDF]

Improve the backbone feature extraction network

CSPLayer-Darknet53 is the backbone feature extraction network of YOLOv5s, and the CSPLayer module it uses is denoted C3. C3 contains three Conv_BN_SiLU modules. The picture is first input into the backbone network and, after the Focus and convolution layers, enters the C3 module. There it passes through two branches, each applying convolution, batch normalization, and the activation function; one of the branches also passes through the bottleneck module. The extracted features are then stacked, and the stacked result goes through convolution, batch normalization, and the activation function once more. Given this processing flow, an attention mechanism can be added in the bottleneck module. Adding the CBAM attention mechanism to the C3 layer in the YOLOv5s backbone lets the network extract crack features more effectively along the channel dimension and also learn to extract them in the spatial dimension. The resulting layer, denoted C3CABM, learns the characteristics of cracks better. In addition, a SENet attention module is introduced at the ninth layer of the backbone network, so that the features extracted by the entire backbone are reweighted by the SENet module, which can improve the recall rate. Figure 5 shows the original C3 module and the improved C3CABM module, and Figure 6 shows the improved YOLOv5s-attention backbone structure.

[IMAGE OMITTED. SEE PDF]

Pavement crack detection based on improved YOLOv5s

In operation, the network first inputs a 640 × 640 crack image into the C3CABMDarknet backbone, then passes the features extracted by the backbone to the Neck for feature fusion and further extraction of crack features, and finally passes them to the YOLO Head for prediction. The basic hyperparameters of the model are epochs = 50, batch_size = 32, and learning rate = 0.1, with the SGD optimizer. Figure 7 shows the flowchart of the improved YOLOv5s-attention network.

[IMAGE OMITTED. SEE PDF]

TRAINING RESULTS AND EVALUATION

Model training

First, the processed crack dataset, with labels in TXT format, is placed in the root directory. Running train.py starts network training on the processed crack pictures. The weights obtained from training are automatically saved in the run folder. YOLOv3, the original YOLOv5s, YOLOv5s-attention, and the other improved YOLOv5s models are each trained for 300 epochs without YOLOv5s pretrained weights, and the corresponding metrics are then obtained.

Comprehensive evaluation index (F1)

The comprehensive evaluation index F1 is the harmonic mean of Precision and Recall. Precision is the ratio of the number of correctly detected targets to the number of all detections, and measures how precise the model is. Recall is the ratio of the number of correctly detected targets to the number of all targets of that type in the dataset, and measures how completely the model finds them. The formulas for Precision, Recall, and F1 are listed below;³⁸ in general, a higher F1 indicates a more effective model.

$\mathrm{Precision}=\frac{\mathrm{True}\ \mathrm{Positive}}{\mathrm{True}\ \mathrm{Positive}+\mathrm{False}\ \mathrm{Positive}}=\frac{\mathrm{TP}}{\mathrm{TP}+\mathrm{FP}},$ (1)

$\mathrm{Recall}=\frac{\mathrm{True}\ \mathrm{Positive}}{\mathrm{True}\ \mathrm{Positive}+\mathrm{False}\ \mathrm{Negative}}=\frac{\mathrm{TP}}{\mathrm{TP}+\mathrm{FN}}.$ (2)

When α = 1, F is F1, that is,

$F_1=\frac{\left(1+\alpha^2\right)PR}{\alpha^2P+R}=\frac{2PR}{P+R}.$ (3)
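The three metrics can be computed as in the sketch below. The weighted form $(1+\alpha^2)PR/(\alpha^2 P+R)$ is the standard F-measure and is assumed here as the general case; with α = 1 it reduces to the F1 used in this paper. The function names and the example counts are illustrative only.

```python
def f_measure(p, r, alpha=1.0):
    """Standard F-measure; alpha = 1 gives F1, the harmonic mean of p and r."""
    denom = alpha ** 2 * p + r
    return (1 + alpha ** 2) * p * r / denom if denom else 0.0


def precision_recall_f1(tp, fp, fn):
    """Compute Precision, Recall, and F1 from detection counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, f_measure(precision, recall)


# Hypothetical counts: 8 correct detections, 2 false alarms, 2 missed cracks.
p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=2)
print(round(p, 3), round(r, 3), round(f1, 3))  # 0.8 0.8 0.8

# Precision and Recall given in percent, as in a results table.
print(round(f_measure(55.3, 59.0), 1))  # 57.1
```

Because F1 is a harmonic mean, it stays close to the smaller of Precision and Recall, so a model cannot score well by excelling at only one of them.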

IoU (Intersection over Union)

IoU is the ratio of the intersection of the ground-truth bounding box and the predicted bounding box to their union, and is a standard measure of localization precision in target detection. A threshold is then set; in this paper the IoU threshold is 0.65, and the model automatically compares each computed IoU with this threshold.
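A minimal sketch of the IoU computation and the threshold comparison is given below. It assumes axis-aligned boxes in the form (x1, y1, x2, y2). The helper names `iou` and `is_match` and the example boxes are illustrative, while the default threshold of 0.65 is the value stated above.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap; zero when the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_match(pred, gt, threshold=0.65):
    """A prediction counts as correct when its IoU reaches the threshold."""
    return iou(pred, gt) >= threshold


gt = (0, 0, 100, 100)
print(round(iou((10, 0, 110, 100), gt), 3), is_match((10, 0, 110, 100), gt))
# 0.818 True
print(round(iou((50, 0, 150, 100), gt), 3), is_match((50, 0, 150, 100), gt))
# 0.333 False
```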

Mean Average Precision (mAP)

The mean Average Precision (mAP) is the average of the Average Precision (AP) over all target categories. AP measures the detection quality for a single category, here the identification precision for one type of crack, whereas mAP measures the detection quality over multiple categories, that is, over all crack types.

$\mathrm{AP}=\sum \limits_{i=0}^{n-1}\left({r}_{i+1}-{r}_i\right){P}_{\mathrm{interp}}\left({r}_{i+1}\right),$ (4)

where $r_1, r_2, \ldots, r_n$ are the Recall values, in ascending order, at the first point of each Precision interpolation segment, and $P_{\mathrm{interp}}(r_{i+1})$ is the interpolated Precision at Recall $r_{i+1}$.

$\mathrm{mAP}=\frac{\sum \mathrm{AP}}{N\left(\mathrm{classes}\right)}.$ (5)
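The AP and mAP formulas can be sketched in code as follows. The sketch assumes the common interpolation rule, in which the precision at a recall level is replaced by the maximum precision at any higher recall, and it takes $r_0 = 0$ as the starting recall. Both are assumptions, as are the function names and the example values.

```python
def average_precision(recalls, precisions):
    """AP as the sum of (r_{i+1} - r_i) * interpolated precision at r_{i+1}.

    recalls must be in ascending order, with precisions[i] measured at
    recalls[i]. The starting recall r_0 = 0 is an assumption.
    """
    if len(recalls) != len(precisions):
        raise ValueError("recalls and precisions must have the same length")
    # Interpolation: precision at r becomes the max precision at recall >= r.
    interp = list(precisions)
    for i in range(len(interp) - 2, -1, -1):
        interp[i] = max(interp[i], interp[i + 1])
    ap, prev_recall = 0.0, 0.0
    for recall, precision in zip(recalls, interp):
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


def mean_average_precision(ap_per_class):
    """mAP: the mean of the per-class AP values."""
    return sum(ap_per_class.values()) / len(ap_per_class)


# Hypothetical precision-recall points for one crack class.
ap = average_precision([0.5, 1.0], [1.0, 0.5])
print(ap)  # 0.75

# Hypothetical per-class AP values.
print(mean_average_precision({"HCrack": 0.75, "Lcrack": 0.25}))  # 0.5
```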

Evaluation of training results

Dino, Faster-RCNN, YOLOv3, YOLOv5s, and the improved YOLO models were trained on the crack dataset, and their Precision, Recall, and comprehensive evaluation index (F1-measure, with α = 1) were then compared; see Table 1.

TABLE 1 Comparison of YOLO network results.

Model	Precision (%)	Recall (%)	F1 (%)	mAP (%)
Dino	50.9	52.8	54.6	45.8
Faster-RCNN	53.4	52.4	55.9	54.8
YOLOv3	43.0	50.7	46.5	30.7
YOLOv5s	55.3	59.0	57.1	53.7
YOLOv5s-SE(1)	56.9	57.1	57.0	56.9
YOLOv5s-SE(4)	57.3	54.7	56.0	54.1
YOLOv5s-C3CABM-SimSPPF	58.0	57.0	57.5	53.9
YOLOv5s-Attention (Ours)	56.4	59.8	58.1	55.7

As can be seen from Table 1, YOLOv5s has a Precision of 55.3% and a Recall of 59.0%, YOLOv5s-C3CABM-SimSPPF has a Precision of 58.0% and a Recall of 57.0%, and YOLOv5s-Attention has a Precision of 56.4% and a Recall of 59.8%. The improved YOLOv5s-C3CABM-SimSPPF model thus has relatively high Precision, but its Recall is lower than that of the YOLOv5s model, that is, it recovers fewer of the cracks present, so its F1 value is lower than that of the YOLO model. The SimSPPF module in the YOLOv5s-C3CABM-SimSPPF model was introduced in YOLOv6;³³ it neither increases the number of parameters nor reduces the operating speed. However, after training, the recall rate is not high enough, which increases the error detection rate. The SimSPPF module is therefore removed, and the SENet attention module, which performs self-learning over feature channels, is added before the SPPF layer, giving the YOLOv5s-Attention model. Compared with the original YOLOv5s model, the Precision of this model improves by 1.0% and the recall rate by 0.7%.

mAP (IoU = 0.5) was used to measure the overall detection performance on the five types of cracks when evaluating the YOLOv5 series models. As Table 1 shows, the YOLOv5s-SE model, which adds the feature-channel self-learning attention mechanism SENet at the ninth layer of the YOLOv5s backbone feature extraction network, reaches an mAP of 56.9%, but its recall rate is low, resulting in a high error detection rate for cracks. The YOLOv5s-Attention model, by contrast, performs better than YOLOv5s in all aspects, with an mAP increase of 1.8%. Figures 8–10 show the F1 of each model.

[IMAGE OMITTED. SEE PDF]

Figures 11–13 show the crack identification results of YOLOv5s, YOLOv5s-C3CABM-SimSPPF, and YOLOv5s-Attention. The results show that the YOLOv5s-Attention model identifies 30° cracks better and learns the difference between 45° cracks and 30° cracks better, while its ability to identify transverse cracks, longitudinal cracks, and honeycomb cracks is not inferior to that of YOLOv5s. This is because the attention mechanisms CBAM and SENet are added to the original YOLOv5s backbone extraction network, which adds self-learning capabilities over space and feature channels.

[IMAGE OMITTED. SEE PDF]

CONCLUSIONS

This paper proposes using neural network algorithms to automatically identify pavement cracks. First, the results of the YOLO series networks and the lightweight network SSD for object detection on the public VOC dataset are discussed, and YOLOv3 to YOLOv5 are found to have higher precision in object detection. Second, the speed and precision of YOLOv3, YOLOv5s, and YOLOv5s-Attention in crack recognition are compared, and the networks are evaluated using Precision, Recall, and F1. In this training comparison, the YOLOv5s-Attention network stands out among the target detection networks. It adds the attention module CBAM to the C3 layer of the backbone, and the attention module SENet between the SPPF layer and the C3 layer. Training shows that adding the attention mechanism has little effect on network speed: the average training time is 3 h, and the F1 value increases by 0.9%.

Compared with traditional manual detection, using the YOLOv5 algorithm to detect pavement cracks reduces detection cost, accelerates detection, and improves detection precision. With the development of science and technology, automatic crack detection is becoming the trend, and because pavement cracks are especially serious in many old urban areas and on national roads, strengthening automatic detection makes it possible to deal with cracks efficiently. The next step is to further improve the YOLOv5 network so that the improved model is lighter and faster and can be installed on small mobile devices to detect cracks in real time, allowing cracks to be repaired as they are found rather than left to persist.

AUTHOR CONTRIBUTIONS

Yang Ding: Funding acquisition; writing – review and editing; supervision; project administration; methodology. Min-Li Lan: Writing – original draft; formal analysis. Dan Yang: Methodology; writing – original draft; data curation. Shuang-Xi Zhou: Writing – review and editing; investigation; conceptualization.

FUNDING INFORMATION

The work described in this paper was jointly supported by the Training plan for academic and technical leaders of major disciplines in Jiangxi Province (grant no. 20213BCJL22039), Natural Science Foundation of China (grant no. 52163034), Jiangxi Province Natural Science Youth Fund (grant no. 20202BABL214043), and Engineering science and technology project of Jiangxi Provincial Department of Transportation (2021C0008, 2022H0014).

CONFLICT OF INTEREST STATEMENT

The authors declare no conflicts of interest.

DATA AVAILABILITY STATEMENT

Data is contained within the article.

References

1. Malek K, Mohammadkhorasani A, Moreu F. Methodology to integrate augmented reality and pattern recognition for crack detection. Comput Aided Civ Inf Eng. 2023;38(8):1000‐1019.

2. Pham MV, Ha YS, Kim YT. Automatic detection and measurement of ground crack propagation using deep learning networks and an image processing technique. Measurement. 2023;215:112832.

3. de León G, Fiorentini N, Leandri P, Losa M. A new region‐based minimal path selection algorithm for crack detection and ground truth labeling exploiting Gabor filters. Remote Sens (Basel). 2023;15(11):2722.

4. Sarkar K, Shiuly A, Dhal KG. Revolutionizing concrete analysis: an in‐depth survey of AI‐powered insights with image‐centric approaches on comprehensive quality control, advanced crack detection and concrete property exploration. Construct Build Mater. 2024;411:134212.

5. Ai D, Jiang G, Lam SK, He P, Li C. Computer vision framework for crack detection of civil infrastructure: a review. Eng Appl Artif Intel. 2023;117:105478.

6. Laxman KC, Tabassum N, Ai L, Cole C, Ziehl P. Automated crack detection and crack depth prediction for reinforced concrete structures using deep learning. Construct Build Mater. 2023;370:130709.

7. Chen W, He Z, Zhang J. Online monitoring of crack dynamic development using attention‐based deep networks. Autom Construct. 2023;154:105022.

8. Ranyal E, Sadhu A, Jain K. Enhancing pavement health assessment: an attention‐based approach for accurate crack detection, measurement, and mapping. Exp Syst Appl. 2024;247:123314.

9. Ma M, Yang L, Liu Y, Yu H. An attention‐based progressive fusion network for pixelwise pavement crack detection. Measurement. 2024;226:114159.

10. Li R, Yu J, Li F, Yang R, Wang Y, Peng Z. Automatic bridge crack detection using unmanned aerial vehicle and faster R‐CNN. Construct Build Mater. 2023;362:129659.

11. Liu Z, Yeoh JK, Gu X, et al. Automatic pixel‐level detection of vertical cracks in asphalt pavement based on GPR investigation and improved mask R‐CNN. Autom Construct. 2023;146:104689.

12. Zhou S, Pan Y, Huang X, Yang D, Ding Y, Duan R. Crack texture feature identification of fiber reinforced concrete based on deep learning. Materials. 2022;15(11):3940.

13. Ding Y, Zhou SX, Yuan HQ, et al. Crack identification method of steel fiber reinforced concrete based on deep learning: a comparative study and shared crack database. Adv Mater Sci Eng. 2021;2021:1‐10.

14. Su P, Han H, Liu M, Yang T, Liu S. MOD‐YOLO: rethinking the YOLO architecture at the level of feature information and applying it to crack detection. Exp Syst Appl. 2024;237:121346.

15. Qiu Q, Lau D. Real‐time detection of cracks in tiled sidewalks using YOLO‐based method applied to unmanned aerial vehicle (UAV) images. Autom Construct. 2023;147:104745.

16. He X, Tang Z, Deng Y, Zhou G, Wang Y, Li L. UAV‐based road crack object‐detection algorithm. Autom Construct. 2023;154:105014.

17. Zhu W, Zhang H, Eastwood J, Qi X, Jia J, Cao Y. Concrete crack detection using lightweight attention feature fusion single shot multibox detector. Knowl‐Based Syst. 2023;261:110216.

18. Li H, Wang C, Wang J, Zhang Y. Crack detection based on deep learning: a method for evaluating the object detection networks considering the random fractal of crack. Struct Health Monit. 2023;22(4):2547‐2564.

19. Tran TS, Nguyen SD, Lee HJ, Tran VP. Advanced crack detection and segmentation on bridge decks using deep learning. Construct Build Mater. 2023;400:132839.

20. Zhou Z, Zhang J, Gong C, Wu W. Automatic tunnel lining crack detection via deep learning with generative adversarial network‐based data augmentation. Undergr Space. 2023;9:140‐154.

21. Wen X, Li S, Yu H, He Y. Multi‐scale context feature and cross‐attention network‐enabled system and software‐based for pavement crack detection. Eng Appl Artif Intel. 2024;127:107328.

22. Mishra M, Jain V, Singh SK, Maity D. Two‐stage method based on the you only look once framework and image segmentation for crack detection in concrete structures. Archit Struct Construct. 2023;3(4):429‐446.

23. Kao SP, Chang YC, Wang FL. Combining the YOLOv4 deep learning model with UAV imagery processing technology in the extraction and quantization of cracks in bridges. Sensors. 2023;23(5):2572.

24. Li LF, Ma WF, Li L, Lu C. Research on detection algorithm for bridge cracks based on deep learning. Acta Autom Sin. 2019;45(9):1727‐1742.

25. Liu Y, Yao J, Lu X, Xie R, Li L. DeepCrack: a deep hierarchical feature learning architecture for crack segmentation. Neurocomputing. 2019;338:139‐153.

26. Shehata HM, Mohamed YS, Abdellatif M, Awad TH. Crack width estimation using feed and cascade forward back propagation artificial neural networks. Key Eng Mater. 2018;786:293‐301.

27. Hu N, Yang J, Xi X, Fan X. Few‐shot crack detection based on image processing and improved YOLOv5. J Civ Struct Health Monit. 2023;13(1):165‐180.

28. Jia Z, Su X, Ma G, Dai T, Sun J. Crack identification for marine engineering equipment based on improved SSD and YOLOv5. Ocean Eng. 2023;268:113534.

29. Zhou K, Lei D, Chun PJ, et al. Evaluation of BFRP strengthening and repairing effects on concrete beams using DIC and YOLO‐v5 object detection algorithm. Construct Build Mater. 2024;411:134594.

30. Xu G, Yue Q, Liu X. Deep learning algorithm for real‐time automatic crack detection, segmentation, qualification. Eng Appl Artif Intel. 2023;126:107085.

31. Ong JC, Lau SL, Ismadi MZ, Wang X. Feature pyramid network with self‐guided attention refinement module for crack segmentation. Struct Health Monit. 2023;22(1):672‐688.

32. Shi P, Zhu F, Xin Y, Shao S. U2CrackNet: a deeper architecture with two‐level nested U‐structure for pavement crack detection. Struct Health Monit. 2023;22(4):2910‐2921.

33. Manjunatha P, Masri SF, Nakano A, Wellford LC. CrackDenseLinkNet: a deep convolutional neural network for semantic segmentation of cracks on concrete surface images. Struct Health Monit. 2024;23(2):796‐817.

34. Song D, Shen J, Ma T, Xu F. Two‐level fusion of multi‐sensor information for compressor blade crack detection based on self‐attention mechanism. Struct Health Monit. 2023;22(3):1911‐1926.

35. Hang J, Wu Y, Li Y, Lai T, Zhang J, Li Y. A deep learning semantic segmentation network with attention mechanism for concrete crack detection. Struct Health Monit. 2023;22(5):3006‐3026.

36. Zeng Q, Fan G, Wang D, Tao W, Liu A. A systematic approach to pixel‐level crack detection and localization with a feature fusion attention network and 3D reconstruction. Eng Struct. 2024;300:117219.

37. Yang L, Bai S, Liu Y, Yu H. Multi‐scale triple‐attention network for pixelwise crack segmentation. Autom Construct. 2023;150:104853.

38. Morgese M, Wang C, Taylor T, Etemadi M, Ansari F. Distributed detection and quantification of cracks in operating large bridges. J Bridge Eng. 2024;29(1):04023101.

© 2025. This work is published under http://creativecommons.org/licenses/by/4.0/ (the "License"). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

To reduce manual workload and maintenance cost, it is particularly important to realize automatic detection of cracks. To address the poor real‐time performance and low precision of traditional pavement crack detection, a crack detection method based on an improved YOLOv5, a one‐stage convolutional neural network target detection algorithm, is proposed, taking advantage of deep learning networks in target detection. The images were manually annotated with the LabelImg annotation software, and the network model parameters were then obtained by training the improved YOLOv5 network. Finally, cracks are verified and detected with the established model. In addition, the precision and speed of crack detection using the YOLOv3, YOLOv5s, and YOLOv5s‐attention models are compared by means of Precision, Recall, and F1. The comparison shows that YOLOv5s‐attention improves detection precision by 1.0%, F1 by 0.9%, and mAP@0.5 by 1.8%.

Details

Title

Crack detection based on attention mechanism with YOLOv5

Author

Lan, Min‐Li¹; Yang, Dan²; Zhou, Shuang‐Xi³; Ding, Yang⁴

¹ Fujian Chuanzheng Communications College, Fuzhou, China
² School of Civil Engineering and Architecture, East China Jiaotong University, Nanchang, China
³ School of Civil Engineering and Architecture, East China Jiaotong University, Nanchang, China, School of Civil Engineering and Management, Guangzhou Maritime University, Guangzhou, China
⁴ Department of Civil Engineering, Hangzhou City University, Hangzhou, China

Section

RESEARCH ARTICLE

Publication year

2025

Publication date

Jan 1, 2025

Publisher

John Wiley & Sons, Inc.

e-ISSN

25778196

Source type

Scholarly Journal

Language of publication

English

DOI

https://doi.org/10.1002/eng2.12899

ProQuest document ID

3161575054
