BLNN: Multiscale Feature Fusion-Based Bilinear

This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. Introduction

Wood knot defect detection is an important step in evaluating wood quality, which ultimately affects the quality of wood products [1]. Rapid detection of knot defects on the wood surface can effectively improve the qualification rate of wood products [2, 3]. Consequently, it is important to identify wood knot defects quickly. Although manual recognition is accurate, training staff takes a lot of time, and manual recognition on the assembly line is very slow compared with machine recognition [4, 5]. With the development of artificial intelligence and computer vision technology, deep learning shows promise for wood knot defect classification [6–8].

In recent years, image recognition based on artificial neural networks and image processing has been widely studied. To identify a target accurately, the first step is to extract image features. For example, Qi and Mu proposed a Hu invariant moment feature extraction method combined with a BP (back propagation) neural network to classify wood knot defects [9]. The accuracy of this method for wood knot defect recognition is over 86%. In 2020, Khwaja et al. proposed a defect detection and classification method for wet-blue leather using an artificial neural network (ANN). The features of several defects on leather were extracted using the grey level cooccurrence matrix (GLCM) and the grey level run-length matrix (GLRLM). The acquired features were passed to a multilayer perceptron trained with the Levenberg-Marquardt (LM) algorithm. The accuracy of this model is 97.85% [10]. In 2021, Aditya et al. proposed a method based on statistical texture features in GLCM to classify leaf blight of four plants by selecting appropriate thresholds. The accuracy of this method can reach 74% under optimal conditions [11]. The above methods require manual feature extraction, and their recognition rates are not consistently high. Consequently, a convolutional neural network (CNN), which can automatically learn the target features, is needed to replace complex manual extraction of defect features. In 2020, Zhang et al. proposed a CNN image recognition algorithm for supermarket shopping robots. This algorithm overcomes the problems of low accuracy and slow speed in image recognition. The experimental results show that the accuracy of the algorithm can reach more than 98%, and they verify that the image recognition algorithm can be applied to supermarket shopping robots to meet the needs of competition [12]. In the same year, Liu et al. proposed an intangible cultural heritage image recognition model based on color feature extraction and CNN, with the recognition rate reaching 94.8% [13]. In 2021, Gao et al. presented a new method for recognizing wood knot defects based on transfer learning and the ResNet-34 convolutional neural network. The experimental results show that the classification accuracy of this method can reach 98.69% [14]. Although these methods are practical, their accuracy can still be improved, and few of them have been applied to wood knot defect detection. To solve these problems, improve the accuracy and recognition speed of the model, and reduce the training time, a high-accuracy wood knot defect detection method based on a convolutional neural network is required.

In this paper, a bilinear classification model based on a fine-grained feature fusion strategy, named BLNN, is proposed to detect wood knot defects. The paper is structured as follows. First, the dataset of wood knot defects is acquired and augmented. Then, the proposed BLNN model is introduced. Subsequently, the network is trained and tested on the wood knot defect dataset. Finally, the test results are compared with those of other deep learning models on the same benchmark dataset.

2. Materials and Methodology

2.1. Dataset Acquisition

The dataset was downloaded from the website of the Computer Laboratory, Department of Electrical Engineering, University of Oulu [15–17], and consists of 365 images of four types of spruce knot defects: dry knot, edge knot, leaf knot, and sound knot. Figure 1 shows the four types of wood knot defects in the dataset used in this paper.

[figures omitted; refer to PDF]

2.2. Image Preprocessing and Augmentation

Deep learning networks have to be trained on massive datasets to achieve good performance [18]. Therefore, when the original dataset contains a limited number of images, data augmentation [19] is required to improve accuracy and prevent overfitting [20]. Here, six methods are employed to augment the dataset, namely, vertical mirroring, rotation by 180°, horizontal mirroring, adding Gaussian noise, increasing the hue by 10, and adding salt-and-pepper noise. Consequently, the number of images was increased to seven times the original number. The larger and more varied dataset improves the learning ability of the network. The data augmentation is shown in Figure 2. Table 1 lists the names and the number of images used for the experiments. Finally, the dataset was randomly divided into a training set, a validation set, and a testing set in a ratio of 3 : 1 : 1.
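The six augmentation operations can be illustrated with a minimal, standard-library sketch. It is not the authors' code: the image is represented as a nested list of RGB tuples, "vertical mirroring" is interpreted as a top-bottom flip, the hue increase of 10 is interpreted as 10 degrees, and the noise parameters (`sigma`, `amount`) are hypothetical values chosen only for illustration.

```python
import colorsys
import random

def _clamp(value):
    """Round to an integer and clip to the 8-bit range."""
    return max(0, min(255, int(round(value))))

def vertical_mirror(img):
    # Assumption: "vertical mirroring" means flipping top to bottom.
    return img[::-1]

def horizontal_mirror(img):
    return [row[::-1] for row in img]

def rotate_180(img):
    return [row[::-1] for row in img[::-1]]

def gaussian_noise(img, sigma=10.0, rng=random):
    # sigma is a hypothetical noise level, not taken from the paper.
    return [[tuple(_clamp(c + rng.gauss(0.0, sigma)) for c in px) for px in row]
            for row in img]

def salt_and_pepper(img, amount=0.02, rng=random):
    # amount is a hypothetical fraction of corrupted pixels.
    out = []
    for row in img:
        new_row = []
        for px in row:
            r = rng.random()
            if r < amount / 2:
                new_row.append((0, 0, 0))        # pepper
            elif r < amount:
                new_row.append((255, 255, 255))  # salt
            else:
                new_row.append(px)
        out.append(new_row)
    return out

def shift_hue(img, degrees=10.0):
    # Assumption: the hue increase of 10 is measured in degrees.
    def shift(px):
        h, s, v = colorsys.rgb_to_hsv(*(c / 255.0 for c in px))
        h = (h + degrees / 360.0) % 1.0
        return tuple(_clamp(c * 255.0) for c in colorsys.hsv_to_rgb(h, s, v))
    return [[shift(px) for px in row] for row in img]

def augment(img, rng=random):
    """Return the original image plus its six augmented variants."""
    return [
        img,
        vertical_mirror(img),
        rotate_180(img),
        horizontal_mirror(img),
        gaussian_noise(img, rng=rng),
        shift_hue(img),
        salt_and_pepper(img, rng=rng),
    ]
```

Each input image yields seven images (the original and six variants), which is consistent with the growth from 365 to 2555 images reported in Table 1.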

[figures omitted; refer to PDF]

Table 1

Number of images in each dataset before and after data augmentation.

Wood knot defect	Training (before augmentation)	Validation (before augmentation)	Testing (before augmentation)	Total (before augmentation)	Training (after augmentation)	Validation (after augmentation)	Testing (after augmentation)	Total (after augmentation)
Dry knot	41	14	14	69	291	96	96	483
Edge knot	39	13	13	65	273	91	91	455
Leaf knot	27	10	10	47	198	65	66	329
Sound knot	110	37	37	184	772	266	250	1288
Total	217	74	74	365	1534	518	503	2555

2.3. Proposed Classification Model

A CNN called BLNN is proposed for image-based fine-grained feature extraction [21–23]. It consists of two different convolutional neural network branches. Because the two CNNs differ, they extract features at different scales. These two sets of features are fused into a one-dimensional feature vector using the bilinear pooling operation [24, 25], and finally, the feature vector is passed to a classifier to obtain the recognized class. An overview of the proposed network architecture is shown in Figure 3. The parameters of BLNN are shown in Table 2.

[figure omitted; refer to PDF]

After the first fully connected layer, vectors $x_{1}$ and $x_{2}$, each with a dimension of $1 \times 120$, are obtained from the two branches (Figure 4). Then, $x_{1}$ and $x_{2}$ are cascaded to obtain $x_{3}$. Cascade fusion [32] is employed to combine the two outputs, which can be expressed as follows: $\begin{matrix} (3) & x_{3} = f_{cas}(x_{1}, x_{2}), \end{matrix}$ where $x_{1}$ and $x_{2}$ are the outputs of Fc₁₁ and Fc₂₁, respectively. The two vectors are spliced end to end into one vector with a dimension of $1 \times 240$. Therefore, the vector $x_{3}$ contains all the features computed by the two branches from the image at two different scales, so the features are represented more comprehensively. Next, a fully connected layer maps $x_{3}$ to a vector with a dimension of $1 \times 50$, and finally, the output of the last fully connected layer is set to 4, the number of classification categories.
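The dimensions described above ($1 \times 120$ and $1 \times 120$ concatenated to $1 \times 240$, then $1 \times 50$, then 4 outputs) can be traced with a small standard-library sketch. It is illustrative only: the weights are random placeholders, and the ReLU activation and the final softmax are assumptions, since the text does not specify the activations used in the fusion head.

```python
import math
import random

def linear(x, weights, bias):
    """Fully connected layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def relu(x):
    return [max(0.0, v) for v in x]

def softmax(z):
    m = max(z)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def init_linear(n_in, n_out, rng):
    """Random placeholder parameters (not trained weights)."""
    bound = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-bound, bound) for _ in range(n_in)]
               for _ in range(n_out)]
    bias = [0.0] * n_out
    return weights, bias

def cascade(x1, x2):
    """Cascade fusion f_cas: concatenate the two branch outputs."""
    return list(x1) + list(x2)

def fusion_head(x1, x2, params):
    x3 = cascade(x1, x2)                     # 120 + 120 = 240 features
    h = relu(linear(x3, *params["fc_50"]))   # 240 -> 50 (ReLU is an assumption)
    logits = linear(h, *params["fc_out"])    # 50 -> 4 classes
    return softmax(logits)                   # class probabilities (assumed softmax)

rng = random.Random(0)
params = {
    "fc_50": init_linear(240, 50, rng),
    "fc_out": init_linear(50, 4, rng),
}
x1 = [rng.random() for _ in range(120)]  # stand-in for the output of the first branch
x2 = [rng.random() for _ in range(120)]  # stand-in for the output of the second branch
probabilities = fusion_head(x1, x2, params)
```

The four values in `probabilities` sum to 1 and correspond to the four knot classes; with random weights they carry no meaning beyond confirming the shapes.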

2.5. Loss Function and Optimizer

The loss function is applied to evaluate the difference between the predicted and actual values of the model [33–35]. The smaller the difference, the smaller the cross-entropy. This study uses the cross-entropy loss function, which is expressed as follows: $\begin{matrix} (4) & L = - \sum_{i = 1}^{n} p_{i}(x) \log q_{i}(x), \end{matrix}$ where $L$ represents the loss value of the sample and $p_{i}(x)$ and $q_{i}(x)$ represent the target output and the actual output, respectively. Cross-entropy overcomes the problem of weights and biases being updated too slowly. When the error is large, the weights update quickly, and when the error is small, the weights update slowly.
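A short worked example may make the last point concrete. The sketch below evaluates equation (4) for a one-hot target and two hypothetical predictions. It also uses the standard result that, when the output $q$ is produced by a softmax layer (an assumption here, since the output activation is not stated), the gradient of the loss with respect to the logits is $q - p$, so a larger error gives a larger update.

```python
import math

def cross_entropy(p, q, eps=1e-12):
    """L = -sum_i p_i * log(q_i); eps guards against log(0)."""
    return -sum(pi * math.log(max(qi, eps)) for pi, qi in zip(p, q))

def grad_wrt_logits(p, q):
    """Gradient of L with respect to the logits, assuming q = softmax(logits)."""
    return [qi - pi for pi, qi in zip(p, q)]

# Hypothetical example: the true class is the second of four.
target = [0.0, 1.0, 0.0, 0.0]
good_prediction = [0.02, 0.94, 0.02, 0.02]  # small error
bad_prediction = [0.70, 0.10, 0.10, 0.10]   # large error

loss_good = cross_entropy(target, good_prediction)
loss_bad = cross_entropy(target, bad_prediction)
grad_good = grad_wrt_logits(target, good_prediction)
grad_bad = grad_wrt_logits(target, bad_prediction)
```

For the true class, the gradient magnitude is 0.06 for the good prediction and 0.9 for the bad one, which matches the statement that large errors produce fast updates and small errors produce slow ones.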

The optimizer is used to compute and update the network parameters that affect model training and output so that they approximate or reach the optimal value, thereby minimizing (or maximizing) the loss function [36]. In this case, the Adam optimizer is used. The Adam optimizer combines the advantages of AdaGrad [37] and RMSProp [38]. It takes the first-order moment estimate (i.e., the mean) and the second-order moment estimate (i.e., the uncentered variance) of the gradient into account when calculating the update step. Adam is simple to implement, is computationally efficient, and has low memory requirements, and its hyperparameters usually require little or no fine-tuning.
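The update rule described above can be sketched in a few lines of plain Python. This follows the standard Adam formulation rather than any code from the paper; the learning rate of 1e-3 matches the value used in this study, while `beta1`, `beta2`, and `eps` are the commonly used defaults and are assumptions here.

```python
import math

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a list of parameters; t is the 1-based step count."""
    # First-order moment: exponential moving average of the gradient.
    m = [beta1 * mi + (1 - beta1) * g for mi, g in zip(m, grad)]
    # Second-order moment: exponential moving average of the squared gradient.
    v = [beta2 * vi + (1 - beta2) * g * g for vi, g in zip(v, grad)]
    # Bias correction compensates for the zero initialisation of m and v.
    m_hat = [mi / (1 - beta1 ** t) for mi in m]
    v_hat = [vi / (1 - beta2 ** t) for vi in v]
    theta = [th - lr * mh / (math.sqrt(vh) + eps)
             for th, mh, vh in zip(theta, m_hat, v_hat)]
    return theta, m, v

# Toy usage: minimise f(x) = x^2, whose gradient is 2x.
theta, m, v = [1.0], [0.0], [0.0]
for t in range(1, 4):
    grad = [2.0 * x for x in theta]
    theta, m, v = adam_step(theta, grad, m, v, t)
```

Because the step is normalised by the square root of the second moment, the first update has a magnitude of roughly the learning rate regardless of the gradient scale.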

3. Experiment Results and Discussion

The experiment was performed on a Windows 10 64-bit PC equipped with an Intel(R) Xeon(R) Bronze 3204 CPU @ 1.90 GHz processor and 128 GB of RAM. The deep learning programs were run on two NVIDIA GeForce RTX 3090 GPUs, each with 24 GB of memory. The code, including data preprocessing and algorithm implementation, is mainly written in Python. The deep learning framework is PyTorch. The experimental environment is shown in Table 3.

Table 3

Experimental environment.

Hardware environment		Software environment
Memory	128.00 GB	System	Windows 10
CPU	Intel(R) Xeon(R) Bronze 3204 CPU @ 1.90 GHz (6 core)	Environment configuration	Pytorch-gpu 1.8.0 + Python 3.8.8 + cuda 11.1 + cudnn 8.0.5
Graphics card	NVIDIA GeForce RTX 3090 (24 G)

3.1. Model Training

In this study, the dataset is divided into a training set, a validation set, and a testing set, which contain 1534, 518, and 503 images, respectively. The hyperparameter settings for model training are shown in Table 4. The number of epochs, batch size, and learning rate are set to 200, 128, and $1 \times 10^{-3}$, respectively, so that all models converge stably. The model training process is shown in Figure 5.

Table 4

Training hyperparameters.

Related parameter	Value
Batch size	128
Learning rate	$1 \times 10^{-3}$
Epoch	200
Optimizer	Adam
Loss function	Cross-entropy
CUDA	Enable
CUDNN	Enable
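As a rough illustration of what these settings imply, the sketch below collects the values from Table 4 in a dictionary and derives the number of parameter updates. The dictionary keys are hypothetical names, and the calculation assumes that the last, incomplete batch of each epoch is kept rather than dropped.

```python
import math

# Values taken from Table 4; the key names are illustrative only.
config = {
    "batch_size": 128,
    "learning_rate": 1e-3,
    "epochs": 200,
    "optimizer": "Adam",
    "loss": "cross-entropy",
}

n_train = 1534  # training images after augmentation (Table 1)

# Assumption: the final partial batch is used, so round up.
steps_per_epoch = math.ceil(n_train / config["batch_size"])
last_batch_size = n_train - (steps_per_epoch - 1) * config["batch_size"]
total_steps = steps_per_epoch * config["epochs"]
```

Under this assumption there are 12 updates per epoch (11 full batches plus one batch of 126 images) and 2400 updates over the whole run.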

[figure omitted; refer to PDF]

3.1.1. The Training Results of the BLNN Model

The accuracy and loss curves for the training and validation stages are shown in Figure 6.

[figures omitted; refer to PDF]

Figure 6 shows the model trained for 200 epochs. The training accuracy remains stable after 50 epochs, fluctuating mostly between 0.95 and 1.00, and the loss decreases to around 0.2 to 0.35 with little fluctuation. After nearly 100 epochs, the training loss decreases to about 0.2, although some fluctuation remains. The validation accuracy also remains stable, fluctuating mostly between 0.95 and 1.00, which indicates good classification results.

3.1.2. Contrast Experiment

The results of BLNN are compared with those of AlexNet, VGGNet-16, GoogLeNet, ResNet-18, and MobileNet-V2 to verify the effectiveness of the model. ResNet-18 achieves feature reuse through identity shortcuts. Similar to ResNet, the fusion strategy of BLNN combines deep and shallow features to obtain more detailed feature information. Comparing the performance of the different network structures on the same wood knot defect dataset demonstrates the effectiveness and superiority of BLNN in identifying wood knot defects.

As shown in Figure 7, BLNN converges faster than the other models and finishes converging at about the 50th epoch. Consequently, fewer training epochs could be used in practice.

[figures omitted; refer to PDF]

Five learning rates, 0.1, 0.01, 0.001, 0.0001, and 0.00001, were tested after establishing the BLNN model. The experimental results are shown in Table 5.

Table 5

Comparison of results for different learning rates.

Learning rate	Correctly classified test images	Accuracy (%)
0.1	250	49.70
0.01	475	94.43
0.001	499	99.20
0.0001	486	96.62
0.00001	436	86.68

Table 5 shows that when the learning rate is 0.1, the model does not converge effectively. The main reason is that an excessively large learning rate causes the model parameters to oscillate and rapidly leave the valid range. When the learning rate is reduced to 0.01, 0.001, and 0.0001, good results are achieved: the error converges, and the test accuracy reaches 94.43%, 99.20%, and 96.62%, respectively. When the learning rate drops further to 0.00001, the network converges very slowly and the time needed to find the optimal value increases. In addition, the model may converge to a local extremum and fail to find the optimal value. Accuracy therefore peaks at a learning rate of 0.001 and declines for both larger and smaller values. Consequently, considering the accuracy and training time of the model, 0.001 is chosen as the initial learning rate to train the model.

The optimization algorithm is applied to find the optimal solution of the model. In this case, Adam is employed and compared with SGD, AdaGrad, and Adamax, as shown in Figure 8. The results show that the model with Adam has the fastest convergence speed and the highest accuracy. Table 6 shows the prediction results of the four optimization algorithms under the same conditions. The accuracy of SGD, AdaGrad, Adamax, and Adam is 79.32%, 94.04%, 98.01%, and 99.20%, respectively. Consequently, considering the accuracy and training time of the model, Adam is chosen as the optimizer of the model.

[figures omitted; refer to PDF]

Table 6

Comparison of results for different optimizers.

Optimizer	Correctly classified test images	Accuracy (%)
AdaGrad	473	94.04
Adamax	493	98.01
SGD	399	79.32
Adam	499	99.20
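The accuracies in Tables 5 and 6 can be cross-checked with a few lines of arithmetic. The sketch below assumes that the "Number" column gives the count of correctly classified images out of the 503 testing images; under that assumption the reported percentages are reproduced exactly.

```python
TEST_IMAGES = 503  # size of the testing set (Table 1)

# Counts copied from Table 5 (learning rates) and Table 6 (optimizers).
correct_by_learning_rate = {0.1: 250, 0.01: 475, 0.001: 499, 0.0001: 486, 0.00001: 436}
correct_by_optimizer = {"AdaGrad": 473, "Adamax": 493, "SGD": 399, "Adam": 499}

def accuracy_percent(correct, total=TEST_IMAGES):
    """Accuracy as a percentage, rounded to two decimals as in the tables."""
    return round(100.0 * correct / total, 2)

accuracy_by_learning_rate = {lr: accuracy_percent(n)
                             for lr, n in correct_by_learning_rate.items()}
accuracy_by_optimizer = {name: accuracy_percent(n)
                         for name, n in correct_by_optimizer.items()}

best_learning_rate = max(accuracy_by_learning_rate, key=accuracy_by_learning_rate.get)
best_optimizer = max(accuracy_by_optimizer, key=accuracy_by_optimizer.get)
```

The best configuration in both tables corresponds to 499 correct predictions out of 503, that is, four misclassified test images.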

3.2. Evaluation Metrics

To evaluate the performance of BLNN, the precision ($P$), recall ($R$), $F_{1}$ score ($F_{1}$), and false alarm rate (FAR) were applied, defined as follows: $\begin{matrix} (5a) & P = \frac{TP}{TP + FP}, \\ (5b) & R = \frac{TP}{TP + FN}, \\ (5c) & FAR = \frac{FP}{FP + TN}, \\ (5d) & F_{1} = 2 \cdot \frac{P \cdot R}{P + R}, \end{matrix}$ where TP, FP, TN, and FN represent the numbers of true positives, false positives, true negatives, and false negatives, respectively.
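For a four-class problem, these quantities are usually computed per class in a one-vs-rest fashion from the confusion matrix. The sketch below shows one way to do this, assuming rows are actual classes and columns are predicted classes. The confusion matrix is made up for illustration and is not a result from this paper.

```python
def per_class_metrics(cm, k):
    """Precision, recall, FAR, and F1 for class k (rows: actual, columns: predicted)."""
    total = sum(sum(row) for row in cm)
    tp = cm[k][k]
    fn = sum(cm[k]) - tp                 # class k samples predicted as another class
    fp = sum(row[k] for row in cm) - tp  # other classes predicted as class k
    tn = total - tp - fn - fp
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    far = fp / (fp + tn) if fp + tn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall, "far": far, "f1": f1}

# Hypothetical confusion matrix, for illustration only.
classes = ["dry knot", "edge knot", "leaf knot", "sound knot"]
confusion = [
    [9, 1, 0, 0],
    [0, 10, 0, 0],
    [1, 0, 8, 1],
    [0, 0, 0, 10],
]
metrics = {name: per_class_metrics(confusion, k) for k, name in enumerate(classes)}
```

In this toy matrix, the first class has TP = 9, FP = 1, FN = 1, and TN = 29, so its precision and recall are both 0.9 and its FAR is 1/30.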

3.3. Model Evaluation

The performance of BLNN is evaluated on the task of wood knot defect classification. The 503 wood knot defect images of the testing set were used. The trained BLNN was compared with AlexNet, GoogLeNet, MobileNet-V2, ResNet-18, and VGGNet-16, and the networks were evaluated according to confusion matrix, precision, recall, $F_{1}$ score, FAR, accuracy, training time, and detection time.

As shown in the confusion matrices in Figure 9, the accuracy for each category is obtained by comparing the actual category with the predicted category. The distribution of values in the confusion matrices shows that AlexNet and BLNN give the best classification results. BLNN recognizes edge knots and sound knots with 100% accuracy, whereas its accuracy on dry knots and leaf knots is slightly lower than that of AlexNet, which is a direction for future improvement. Nevertheless, as shown in Figure 10, BLNN has the highest overall recognition rate for knot defects, reaching 99.20%. Table 7 shows the training time of each model and its detection time per wood image. BLNN has the shortest training time and the fastest detection speed of all the models, which is attributed to its smaller number of parameters and stronger feature extraction ability.

[figures omitted; refer to PDF]

Table 7

Training time and detection time of all the applied methods.

Method	Training time (min)	Detection time (s/image)
AlexNet	37.32	0.2744
GoogLeNet	44.27	0.3519
MobileNet-V2	12.97	0.2425
ResNet-18	15.95	0.4573
VGGNet-16	36.88	1.9583
BLNN	11.22	0.0795
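The detection times in Table 7 can be turned into throughput and relative speed with simple arithmetic, as in the sketch below. The numbers are copied from the table; only the derived quantities are new.

```python
# Detection time in seconds per image, copied from Table 7.
detection_time = {
    "AlexNet": 0.2744,
    "GoogLeNet": 0.3519,
    "MobileNet-V2": 0.2425,
    "ResNet-18": 0.4573,
    "VGGNet-16": 1.9583,
    "BLNN": 0.0795,
}

# Images processed per second by each model.
throughput = {name: 1.0 / seconds for name, seconds in detection_time.items()}

# How many times slower each model is than BLNN.
slowdown_vs_blnn = {name: seconds / detection_time["BLNN"]
                    for name, seconds in detection_time.items()}

fastest = min(detection_time, key=detection_time.get)
```

BLNN processes about 12.6 images per second, roughly three times faster than the next fastest model in the table, MobileNet-V2.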

Precision, recall, $F_{1}$ score, and FAR for the four categories of wood knot defect images in the testing set are shown in Figure 11. BLNN is superior to MobileNet-V2, ResNet-18, and VGGNet-16 in classifying the four wood knot defects. Compared with AlexNet and GoogLeNet, some of the BLNN metrics are slightly worse, although the gap is small; this leaves room for further improvement. As shown in Figure 10 and Table 7, although BLNN is not the best model on every metric, it has the highest accuracy, the shortest training time, and the fastest detection speed. Because of its small number of parameters and low computational cost, it is also easy to build and to embed into other models, which makes it practical for identifying wood knot defects. Compared with the other models, BLNN has clear advantages in accuracy and computation, so it has more practical application value. An unexpected finding is that MobileNet-V2, ResNet-18, and VGGNet-16 do not achieve the expected performance; ResNet-18 in particular has the lowest recognition rate. This suggests that the network structure has a great impact on the training results.

[figures omitted; refer to PDF]

As shown in Figure 3, BLNN consists of two single-branch networks. To verify that using two branches improves model performance, the upper branch and the lower branch of BLNN are each compared with the full BLNN. The results are shown in Figures 12 and 13.

[figures omitted; refer to PDF]

Figures 12 and 13 show that BLNN has the fastest convergence speed and the highest accuracy of the three networks. In addition, the upper branch network converges faster than the lower branch network on the training set, whereas the lower branch network performs better than the upper branch network on the validation set. As shown in Figure 13, BLNN performs best, the lower branch network second best, and the upper branch network worst. The reason is that the upper branch uses a $3 \times 3$ convolutional kernel and the lower branch uses an $8 \times 8$ convolutional kernel, so the lower branch has a larger receptive field. Therefore, the bilinear structure of BLNN performs better than either single-branch network.

As shown in Figure 3, BLNN has two single-branch networks that use convolutional kernels of different sizes: $3 \times 3$ in the upper branch and $8 \times 8$ in the lower branch. To verify the effect of convolutional kernel size on model performance, BLNN is compared with two networks that use only $3 \times 3$ kernels and only $8 \times 8$ kernels, respectively; the results are shown in Figures 14 and 15.

[figures omitted; refer to PDF]

Figures 14 and 15 show that BLNN has the fastest convergence speed and the highest accuracy of these three networks. In addition, the network with $3 \times 3$ convolutional kernels converges faster on the training set than the network with $8 \times 8$ kernels, whereas the network with $8 \times 8$ kernels performs better on the validation set. As shown in Figure 15, BLNN performs best, the network with $8 \times 8$ kernels second best, and the network with $3 \times 3$ kernels worst. Networks with $8 \times 8$ convolutional kernels have a larger receptive field. BLNN, however, uses two branches with different kernel sizes: the smaller kernel ($3 \times 3$) in the upper branch extracts local details, and the larger kernel ($8 \times 8$) in the lower branch extracts more comprehensive global information. These two kinds of feature information are then fused. Because more comprehensive information is acquired, BLNN performs better than the two networks that each use a single kernel size.
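The receptive-field argument can be made concrete with the standard recurrence for stacked convolutions: each layer enlarges the receptive field by (kernel size − 1) times the product of the strides of all preceding layers. The sketch below applies it to hypothetical stacks of two stride-1 layers; the layer counts and strides are assumptions for illustration, not the BLNN configuration.

```python
def receptive_field(layers):
    """Receptive field of stacked conv layers given as (kernel_size, stride) pairs."""
    rf = 1    # a single input pixel before any layer
    jump = 1  # spacing, in input pixels, between adjacent outputs of the current layer
    for kernel_size, stride in layers:
        rf += (kernel_size - 1) * jump
        jump *= stride
    return rf

# Hypothetical stacks: two stride-1 layers per branch.
small_kernel_branch = [(3, 1), (3, 1)]
large_kernel_branch = [(8, 1), (8, 1)]

rf_small = receptive_field(small_kernel_branch)
rf_large = receptive_field(large_kernel_branch)
```

With these assumed stacks, each output of the $3 \times 3$ branch sees a $5 \times 5$ input region, while each output of the $8 \times 8$ branch sees a $15 \times 15$ region, which illustrates why the larger kernel captures more global context per layer.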

3.4. Model Generalization

To evaluate the generalization ability of BLNN, we tested its classification ability on several boards. Correct recognitions are marked in green and wrong recognitions are marked in grey. Details of the identification, such as the name and probability of the wood knot defect, are displayed next to each label. Figure 16 shows four wood knot defects and the corresponding identification results.

[figure omitted; refer to PDF]

Most of the wood knot defects in the image are correctly identified. Some of the wood knot defects are similar in shape to other defects, and some defect types were not included in training, which leads to identification errors. In most cases, our method (BLNN) still has high accuracy. This indicates that BLNN has application value in practice.

As shown in Figure 16, because we focused only on the four defects of dry knot, edge knot, leaf knot, and sound knot when training the network, some defects were not identified. Increasing the number of defect types that can be classified is one of our future research directions.

3.5. Discussion

The effectiveness of BLNN can be discussed in two aspects.

3.5.1. Feasibility of Bilinear Network Structure

Compared with a single-branch network, BLNN has clear advantages in accuracy and convergence speed, which shows that the classification ability of the network can be improved by extracting and fusing features with the bilinear network. This network extracts features from two parallel single-branch networks, which makes the extracted features more comprehensive. This is the key to improving the classification performance. Classical network structures such as ResNet are generally single-branch networks, so the features they extract are comparatively uniform. A bilinear network can extract more information than a single-branch network.

3.5.2. Rationality of Using Different Convolutional Kernel Sizes

Compared with other classical networks, BLNN has clear advantages in accuracy and computation, which shows that the classification ability of networks can be improved by fusing local features (convolutional kernel size $3 \times 3$) and global features (convolutional kernel size $8 \times 8$) through a bilinear fusion structure. The network uses convolutional kernels of different sizes to extract multiscale features from the same image, and this fine-grained information is the key to classification.

For the proposed BLNN network, the local and global features extracted by the convolutional layer are fused in the fully connected layer. In other words, it fuses all the features of different scales together through a fusion operation. Therefore, BLNN expands the number of features without generating many complex feature maps. In the fully connected layer, we improve the robustness and classification accuracy of the network by setting an appropriate number of neurons.

BLNN performs well in the classification of wood knot defects. However, performing network fusion operations in the fully connected layer may not be optimal for other tasks. This requires more research in the future.

4. Conclusion

In conclusion, a bilinear classification model based on a fine-grained feature fusion strategy, named BLNN, was proposed in this paper. The convolutional kernel size of the upper branch network of BLNN was set to $3 \times 3$, and the convolutional kernel size of the lower branch network was set to $8 \times 8$. Two different sizes of convolutional kernels were used to extract features at different scales, and feature fusion was used to classify the wood knot defects. A total of 2052 images of wood knot defects (the training and validation sets) were used to train the model for 200 epochs. The experimental results show that the accuracy of BLNN reaches 99.20% during the testing phase. In addition, when wood knot defects are detected by this method, extensive image preprocessing and manual feature extraction are not required, which greatly improves the recognition efficiency. The speed of defect detection is 0.0795 s/image, and the training time is reduced. This means that BLNN has potential application value in wood nondestructive testing and wood knot defect detection and provides a feasible solution for future wood knot defect identification. In addition, the experimental results show that multiscale information fusion through network fusion is effective in improving model performance.

Acknowledgments

This work was supported by the Foundation for Innovative Research Groups of the National Nature Science Foundation of China under Grant No. 51521003, NSFC under Contract No. 61571153 and No. 51173034, China National Postdoctoral Program for Innovative Talents (Grant No. BX2021092), China Postdoctoral Science Foundation (Grant No. 2021M690841), Heilongjiang Postdoctoral Fund (Grant No. LBH-Z20019), Aeronautical Science Foundation of China (No. 2020Z057077001), and Self-Planned Task of State Key Laboratory of Robotics and System (HIT), the Programme of Introducing Talents of Discipline of Universities (Grant No. B07108).

References

[1] Y. Fang, L. Lin, H. Feng, Z. Lu, G. Emms, "Review of the use of air-coupled ultrasonic technologies for nondestructive testing of wood and wood products," Computers and Electronics in Agriculture, vol. 137, pp. 79-87, DOI: 10.1016/j.compag.2017.03.015, 2017.

[2] W. Zhou, M. Fei, H. Zhou, K. Li, "A sparse representation based fast detection method for surface defect detection of bottle caps," Neurocomputing, vol. 123, pp. 406-414, DOI: 10.1016/j.neucom.2013.07.038, 2014.

[3] C. Todoroki, E. Lowell, D. Dykstra, "Automated knot detection with visual post-processing of Douglas-fir veneer images," Computers and Electronics in Agriculture, vol. 70 no. 1, pp. 163-171, 2010.

[4] D. Yadav, A. Yadav, "A novel convolutional neural network based model for recognition and classification of apple leaf diseases," Traitement du Signal, vol. 37 no. 6, 2020.

[5] X. Zhu, M. Zhu, H. Ren, "Method of plant leaf recognition based on improved deep convolutional neural network," Cognitive Systems Research, vol. 52, pp. 223-233, DOI: 10.1016/j.cogsys.2018.06.008, 2018.

[6] T. He, Y. Liu, Y. Yu, Q. Zhao, Z. Hu, "Application of deep convolutional neural network on feature extraction and detection of wood defects," Measurement, vol. 152, article 107357, DOI: 10.1016/j.measurement.2019.107357, 2020.

[7] J. Shi, Z. Li, T. Zhu, D. Wang, C. Ni, "Defect detection of industry wood veneer based on NAS and multi-channel mask R-CNN," Sensors, vol. 20 no. 16, DOI: 10.3390/s20164398, 2020.

[8] Y. Huang, J. Jing, Z. Wang, "Fabric defect segmentation method based on deep learning," IEEE Transactions on Instrumentation and Measurement, vol. 70, 2021.

[9] D. Qi, H. Mu, "Detection of wood defects types based on Hu invariant moments and BP neural network," Journal of Southeast University, vol. 43, pp. 63-66, 2013.

[10] K. Mohammed, S. K. S, P. G, "Defective texture classification using optimized neural network structure," Pattern Recognition Letters, vol. 135, pp. 228-236, DOI: 10.1016/j.patrec.2020.04.017, 2020.

[11] A. Sinha, R. Singh Shekhawat, "A novel image classification technique for spot and blight diseases in plant leaves," The Imaging Science Journal, vol. 5, 2021.

[12] X. Zhang, H. Lu, Q. Xu, X. Peng, Y. Li, L. Liu, Z. Dai, W. Zhang, "Image recognition of supermarket shopping robot based on CNN," 2020 IEEE International Conference on Artificial Intelligence and Computer Applications (ICAICA), pp. 1363-1368, DOI: 10.1109/icaica50127.2020.9181936.

[13] E. Liu, "Research on image recognition of intangible cultural heritage based on CNN and wireless network," EURASIP Journal on Wireless Communications and Networking, vol. 2020 no. 1, DOI: 10.1186/s13638-020-01859-2, 2020.

[14] M. Gao, D. Qi, H. Mu, J. Chen, "A transfer residual neural network based on ResNet-34 for detection of wood knot defects," Forests, vol. 12 no. 2, DOI: 10.3390/f12020212, 2021.

[15] H. Kauppinen, O. Silven, "A color vision approach for grading lumber," Theory & Applications of Image Processing II-Selected Papers from the 9th Scandinavian Conference on Image Analysis, pp. 367-379.

[16] O. Silven, H. Kauppinen, "Recent developments in wood inspection," International Journal of Pattern Recognition and Artificial Intelligence, vol. 10 no. 1, pp. 83-95, DOI: 10.1142/S0218001496000086, 1996.

[17] H. Kauppinen, O. Silven, "The effect of illumination variations on color-based wood defect classification," Proceedings of the 13th International Conference on Pattern Recognition (13th ICPR), pp. 828-832, DOI: 10.1109/icpr.1996.547284.

[18] G. Folego, M. Weiler, R. Casseb, R. Pires, A. Rocha, "Alzheimer’s disease detection through whole-brain 3D-CNN MRI," Frontiers in Bioengineering and Biotechnology, vol. 8, DOI: 10.3389/fbioe.2020.534592, 2020.

[19] A. El Bilali, A. Taleb, M. Bahlaoui, Y. Brouziyne, "An integrated approach based on Gaussian noises-based data augmentation method and AdaBoost model to predict faecal coliforms in rivers with small dataset," Journal of Hydrology, vol. 29, article 126510, 2021.

[20] M. Monshi, J. Poon, V. Chung, F. Monshi, "CovidXrayNet: optimizing data augmentation and CNN hyperparameters for improved COVID-19 detection from CXR," Computers in Biology and Medicine, vol. 133, article 104375, DOI: 10.1016/j.compbiomed.2021.104375, 2021.

[21] C. Liu, H. Ding, X. Jiang, "Towards enhancing fine-grained details for image matting," 2021 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 385-393, DOI: 10.1109/wacv48630.2021.00043.

[22] X. Chen, J. Lai, "Salient points driven pedestrian group retrieval with fine-grained representation," Neurocomputing, vol. 423, pp. 255-263, DOI: 10.1016/j.neucom.2020.09.054, 2021.

[23] X. Chen, J. Lai, "Salient points driven pedestrian group retrieval with fine-grained representation," Neurocomputing, vol. 423, pp. 255-263, DOI: 10.1016/j.neucom.2020.09.054, 2021.

[24] W. Wu, J. Yu, "An improved bilinear pooling method for image-based action recognition," 2020 25th International Conference on Pattern Recognition (ICPR), pp. 8578-8583, DOI: 10.1109/icpr48806.2021.9413028.

[25] X. Chen, X. Zheng, X. Lu, "Bidirectional interaction network for person re-identification," IEEE Transactions on Image Processing, vol. 30, pp. 1935-1948, DOI: 10.1109/TIP.2021.3049943, 2021.

[26] T. Pradhan, P. Kumar, S. Pal, "CLAVER: an integrated framework of convolutional layer, bidirectional LSTM with attention mechanism based scholarly venue recommendation," Information Sciences, vol. 559, pp. 212-235, DOI: 10.1016/j.ins.2020.12.024, 2021.

[27] S. Gao, Q. Han, D. Li, P. Peng, M. Cheng, P. Peng, "Representative batch normalization with feature calibration," Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8669-8679.

[28] F. Laakmann, P. Petersen, "Efficient approximation of solutions of parametric linear transport equations by ReLU DNNs," Advances in Computational Mathematics, vol. 47 no. 1, 2021.

[29] C. Ren, J. Dulay, G. Rolwes, D. Pauli, N. Shakoor, A. Stylianou, "Multi-resolution outlier pooling for sorghum classification," Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2931-2939.

[30] P. Staszewski, M. Jaworski, J. Cao, L. Rutkowski, "A new approach to descriptors generation for image retrieval by analyzing activations of deep neural network layers," IEEE Transactions on Neural Networks and Learning Systems, DOI: 10.1109/TNNLS.2021.3084633, 2021.

[31] H. Šimonová, B. Kucharczyková, V. Bílek, L. Malíková, P. Miarka, M. Lipowczan, "Mechanical fracture and fatigue characteristics of fine-grained composite based on sodium hydroxide-activated slag cured under high relative humidity," Applied Sciences, vol. 11 no. 1, 2021.

[32] X. Zhu, S. Ye, L. Zhao, Z. Dai, "Hybrid attention cascade network for facial expression recognition," Sensors, vol. 21 no. 6, DOI: 10.3390/s21062003, 2021.

[33] R. Ferdous, M. Arifeen, T. Eiko, S. Al Mamun, "Performance analysis of different loss function in face detection architectures," Proceedings of International Conference on Trends in Computational and Cognitive Engineering, pp. 659-669.

[34] M. Shorfuzzaman, M. Hossain, "MetaCOVID: a Siamese neural network framework with contrastive loss for n-shot diagnosis of COVID-19 patients," Pattern Recognition, vol. 113, article 107700, DOI: 10.1016/j.patcog.2020.107700, 2021.

[35] P. Negi, R. Marcus, A. Kipf, H. Mao, N. Tatbul, T. Kraska, M. Alizadeh, "Flow-Loss: learning cardinality estimates that matter," 2021, http://arxiv.org/abs/2101.04964

[36] Z. Zhang, "Improved Adam optimizer for deep neural networks," 2018 IEEE/ACM 26th International Symposium on Quality of Service (IWQoS), DOI: 10.1109/iwqos.2018.8624183.

[37] R. Ward, X. Wu, L. Bottou, "AdaGrad stepsizes: sharp convergence over nonconvex landscapes," International Conference on Machine Learning, pp. 6677-6686.

[38] F. Zou, L. Shen, Z. Jie, W. Zhang, W. Liu, "A sufficient condition for convergences of Adam and RMSProp," 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 11127-11135, DOI: 10.1109/cvpr.2019.01138.

Copyright © 2021 Mingyu Gao et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. https://creativecommons.org/licenses/by/4.0/

Abstract

Deep learning allows wood defects to be identified quickly from optical images, which effectively improves wood utilization. Traditional neural network techniques are poorly suited to wood defect detection in optical images because of their long training time, low recognition accuracy, and lack of automatic defect feature extraction. In this paper, a wood knot defect detection model based on deep learning, called BLNN, is reported. Two subnetworks composed of convolutional neural networks are trained in PyTorch. By using the feature extraction capabilities of the two subnetworks and combining them through a bilinear join operation, the fine-grained features of the image are obtained. The experimental results show that the accuracy reaches 99.20% and that the training time is markedly reduced, with a defect detection speed of about 0.0795 s/image. This indicates that BLNN can improve the accuracy of defect recognition and has potential application in the detection of wood knot defects.
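
The bilinear join of two subnetwork outputs can be illustrated with a minimal sketch. The abstract does not specify the exact operation, so the code below assumes the commonly used form of bilinear pooling: the outer product of the two feature vectors at each spatial location, averaged over all locations, followed by a signed square root and L2 normalisation. It uses NumPy rather than PyTorch to stay dependency-light; the function name `bilinear_pool`, the channel counts, and the 7 x 7 feature-map size are hypothetical and are not taken from the paper.

```python
import numpy as np


def bilinear_pool(feat_a, feat_b, eps=1e-12):
    """Fuse two feature maps of shape (C, H, W) into one bilinear descriptor."""
    c_a, h, w = feat_a.shape
    c_b = feat_b.shape[0]
    if feat_b.shape[1:] != (h, w):
        raise ValueError("both feature maps must share the same spatial size")
    # Flatten the spatial grid so that each column is one image location.
    a = feat_a.reshape(c_a, h * w)
    b = feat_b.reshape(c_b, h * w)
    # Sum of per-location outer products, averaged over the number of locations.
    pooled = (a @ b.T) / (h * w)
    vec = pooled.reshape(-1)
    # Signed square root and L2 normalisation (assumed, not stated in the source).
    vec = np.sign(vec) * np.sqrt(np.abs(vec))
    return vec / (np.linalg.norm(vec) + eps)


# Stand-ins for the outputs of the two CNN subnetworks (random, illustrative only).
rng = np.random.default_rng(0)
feat_a = rng.standard_normal((8, 7, 7))
feat_b = rng.standard_normal((16, 7, 7))
descriptor = bilinear_pool(feat_a, feat_b)
print(descriptor.shape)
```

The resulting descriptor has one entry per pair of channels (here 8 x 16 = 128), which is what lets a classifier placed on top respond to interactions between the two feature sets.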

Details

Title

BLNN: Multiscale Feature Fusion-Based Bilinear Fine-Grained Convolutional Neural Network for Image Classification of Wood Knot Defects

Author

Gao, Mingyu¹; Wang, Fei²; Song, Peng³; Liu, Junyan²; Qi, DaWei¹

¹ College of Science, Northeast Forestry University, Harbin 150040, China
² School of Mechatronics Engineering, Harbin Institute of Technology, Harbin 150001, China; State Key Laboratory of Robotics and System, Harbin Institute of Technology, Harbin 150001, China
³ School of Instrumentation Science and Engineering, Harbin Institute of Technology, Harbin 150001, China

Editor

Hongjin Wang

Publication year

2021

Publication date

2021

Publisher

John Wiley & Sons, Inc.

ISSN

1687725X

e-ISSN

16877268

Source type

Scholarly Journal

Language of publication

English

DOI

https://doi.org/10.1155/2021/8109496

ProQuest document ID

2565926813
