1. Introduction
A major problem in agriculture is that harvest performance can be damaged by many different plant diseases. Tomato (Solanum lycopersicum L.) is one of the most important and popular vegetable crops in the world [1]. Thus, the prevention of tomato plant diseases has attracted scientists aiming to increase harvest performance [2]. More than a dozen different diseases affect tomato plants in practice, so it is essential to detect them as accurately and as early as possible in order to prevent and treat them [3,4]. Different methods of detecting plant diseases have been proposed, including those that use artificial intelligence (AI) algorithms such as support vector machines (SVMs) based on image features or neural networks (NNs) [5,6,7]. AI techniques have been applied in many fields of agriculture to identify diseases of plants such as apple, tomato, and rice, and other studies have used deep learning networks [8,9,10,11]. In our research, one deep learning network is employed for classifying tomato leaf diseases.
In recent years, deep learning (DL) has been applied to the classification of plant conditions in order to find diseases for early treatment [12,13,14,15,16]. Notably, a convolutional neural network (CNN) was employed to improve the identification performance to between 91% and 98% for 13 leaf diseases [17]. Liu et al. proposed using a CNN based on AlexNet for classifying four apple leaf diseases: mosaic, rust, brown spot, and Alternaria leaf spot [18]. With this CNN, the diseases on apple leaves were detected with a recognition performance of up to 97.62%. In another study, Lu et al. utilized a deep CNN model to detect rice diseases using 500 images of 10 different common conditions [19]. This model achieved an accuracy of 95.48%, which is much higher than that of traditional machine learning methods.
Tomato leaf diseases have attracted many researchers, and different algorithms have been proposed for recognizing and classifying them [20,21,22]. A combination of three families of convolutional networks for recognizing tomato diseases was proposed in [20]. In particular, the faster region-based convolutional neural network (Faster R-CNN), region-based fully convolutional network (R-FCN), and single shot multibox detector (SSD) were combined as meta-architectures with “deep feature extractors”, and this increased the classification performance. One well-known dataset is the Plant Village database, consisting of 16,010 images of 10 tomato leaf classes, which is used in many research articles [23]. In [24], the authors proposed a novel PCA–whale optimization-based deep neural network model for classifying tomato plant diseases. In particular, PCA–whale optimization was applied to extract features of the leaf images before they were fed into the deep learning network model for classification. This model showed outstanding performance but also had difficulties in some cases because of the limited number of samples.
CNN models with transfer learning, in which a pre-trained model is reused to improve prediction, have been applied for classifying tomato leaf diseases [25,26,27,28]. In particular, two models using transfer learning and feature extraction were proposed for classifying tomato plant diseases, and the obtained accuracy was 90% [29]. Moreover, Rangarajan et al. [30] developed a pre-trained deep-learning approach with two models, AlexNet and VGG-16, in which transfer learning was employed for classifying six tomato crop diseases and one healthy class in image sets obtained from the Plant Village database. This research showed high accuracies of 97.29% using VGG-16 and 97.49% using AlexNet. In another study [31], a CNN with the transfer learning approach was employed to recognize nine leaf diseases, with features extracted automatically by directly processing the raw images. This method achieved a high accuracy of 99.18%.
Detection of leaf diseases can be performed by extracting features of plant leaf images, or by combining feature extraction with DL-CNNs [32]. Image-processing methods such as enhancement, filtering, segmentation, thresholding, or feature extraction have been applied together with back-propagation networks for detecting or recognizing leaf diseases, and the reported classification results were very good [33,34,35,36,37,38,39]. In agriculture, plant diseases can significantly reduce the quality and quantity of products. In tropical regions, apples are widely grown and are also attacked by pathogens such as bacteria, fungi, algae, and nematodes [40]. In that study, the authors proposed image segmentation based on color differences to separate the diseased areas of apple leaves. In particular, color (RGB, HSV) histogram and texture (LBP) features were used to build feature vectors with rich information. After the feature extraction of the diseased leaf areas, machine learning algorithms (fine KNN, SVM, bagged tree, complex, and others) were applied for recognition. The classification accuracy using the bagged tree was 99% for apple leaf diseases. This indicates that suitable image processing of the apple leaf image set combined with machine learning is effective.
Segmentation plays an essential role in detecting plant disease across different leaf conditions. In particular, in the hybrid segmentation method proposed in [41], the whole color leaf image was first divided into a number of nearly homogeneous super-pixels, and clustering these super-pixels helped the subsequent image segmentation converge faster. The lesion pixels were then quickly and accurately segmented from each super-pixel using the expectation-maximization (EM) algorithm. In another work, Singh et al. applied image segmentation and soft computing techniques to automatically detect and classify leaf diseases of plants such as banana, beans, jack-fruit, lemon, mango, potato, tomato, and sapota [42]. This work achieved an average accuracy of 97.6%, which illustrates the effectiveness of the proposed method.
One possible solution for segmenting leaf diseases was developed by Zhang et al. [43]. In this research, the internet of things (IoT) was employed, and a combination of super-pixel clustering, K-means, and the pyramid of histograms of orientation gradients (PHOG) was proposed. In particular, the color images were split into many small super-pixels using a clustering method. Then, the K-means algorithm was used to segment the lesion images from each super-pixel. Finally, the three color components of each segmented part were combined with its gray-scale image to create four PHOG descriptors, which were concatenated into one vector. This method showed an accuracy of 85.64% for apple leaf image sets, including Alternaria, mosaic, and rust diseases. Storey et al. used the Mask R-CNN algorithm for segmenting rust diseases on apple leaves; fractional masks on a subset of the Plant Pathology Challenge 2020 database were then applied, producing a classification accuracy of 80.5% [44].
The remainder of this article is organized as follows. Section 2 presents the structure of the VGG-19 model with transfer learning combined with image segmentation for classifying ten tomato leaf classes, consisting of nine tomato leaf disease types and one healthy type. The images are segmented in the HSV color space to extract the diseased leaf areas and place them on a uniform black background. Section 3 describes the parameters adjusted to improve the model performance and reduce the training time of the VGG-19, together with the experimental results of the proposed method. Section 4 compares the results with previous works, and Section 5 presents the conclusions of this research.
2. Materials and Methods
2.1. Plant Village Tomato Leaf Image Datasets
The tomato leaf image dataset (Plant Village) has 16,010 images in ten classes [45]. Table 1 lists the nine disease classes (bacterial spot, Septoria leaf spot, mosaic virus, leaf mold, target spot, early blight, yellow leaf curl virus, late blight, and two-spotted spider mites) and one healthy class. The images have different sizes and need to be resized to 224 × 224 or 256 × 256 to match the input size of VGG-19. The number of images per class is imbalanced; for example, the yellow leaf curl virus class has 3208 images, whereas the mosaic virus class has only 373 images. This imbalance can be one of the reasons for decreased classification accuracy when applying the DCNN structure.
2.2. Proposed Model for Classifying Tomato Leaf Diseases
As shown in Figure 1, the proposed model consists of a tomato leaf dataset, image segmentation, VGG-19 with transfer learning, and an evaluation block. The resized images are segmented in the HSV color space to extract the leaf regions and place them on a uniform black background. The segmented images are then passed through the pre-trained layers using the transfer learning method. Finally, the model is evaluated using a confusion matrix.
This research applies several DL networks, namely VGG-19, AlexNet, GoogleNet, and ResNet50, to find the one with the highest classification accuracy. Some network layers are frozen, and some layers are re-trained for comparison. Additionally, various parameters, including batch size, number of epochs, and learning rate, are tuned to select the best result.
2.3. Tomato Leaf Image Segmentation Using the HSV Color Space
HSV is more closely related to the human perception of color than RGB because it separates color information from intensity or lighting. In addition, a histogram can be constructed for choosing a thresholding rule using only saturation (S) or hue (H). In practice, even the hue alone gives a meaningful representation of the base color that is much more stable than the RGB values. As a result, color thresholding in the HSV color space is more robust and requires simpler parameters.
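As a small illustration of this point, the sketch below converts two RGB triplets that differ only in brightness to HSV using Python's standard `colorsys` module. The two pixel values and the helper name `to_hsv` are made-up examples, not values or code taken from the dataset or the authors' implementation.

```python
import colorsys

# Two hypothetical pixels of the same green under bright and dim lighting.
bright_green = (40, 160, 40)
dim_green = (20, 80, 20)


def to_hsv(rgb):
    """Convert an 8-bit RGB triplet to HSV with all components in [0, 1]."""
    r, g, b = (channel / 255 for channel in rgb)
    return colorsys.rgb_to_hsv(r, g, b)


h1, s1, v1 = to_hsv(bright_green)
h2, s2, v2 = to_hsv(dim_green)

print(f"bright: H={h1:.3f} S={s1:.3f} V={v1:.3f}")
print(f"dim:    H={h2:.3f} S={s2:.3f} V={v2:.3f}")
```

Both pixels share the same hue and saturation and differ only in V, so a single hue interval covers both, whereas a rule written directly on RGB would need separate bounds on all three channels.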
To segment the leaf regions and create the black backgrounds for all images, the images are converted to HSV, which is approximately perceptually uniform, so that all components of the images can be quantized with the same precision. In particular, HSV is a three-dimensional color coordinate system whose brightness value (V) varies from 0 to 1. The hue (H) ranges from 0 to 360 degrees, in which 0 or 360 degrees is red, and 60, 120, 180, 240, and 300 degrees correspond to yellow, green, cyan, blue, and magenta, respectively. The saturation (S) describes how much white is mixed with the hue and ranges from 0 to 1, where 1 means a pure color such as red, green, or blue. An image in the RGB color space can be transformed to HSV using the following equations [46]:
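$$H = \begin{cases} 0^\circ, & \Delta = 0 \\ 60^\circ \times \left(\dfrac{G' - B'}{\Delta} \bmod 6\right), & C_{max} = R' \\ 60^\circ \times \left(\dfrac{B' - R'}{\Delta} + 2\right), & C_{max} = G' \\ 60^\circ \times \left(\dfrac{R' - G'}{\Delta} + 4\right), & C_{max} = B' \end{cases} \tag{1}$$
$$S = \begin{cases} 0, & C_{max} = 0 \\ \dfrac{\Delta}{C_{max}}, & C_{max} \neq 0 \end{cases} \tag{2}$$
$$V = C_{max} \tag{3}$$
in which $R' = R/255$, $G' = G/255$, $B' = B/255$, $C_{max} = \max(R', G', B')$, $C_{min} = \min(R', G', B')$, and $\Delta = C_{max} - C_{min}$.
For the segmentation to extract leaf regions and to create the black backgrounds of all original RGB images, the algorithm is described as follows: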
Step 1: The RGB image is converted into the HSV color space. The three HSV components (sub-images) make it easy to distinguish the type, shade, purity, and brightness of a color.
Step 2: Histograms for all three HSV components are plotted to choose their lower and upper threshold values.
Step 3: Masking segments the HSV image and converts it to a binary image based on the histogram-derived HSV thresholds. Holes in the binary image are then filled using a morphological operation to create the leaf mask.
Step 4: The white mask is mapped onto the RGB image to keep the original leaf region and set the background to black.
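The four steps above can be sketched in pure Python as follows. This is an illustrative sketch rather than the authors' implementation: the function names are hypothetical, the default thresholds are the normalized H, S, and V ranges reported in Section 3.1, and the morphological hole filling is replaced here by a simple flood fill from the image border, which fills any background pixel that is fully enclosed by the mask.

```python
import colorsys
from collections import deque

# Normalized threshold ranges reported in Section 3.1 of this article.
H_RANGE = (0.18, 0.5)
S_RANGE = (0.1, 1.0)
V_RANGE = (0.0, 0.8)


def hsv_mask(rgb_image):
    """Steps 1 and 3: convert each pixel to HSV and threshold it (1 = leaf)."""
    mask = []
    for row in rgb_image:
        mask_row = []
        for r, g, b in row:
            h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
            inside = (H_RANGE[0] <= h <= H_RANGE[1]
                      and S_RANGE[0] <= s <= S_RANGE[1]
                      and V_RANGE[0] <= v <= V_RANGE[1])
            mask_row.append(1 if inside else 0)
        mask.append(mask_row)
    return mask


def fill_holes(mask):
    """Step 3 (hole filling): keep as 0 only pixels connected to the border."""
    rows, cols = len(mask), len(mask[0])
    outside = [[False] * cols for _ in range(rows)]
    queue = deque()
    for y in range(rows):
        for x in range(cols):
            on_border = y in (0, rows - 1) or x in (0, cols - 1)
            if on_border and mask[y][x] == 0:
                outside[y][x] = True
                queue.append((y, x))
    while queue:  # breadth-first flood fill over 4-connected background
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if (0 <= ny < rows and 0 <= nx < cols
                    and mask[ny][nx] == 0 and not outside[ny][nx]):
                outside[ny][nx] = True
                queue.append((ny, nx))
    return [[0 if outside[y][x] else 1 for x in range(cols)]
            for y in range(rows)]


def apply_mask(rgb_image, mask):
    """Step 4: keep masked pixels and paint everything else black."""
    return [[px if m else (0, 0, 0) for px, m in zip(row, mask_row)]
            for row, mask_row in zip(rgb_image, mask)]


def segment_leaf(rgb_image):
    """Run thresholding, hole filling, and masking on a nested-list image."""
    return apply_mask(rgb_image, fill_holes(hsv_mask(rgb_image)))
```

The hole-filling step matters for diseased leaves: a brown lesion falls outside the green hue range and is therefore missing from the raw mask, but because it is surrounded by leaf pixels it is restored, so the lesion is kept in the segmented image. For real images, a library routine operating on arrays would be far faster than this nested-list version.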
2.4. VGG-19 Model
DL networks can be applied for image classification in many fields based on large datasets, with networks reaching around 60 million parameters and 650,000 neurons [47]. In practice, such a network architecture can have five convolutional layers and three fully connected layers with different roles: the first two convolution layers (standard layer and max-pooling layer), the third and fourth convolution layers (directly connected), the last convolution layer (max-pooling layer), and the output layer (softmax layer). In addition, some networks have particular architectures for specific applications. For instance, GoogleNet is a network with about 7 million parameters, 9 inception modules, 4 convolutional layers, 4 max-pooling layers, 3 average pooling layers, 5 fully connected layers, and 3 softmax layers [48]. All convolutional and dropout layers use the ReLU activation function, with a parameter reduction ratio of 70% applied to all fully connected layers. Additionally, ResNet is similar to VGG-19 and has been adapted many times to produce ResNet-18, ResNet-34, ResNet-50, ResNet-101, and ResNet-152 [49]. Herein, we apply VGG-19 to train on the tomato leaf image dataset.
VGG is a CNN architecture, and VGG-19 is one of the VGG-based variants [50]. The VGG-19 is a deep-learning neural network with 19 weight layers, including 16 convolution layers and 3 fully connected layers. The convolution layers extract features from the input images, and the fully connected layers classify the leaf images based on those features. In addition, the max-pooling layers reduce the feature dimensions and help to avoid overfitting, as described in Figure 2.
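To make the layer structure concrete, the following sketch walks a 224 × 224 RGB input through the standard VGG-19 layer configuration (3 × 3 convolutions with padding and 2 × 2 max-pooling with stride 2) and counts the layers. The names `VGG19_CFG`, `FC_SIZES`, and `trace_shapes` are illustrative, and `FC_SIZES` lists the classifier of the original ImageNet model rather than the ten-class head used in this work.

```python
# Standard VGG-19 feature extractor: integers are the output channels of a
# 3x3 convolution, "M" marks a 2x2 max-pooling layer with stride 2.
VGG19_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, 256, "M",
             512, 512, 512, 512, "M", 512, 512, 512, 512, "M"]
FC_SIZES = [4096, 4096, 1000]  # classifier head of the original ImageNet model


def trace_shapes(height, width, channels=3):
    """Return (layer_type, (H, W, C)) after every conv and pooling layer."""
    shapes = []
    for item in VGG19_CFG:
        if item == "M":
            # Max-pooling halves the spatial size and keeps the channels.
            height, width = height // 2, width // 2
            shapes.append(("maxpool", (height, width, channels)))
        else:
            # A padded 3x3 convolution keeps H and W and sets the channels.
            channels = item
            shapes.append(("conv3x3", (height, width, channels)))
    return shapes


shapes = trace_shapes(224, 224)
n_conv = sum(1 for name, _ in shapes if name == "conv3x3")

print("convolution layers:", n_conv)
print("weight layers (conv + fully connected):", n_conv + len(FC_SIZES))
print("feature map passed to the classifier:", shapes[-1][1])
```

Each pooling stage halves the height and width, so the five stages reduce the 224 × 224 input to a 7 × 7 map with 512 channels before the fully connected layers.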
The output of each convolutional layer is represented by the following expression:
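$$x_j^{l} = \varphi\left(\sum_{i} x_i^{l-1} \times w_{ij}^{l} + b_j\right) \tag{4}$$
in which $x_i^{l-1}$ is the ith input feature from the (l−1)th layer, $x_j^{l}$ is the jth output feature of the lth layer, × is the convolutional function, $w_{ij}^{l}$ describes the weights connecting the ith and jth features in the (l−1)th and lth layers, $b_j$ is the bias value, and φ is the activation function.
2.5. Transfer Learning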
Transfer learning is often used in DL networks to reuse a network trained on one task when training on another. There are four types of transfer learning: case-based, feature-based, parameter-based, and relationship-based. Choosing the trained parameters for the best classification system is a big challenge. A suitable network architecture needs to be selected along with the network parameters, and these values need to be re-estimated for the new input data. Then, the new network needs to be fine-tuned to improve performance. In this paper, the parameter-based transfer learning method is applied for classifying tomato leaf diseases. In particular, the convolution layers of the VGG-19 network are frozen and the fully connected layers are re-trained to enhance the classification. Furthermore, we also adjust the batch size, number of epochs, and learning rate to choose the best network.
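To give a sense of what freezing the convolution layers means in terms of parameter counts, the sketch below tallies the weights and biases of a VGG-19 whose last layer has ten outputs. It assumes a 224 × 224 input and that the re-trained head keeps the two 4096-unit layers of the original VGG-19; the article does not state the head sizes, so this is an assumption, and the helper names are hypothetical.

```python
# Output channels of the 16 convolution layers of VGG-19 (3x3 kernels).
CONV_CHANNELS = [64, 64, 128, 128, 256, 256, 256, 256,
                 512, 512, 512, 512, 512, 512, 512, 512]
NUM_CLASSES = 10          # nine disease classes plus the healthy class
FLATTENED = 7 * 7 * 512   # feature map of a 224x224 input after five poolings


def conv_params(channels, in_channels=3, kernel=3):
    """Weights plus biases of a chain of kernel x kernel convolutions."""
    total = 0
    for out_channels in channels:
        total += kernel * kernel * in_channels * out_channels + out_channels
        in_channels = out_channels
    return total


def dense_params(sizes, in_features):
    """Weights plus biases of a chain of fully connected layers."""
    total = 0
    for out_features in sizes:
        total += in_features * out_features + out_features
        in_features = out_features
    return total


frozen = conv_params(CONV_CHANNELS)
# Assumed head: 4096 -> 4096 -> NUM_CLASSES (head sizes are not given in the text).
trainable = dense_params([4096, 4096, NUM_CLASSES], FLATTENED)

print(f"frozen (convolution layers):        {frozen:,}")
print(f"re-trained (fully connected layers): {trainable:,}")
print(f"share of parameters re-trained:      {trainable / (frozen + trainable):.1%}")
```

Under these assumptions most parameters sit in the fully connected layers, so the benefit of freezing comes mainly from reusing the learned convolutional features and from not back-propagating through the convolution layers, rather than from a smaller number of trainable weights.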
2.6. Evaluation of Classification System
Using a confusion matrix (Table 2), we can evaluate the performance when classifying the 9 disease types and 1 healthy type of tomato leaf. Some parameters, such as accuracy (ACC), sensitivity (SEN), specificity (SPE), positive predictive value (PPV), and F1 score (F1S), are also considered in the VGG-19 model.
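In the confusion matrix, $n_{ii}$ is the number of correctly recognized images of the ith class, and $n_{ij}$ describes the number of images of the ith class that are recognized as the jth class (i ≠ j). Furthermore, Pi represents the total number of images assigned to the ith class after recognition, Ni describes the total number of images of the ith class according to the original labels, and NT is the total number of images in the testing dataset. Therefore, the Ni, Pi, and NT values are calculated using the following equations:
$$N_i = \sum_{j} n_{ij} \tag{5}$$
$$P_i = \sum_{j} n_{ji} \tag{6}$$
$$N_T = \sum_{i} N_i \tag{7}$$
From the values of Ni, Pi, and NT, the ACC, SEN, SPE, PPV, and F1S values are determined using the following formulas:
$$ACC = \frac{\sum_{i} n_{ii}}{N_T} \tag{8}$$
$$SEN_i = \frac{n_{ii}}{N_i} \tag{9}$$
$$SPE_i = \frac{N_T - N_i - P_i + n_{ii}}{N_T - N_i} \tag{10}$$
$$PPV_i = \frac{n_{ii}}{P_i} \tag{11}$$
$$F1S_i = \frac{2 \times PPV_i \times SEN_i}{PPV_i + SEN_i} \tag{12}$$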
The model can be evaluated through Equations (8)–(12), in which the classifier is highly effective when the ACC, SEN, SPE, PPV, and F1S are large.
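A minimal sketch of how these metrics can be obtained from a confusion matrix is given below. It assumes the usual one-vs-rest definitions for a multi-class problem, with rows as true classes and columns as predicted classes; the function name `class_metrics` and the 3-class matrix are made-up illustrations, not the authors' code or results.

```python
def class_metrics(cm):
    """Overall accuracy and per-class SEN, SPE, PPV, F1S.

    cm[i][j] is the number of images of true class i predicted as class j.
    Every class is assumed to have at least one true and one predicted image.
    """
    k = len(cm)
    n_total = sum(sum(row) for row in cm)
    per_class = []
    for i in range(k):
        tp = cm[i][i]
        n_i = sum(cm[i])                        # images whose true label is i
        p_i = sum(cm[r][i] for r in range(k))   # images predicted as class i
        fn = n_i - tp
        fp = p_i - tp
        tn = n_total - tp - fn - fp
        sen = tp / n_i
        spe = tn / (tn + fp)
        ppv = tp / p_i
        f1s = 2 * ppv * sen / (ppv + sen)
        per_class.append({"SEN": sen, "SPE": spe, "PPV": ppv, "F1S": f1s})
    acc = sum(cm[i][i] for i in range(k)) / n_total
    return acc, per_class


# Hypothetical 3-class confusion matrix used only to exercise the function.
example = [[8, 2, 0],
           [1, 9, 0],
           [0, 0, 10]]
acc, per_class = class_metrics(example)
print(f"ACC = {acc:.3f}")
for index, metrics in enumerate(per_class, start=1):
    print(index, {name: round(value, 3) for name, value in metrics.items()})
```

Averaging the per-class values gives single summary figures per metric; whether the reported values are averaged in this way is not stated here, so that step is left to the reader.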
3. Results
3.1. Image Segmentation Result
In this paper, the initial tomato leaf images were resized to 224 × 224, as shown in Figure 3. The equally sized tomato leaf images were then segmented to extract the leaf regions and place them on a uniform black background before being used for training the VGG-19. In particular, the HSV color space and histogram thresholds were applied to create binary images, which were then used to produce the leaf regions and the uniform background.
Figure 4 shows the RGB image and the corresponding image in the HSV color space, which consists of three sub-images for the H, S, and V components. A histogram was produced for each of these sub-images. Before plotting the histograms, we normalized the component values to the range from 0 to 1. The threshold values for H, S, and V were chosen based on the gray levels of the histograms, as shown in Figure 5. From the histograms, we observed that the lower and upper threshold values for H are 0.18 and 0.5, due to the green colors of tomato leaves; similarly, the values are 0.1 and 1.0 for S, and 0.0 and 0.8 for V.
Morphological reconstruction was performed on the mask obtained from the RGB input image (Figure 6a) to produce the binary image (Figure 6b). In addition, the binary image was multiplied with the RGB image to produce the image with the leaf region and the black background, as shown in Figure 6c.
Figure 7 shows the images after segmentation, with the original leaf regions on uniform black backgrounds, in which the leaves retain the diseased areas. Therefore, training on the segmented dataset can shorten the training time, reduce the computations inside the neural network, and increase the accuracy.
Figure 8 shows five field images of tomato leaves with different diseases and different backgrounds (Figure 8a–e) and the corresponding segmented images with the leaf regions on black backgrounds (Figure 8f–j). For the field images segmented using the HSV color space, the obtained results are similar to those of the images from the Plant Village database. In this research, the segmentation is based on the histogram threshold values of the leaf regions, so the different backgrounds of the images do not affect the segmentation results.
To evaluate the performance of the HSV technique, we calculated the structural similarity index (SSIM) between the HSV-segmented images and the manually segmented images [52]. The SSIM value ranges from 0 to 1. The maximum value of 1 indicates that the segmented image is structurally identical to the reference one, whereas the minimum value of 0 indicates no structural similarity. The results are described in Table 3.
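As a rough illustration of how such a score can be computed, the sketch below evaluates the SSIM over a single window covering the whole image, using the standard definition based on means, variances, and covariance. The function name `ssim_global`, the constants K1 = 0.01 and K2 = 0.03 (commonly used defaults), and the tiny example images are assumptions; practical implementations usually average the index over local sliding windows instead.

```python
def ssim_global(x, y, dynamic_range=255, k1=0.01, k2=0.03):
    """Single-window SSIM of two equally sized grayscale images (flat lists)."""
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    c1 = (k1 * dynamic_range) ** 2   # stabilizes the luminance term
    c2 = (k2 * dynamic_range) ** 2   # stabilizes the contrast/structure term
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Hypothetical 2x2 images flattened to lists of gray levels.
reference = [10, 50, 90, 130]
segmented = [0, 0, 90, 130]      # two pixels set to black by a segmentation

print(f"identical images: {ssim_global(reference, reference):.3f}")
print(f"reference vs segmented: {ssim_global(reference, segmented):.3f}")
```

Comparing an image with itself gives a score of 1, and any difference in mean, contrast, or structure lowers the score.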
$$SSIM(x, y) = \frac{(2\mu_x \mu_y + C_1)(2\sigma_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)} \tag{13}$$
where x is the original image and y denotes the segmented image; $\mu_x$ and $\mu_y$ denote the mean intensity for the reference image x and image y, respectively. In addition, $\sigma_x$ and $\sigma_y$ describe the standard deviation of the original and segmented images, and $\sigma_{xy}$ refers to the covariance between the original and segmented images. $C_1 = (K_1 L)^2$ and $C_2 = (K_2 L)^2$ are constants, in which $K_1$ and $K_2$ should be less than 1 and L is the number of gray levels of a grayscale image or an image in three channels, such as the RGB image.
3.2. Classification Performance Using the VGG-19 with Transfer Learning
The VGG-19 is a deep and wide structure in which the number of computational parameters is well-optimized. In particular, the following parameters were configured for training the network: epochs (300), hidden layer activation function (Tansig), output activation function (Softmax), initial learning rate (0.00001), and batch size (60).
In this research, the division of 16,010 tomato leaf images (100%) in the Plant Village dataset is performed as follows: 80% of the image set for training and validation, and 20% of the image set for testing. In addition, the images for training and validation are divided into the training (80%) and the validation (20%).
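The nested 80/20 split can be worked out as follows. The rounding to whole images is an assumption, since the exact image counts per subset are not given, and the function name `split_sizes` is hypothetical.

```python
TOTAL_IMAGES = 16010  # size of the Plant Village tomato subset used here


def split_sizes(total, test_fraction=0.2, val_fraction=0.2):
    """Hold out the test set first, then split the rest into train/validation."""
    n_test = round(total * test_fraction)
    n_trainval = total - n_test
    n_val = round(n_trainval * val_fraction)   # rounding is an assumption
    n_train = n_trainval - n_val
    return n_train, n_val, n_test


n_train, n_val, n_test = split_sizes(TOTAL_IMAGES)
print("training:  ", n_train)
print("validation:", n_val)
print("testing:   ", n_test)
print(f"effective fractions: {n_train / TOTAL_IMAGES:.0%} / "
      f"{n_val / TOTAL_IMAGES:.0%} / {n_test / TOTAL_IMAGES:.0%}")
```

In terms of the whole dataset, the scheme therefore corresponds to roughly 64% of the images for training, 16% for validation, and 20% for testing.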
Figure 9 shows the confusion matrix of the model, where 1 is bacterial spot disease, 2 is early blight disease, 3 is late blight disease, 4 is leaf mold disease, 5 is Septoria leaf spot disease, 6 is two-spotted spider mites, 7 is target spot disease, 8 is yellow leaf curl virus disease, 9 is Mosaic virus disease, and 10 is the healthy leaf. We observed that the SEN value of most leaf diseases is high, at over 99%, except for the mosaic virus class, at 97.4%. This may be because the mosaic virus class has only 373 images, compared with about 1000 to 3000 images for the other classes. To improve this, an up-sampling technique can be applied to increase the number of images, or a balanced class weight method can be used to enhance the accuracy for the mosaic class. Furthermore, the PPV value of most diseases is perfect at 100%. Overall, the model has an average accuracy of 99.71%, which demonstrates that the proposed model is very effective for classifying tomato leaf diseases.
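One common way to realize the balanced class weight idea is to weight each class by total / (number of classes × class count), the same heuristic scikit-learn uses for `class_weight="balanced"`. The sketch below applies it to the image counts of Table 1; it is an illustration of the suggestion, not something evaluated in this article, and in practice the counts of the training split would be used instead of the full dataset.

```python
# Image counts per class, taken from Table 1.
IMAGES_PER_CLASS = {
    "Bacterial spot": 2127,
    "Septoria leaf spot": 1771,
    "Mosaic virus": 373,
    "Leaf mold": 952,
    "Target spot": 1404,
    "Early blight": 1000,
    "Yellow leaf curl virus": 3208,
    "Late blight": 1908,
    "Two-spotted spider mites": 1676,
    "Healthy": 1591,
}


def balanced_class_weights(counts):
    """weight_c = total / (number_of_classes * count_c)."""
    total = sum(counts.values())
    n_classes = len(counts)
    return {name: total / (n_classes * n) for name, n in counts.items()}


weights = balanced_class_weights(IMAGES_PER_CLASS)
for name, weight in sorted(weights.items(), key=lambda item: -item[1]):
    print(f"{name:26s} {weight:.3f}")
```

With these weights, an error on a mosaic virus image contributes several times more to a weighted loss than an error on a yellow leaf curl virus image, which offsets the difference in class sizes.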
3.3. Effect of Learning Rate Parameter on Classification Performance
We fine-tuned the initial learning rate (LR) value to investigate the suitable training parameters and then evaluated the output performance. The result of each value is shown in Table 4. In particular, the obtained classification accuracy is 99.72% for LR at 0.00001, 99.34% for LR at 0.0001, and 99.02% for LR at 0.001, respectively. The model performance decreases when increasing the LR value because the system does not reach the best convergence point. Therefore, the best LR is 0.00001 for the classifier of the tomato leaf diseases.
3.4. Effect of Epoch Parameter on Classification Performance
The number of epochs is the number of passes over the training data before training is stopped. The model will not fit the training data (under-fitting) when the number of epochs is too small, and it will over-fit when this value is too large. In both cases, the classification result is not good. However, the suitable number of epochs cannot be calculated in advance and must be chosen based on the model and dataset. Table 5 shows the results for epoch values from 50 to 400. We observed that the accuracy is highest, at 99.72%, at 300 epochs. The accuracy is 98.50% at 50 epochs because the network has not been trained enough (under-fitting). Conversely, the accuracy is 99.66% at 400 epochs because the network has been trained for too long (over-fitting). In addition, Figure 10 shows that, with 300 epochs, the loss function value gradually approaches its minimum. From these results, 300 epochs is the best value for achieving optimal performance.
3.5. Comparison of Performance Results with Different Models
Different models, namely AlexNet, GoogleNet, and ResNet50, were also applied to tomato leaf disease classification for comparison. Most network architectures gave relatively good results, as shown in Figure 11. In this experiment, we used the tomato leaf image dataset divided into the sub-datasets for training and testing, as described in Section 3.2. With the same parameters, AlexNet has an accuracy of 99.16%, and ResNet has an accuracy of 99.19%. GoogleNet shows an accuracy slightly higher than these two models, at 99.38%. Meanwhile, our model achieved the highest accuracy, at 99.72%, with VGG-19.
3.6. Comparison of Time for Training and Testing between Segmented and Non-Segmented Images
In this paper, we also compare the leaf disease classification performance and time between non-segmented and segmented images. Table 6 presents the classification performance as well as the training and testing times for the two sets of non-segmented images and images segmented using the HSV color space method. The classification performance with the segmented images is higher than that with the non-segmented ones; the classification accuracies in the two cases are 99.72% and 99.63%, respectively. Training and testing are also faster with the segmented images: the network using the segmented images takes 2.75 × 10^5 s (training) and 29.30 s (testing), while the network using the non-segmented images takes 2.98 × 10^5 s (training) and 35.27 s (testing). The reason for the better performance and time is that the network using the segmented images focuses only on the leaf region, since the background of every image is uniformly black. In contrast, with the original images, the network is trained on both the leaf region and the background.
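The size of the time saving can be made explicit with a short calculation on the values of Table 6. The helper name `relative_saving` is illustrative.

```python
# Training and testing times from Table 6, in seconds.
TRAIN_SEGMENTED, TRAIN_ORIGINAL = 2.75e5, 2.98e5
TEST_SEGMENTED, TEST_ORIGINAL = 29.30, 35.27


def relative_saving(segmented, original):
    """Fraction of time saved by using segmented instead of original images."""
    return (original - segmented) / original


train_saving = relative_saving(TRAIN_SEGMENTED, TRAIN_ORIGINAL)
test_saving = relative_saving(TEST_SEGMENTED, TEST_ORIGINAL)

print(f"training time saved: {train_saving:.1%}")
print(f"testing time saved:  {test_saving:.1%}")
```

On these figures, segmentation shortens training by a little under 8% and testing by about 17%.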
4. Discussion
The experimental results of this article were compared with previous works, in which the Plant Village dataset of tomato leaf diseases was used in all deep-learning network models. Indicators related to pre-processing, transfer learning, model, classes of images, and other parameters are described in Table 7.
Without transfer learning, the features can be extracted from convolutional layers in the network or extracted by traditional feature extraction methods. Gadekallu et al. [24] proposed the principal component analysis–whale optimization algorithm (PCA-WOA) method for extracting features and then used it in a multilayer perceptron (MLP) network. This network has one input layer, two hidden layers, and one output layer for classifying ten different tomato leaf diseases. The accuracy of the obtained classifier was 94.00%. Agarwal et al. [28] and Trivedi et al. [54] developed convolutional layers in the CNN architecture for feature extraction and fully connected layers for disease classification. These studies applied pre-processing and transforming images to achieve model accuracies of 98.40% and 98.49%, respectively. In addition, the RGB images were pre-processed to produce the same size and grey images. This research only focused on applying the CNN, in which features of colors, textures, and edges were extracted for training. With transfer learning, Maeda-Gutiérrez et al. [53] and Brahimi et al. [31] proposed the GoogleNet architectures for classifying ten disease types. Their model accuracies were 99.39% and 99.35%, respectively.
Compared with the above studies, we optimized the tomato leaf dataset by segmenting to extract the leaf region and the black background. It is obvious that the segmentation of the leaf images reduced the training time and increased the classification performance. In addition, compared with previous research [24,28,31,53,54] without transfer learning, our VGG-19 model applied the transfer learning method with a learning rate of 0.00001, and epochs of 300; the classification results are outstanding, particularly with PPV at 99.49%, SEN at 99.69%, F1S at 99.59%, and ACC at 99.72%, respectively.
Finally, Table 7 shows that the previous studies [31,53] and ours use basically the same approach (transfer learning and ten classes), with a small difference in the network model: GoogleNet [31,53] versus VGG-19 (our model). The main difference is that our research applied the HSV color space for image segmentation, whereas the two studies [31,53] did not apply preprocessing to the image sets. As a result, the classification accuracy of our proposed method is slightly higher: the ACC value is 99.72%, compared with 99.35% and 99.39%, respectively.
5. Conclusions
This paper proposed a classification model in which segmented tomato leaf images are used for training the VGG-19 with the transfer learning method. The model used nine disease types and one healthy type obtained from the Plant Village database, which were segmented using the HSV color space to extract the original leaf regions on black backgrounds. In addition, the learning rate and number of epochs of the VGG-19 with transfer learning were selected to obtain the best classification network. In particular, the classification accuracy is 99.72%, higher than that of the previously proposed works compared in Section 4, whose ACC values ranged from 94% to 99.39%. Moreover, the training time (2.75 × 10^5 s) and testing time (29.30 s) of the VGG-19 with the segmented images are shorter than those of the VGG-19 without image segmentation (2.98 × 10^5 s and 35.27 s). The experimental results using the segmented leaf images and the best choice of the network parameters demonstrated the effectiveness of the proposed model. Moreover, the model with leaf image segmentation and the VGG-19 architecture with transfer learning can be further developed for classifying other leaf image datasets.
Conceptualization, methodology, simulation, and writing original draft, B.-V.N. and T.-N.N.; methodology, supervision, validation and writing—review and editing, T.-H.N. All authors have read and agreed to the published version of the manuscript.
The data presented in this study are openly available at
The authors declare no conflict of interest.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Figure 1. Proposed model for classifying tomato leaf disease and classification accuracy evaluation.
Figure 3. Ten samples of tomato leaf disease and healthy images: (a) Bacterial spot disease; (b) Early blight disease; (c) Late blight disease; (d) Leaf mold disease; (e) Septoria leaf spot disease; (f) Two-spotted spider mites; (g) Target spot disease; (h) Yellow leaf curl virus disease; (i) Mosaic virus disease; (j) Healthy.
Figure 4. Representation of the processed image: (a) the RGB image; (b) the HSV image converted from the RGB image.
Figure 5. Input image in the HSV color space and the histograms of its components: (a) Hue component image; (b) Saturation component image; (c) Value component image; (d) Histogram of Hue component image; (e) Histogram of Saturation component image; (f) Histogram of Value component image.
Figure 6. The results of the segmented image: (a) the RGB input image; (b) the binary image processed using one mask; (c) the image with the leaf region and the black background.
Figure 7. Ten segmented leaf images including nine leaf disease images and one healthy image: (a) Bacterial spot disease; (b) Early blight disease; (c) Late blight disease; (d) Leaf mold disease; (e) Septoria leaf spot disease; (f) Two-spotted spider mites; (g) Target spot disease; (h) Yellow leaf curl virus disease; (i) Mosaic virus disease; (j) Healthy leaf.
Figure 8. Five segmented field tomato leaf images: (a) The field images of bacterial spot disease; (b) Early blight disease; (c) Late blight disease; (d) Septoria leaf spot disease; (e) Healthy leaf; the segmented field images of (f) Bacterial spot disease; (g) Early blight disease; (h) Late blight disease; (i) Septoria leaf spot disease; (j) Healthy leaf.
Figure 9. Classification evaluation results of nine leaf diseases and one healthy leaf: (1) Bacterial spot disease, (2) Early blight disease, (3) Late blight disease, (4) Leaf mold disease, (5) Septoria leaf spot disease, (6) Two-spotted spider mites, (7) Target spot disease, (8) Yellow leaf curl virus disease, (9) Mosaic virus disease, (10) Healthy leaf.
Figure 10. The representation of the loss function values during training the network using 300 epochs.
Figure 11. Classification performance comparison of tomato leaf diseases using four different deep-learning network architectures.
Table 1. The total tomato leaf images used in this research.
No. | Class of Tomato Leaf Images | Images |
---|---|---|
1 | Tomato bacterial spot disease | 2127 |
2 | Tomato Septoria leaf spot disease | 1771 |
3 | Mosaic virus disease | 373 |
4 | Leaf mold disease | 952 |
5 | Target spot disease | 1404 |
6 | Early blight disease | 1000 |
7 | Yellow leaf curl virus disease | 3208 |
8 | Tomato late blight disease | 1908 |
9 | Two-spotted spider mites | 1676 |
10 | Healthy leaf | 1591 |
Total of tomato leaf images | 16,010 |
Representation of the confusion matrix with many layers [
(Confusion matrix layout: predicted classes in columns, true classes in rows; cell contents not recoverable.)
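The five measures reported in the evaluation tables (ACC, SEN, SPE, PPV, F1S) can all be derived from a multi-class confusion matrix. The following is a minimal sketch, assuming a one-vs-rest reading of each class and a macro-average over classes; the source does not state which averaging was used, and the function names and the toy 3-class matrix are illustrative only, not the paper's data.

```python
def per_class_metrics(cm):
    """Compute one-vs-rest metrics for each class.

    cm[i][j] is the number of samples whose true class is i and whose
    predicted class is j (rows = true classes, columns = predicted classes).
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    results = []
    for k in range(n):
        tp = cm[k][k]                              # class k predicted as k
        fn = sum(cm[k]) - tp                       # class k predicted as something else
        fp = sum(cm[i][k] for i in range(n)) - tp  # other classes predicted as k
        tn = total - tp - fn - fp                  # everything else
        sen = tp / (tp + fn) if tp + fn else 0.0   # sensitivity (recall)
        spe = tn / (tn + fp) if tn + fp else 0.0   # specificity
        ppv = tp / (tp + fp) if tp + fp else 0.0   # positive predictive value
        f1s = 2 * ppv * sen / (ppv + sen) if ppv + sen else 0.0
        acc = (tp + tn) / total                    # one-vs-rest accuracy
        results.append({"ACC": acc, "SEN": sen, "SPE": spe, "PPV": ppv, "F1S": f1s})
    return results


def macro_average(results):
    """Average each metric over all classes (assumed aggregation)."""
    return {key: sum(r[key] for r in results) / len(results) for key in results[0]}


# Toy 3-class confusion matrix (hypothetical values, not from the paper).
toy_cm = [
    [8, 1, 1],
    [0, 9, 1],
    [1, 0, 9],
]
metrics = per_class_metrics(toy_cm)
summary = macro_average(metrics)
for name, value in summary.items():
    print(f"{name}: {100 * value:.2f}%")
```

For the ten-class problem in this article the same functions would be applied to a 10 × 10 matrix.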
Description of the SSIM values between 10 segmented leaf images and the original leaf ones.
No. | Image Name | SSIM |
---|---|---|
1 | Bacterial spot disease | 0.914 |
2 | Early blight disease | 0.905 |
3 | Late blight disease | 0.929 |
4 | Leaf mold disease | 0.908 |
5 | Septoria leaf spot disease | 0.912 |
6 | Two-spotted spider mites | 0.946 |
7 | Target spot disease | 0.924 |
8 | Yellow leaf curl virus disease | 0.917 |
9 | Mosaic virus disease | 0.922 |
10 | Healthy leaf | 0.924 |
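To make the SSIM values above easier to interpret, here is a simplified sketch of the structural similarity index of Wang et al. [52], computed from global image statistics. It is an assumption-laden illustration: the standard index averages the measure over local windows, and the source does not state which window settings were used, so this function will not reproduce the tabulated values. The function name and the toy pixel lists are hypothetical.

```python
def global_ssim(x, y, dynamic_range=255.0):
    """Single-window SSIM between two equally sized grayscale images.

    x and y are flat lists of pixel intensities. The constants follow the
    usual choice C1 = (0.01 L)^2 and C2 = (0.03 L)^2, where L is the
    dynamic range of the pixel values.
    """
    if len(x) != len(y) or not x:
        raise ValueError("images must be non-empty and of equal size")
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Toy 3x3 "image" flattened to a list, and a copy with two background
# pixels set to black, mimicking the effect of segmentation.
original = [120, 130, 125, 200, 210, 90, 95, 100, 110]
segmented = [120, 130, 125, 0, 0, 90, 95, 100, 110]

print(f"identical images: {global_ssim(original, original):.3f}")
print(f"after blacking out background: {global_ssim(original, segmented):.3f}")
```

An index of 1 means the two images are identical; values close to 1, as in the table, indicate that segmentation changed little beyond the background.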
Classification evaluation results (in %) for the confusion matrix parameters at three different learning rates.
Learning Rate | 0.00001 | 0.0001 | 0.001 |
---|---|---|---|
ACC | 99.72 | 99.34 | 99.02 |
SEN | 99.69 | 99.36 | 98.48 |
SPE | 99.90 | 99.77 | 99.66 |
PPV | 99.49 | 99.23 | 98.75 |
F1S | 99.59 | 99.29 | 99.00 |
Classification evaluation results (in %) for the confusion matrix parameters at five different numbers of epochs.
Epochs | 50 | 100 | 200 | 300 | 400 |
---|---|---|---|---|---|
ACC | 98.50 | 99.13 | 99.53 | 99.72 | 99.66 |
SEN | 98.61 | 99.18 | 99.54 | 99.69 | 99.59 |
SPE | 99.32 | 99.66 | 99.83 | 99.90 | 99.89 |
PPV | 98.16 | 98.75 | 99.38 | 99.49 | 99.43 |
F1S | 98.39 | 98.97 | 99.46 | 99.59 | 99.51 |
Comparison of time for training and testing between non-segmented images and segmented images.
Preprocessing | ACC | SEN | SPE | PPV | F1S | Training Time | Testing Time |
---|---|---|---|---|---|---|---|
Segmented images | 99.72 | 99.69 | 99.90 | 99.49 | 99.59 | 2.75 × 10^5 | 29.30 |
Non-segmented images | 99.63 | 99.58 | 99.82 | 99.37 | 99.47 | 2.98 × 10^5 | 35.27 |
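The time savings in this comparison can be expressed as relative reductions. The short calculation below is a sketch that reads the training times as 2.75 × 10^5 and 2.98 × 10^5 (an assumption about the lost superscript; the ratio does not depend on the exponent or on the time unit, which the table does not give). The helper name is illustrative.

```python
def relative_reduction(baseline, improved):
    """Percentage by which `improved` is smaller than `baseline`."""
    return (baseline - improved) / baseline * 100.0


# Values taken from the table; units are not stated in the source.
train_non_segmented = 2.98e5
train_segmented = 2.75e5
test_non_segmented = 35.27
test_segmented = 29.30

train_saving = relative_reduction(train_non_segmented, train_segmented)
test_saving = relative_reduction(test_non_segmented, test_segmented)

print(f"training time reduction: {train_saving:.1f}%")
print(f"testing time reduction: {test_saving:.1f}%")
```

On these figures, segmentation shortens training by roughly 7.7% and testing by roughly 16.9%.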
Tomato leaf disease comparison between previous methods and results.
Work | Model | Transfer Learning | Classes | Preprocessing | ACC |
---|---|---|---|---|---|
Maeda-Gutiérrez et al. | GoogleNet | Yes | 10 | No | 99.39% |
Brahimi et al. | GoogleNet | Yes | 10 | No | 99.35% |
Agarwal et al. | CNN model | No | 10 | No | 98.40% |
Gadekallu et al. | MLP | No | 10 | PCA-WOA | 94.00% |
Trivedi et al. | CNN model | No | 10 | Transformed into grey images | 98.49% |
Our proposed model | VGG-19 | Yes | 10 | HSV color space for image segmentation | 99.72% |
References
1. Mukhtar, T.; Rehman, S.U.; Smith, D.; Sultan, T.; Seleiman, M.F.; Alsadon, A.A.; Amna; Ali, S.; Chaudhary, H.J.; Solieman, T.H.I. et al. Mitigation of Heat Stress in Solanum lycopersicum L. by ACC-deaminase and Exopolysaccharide Producing Bacillus cereus: Effects on Biochemical Profiling. Sustainability; 2020; 12, 2159. [DOI: https://dx.doi.org/10.3390/su12062159]
2. Duffy, S.; Holmes, E.C. Multiple introductions of the Old World begomovirus Tomato yellow leaf curl virus into the New World. Appl. Environ. Microbiol.; 2007; 73, pp. 7114-7117. [DOI: https://dx.doi.org/10.1128/AEM.01150-07] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/17827315]
3. Moriones, E.; Praveen, S.; Chakraborty, S. Tomato Leaf Curl New Delhi Virus: An Emerging Virus Complex Threatening Vegetable and Fiber Crops. Viruses; 2017; 9, 264. [DOI: https://dx.doi.org/10.3390/v9100264]
4. Cañizares, M.C.; Rosas-Díaz, T.; Rodríguez-Negrete, E.; Hogenhout, S.A.; Bedford, I.D.; Bejarano, E.R.; Navas-Castillo, J.; Moriones, E. Arabidopsis thaliana, an experimental host for tomato yellow leaf curl disease-associated begomoviruses by agroinoculation and whitefly transmission. Plant Pathol.; 2015; 64, pp. 265-271. [DOI: https://dx.doi.org/10.1111/ppa.12270]
5. Yin, C.; Zeng, T.; Zhang, H.; Fu, W.; Wang, L.; Yao, S. Maize Small Leaf Spot Classification Based on Improved Deep Convolutional Neural Networks with a Multi-Scale Attention Mechanism. Agronomy; 2022; 12, 906. [DOI: https://dx.doi.org/10.3390/agronomy12040906]
6. Bahrami, H.; Homayouni, S.; Safari, A.; Mirzaei, S.; Mahdianpari, M.; Reisi-Gahrouei, O. Deep Learning-Based Estimation of Crop Biophysical Parameters Using Multi-Source and Multi-Temporal Remote Sensing Observations. Agronomy; 2021; 11, 1363. [DOI: https://dx.doi.org/10.3390/agronomy11071363]
7. Hassan, S.M.; Jasinski, M.; Leonowicz, Z.; Jasinska, E.; Maji, A.K. Plant Disease Identification Using Shallow Convolutional Neural Network. Agronomy; 2021; 11, 2388. [DOI: https://dx.doi.org/10.3390/agronomy11122388]
8. Suman, T.; Dhruvakumar, T. Classification of Paddy Leaf Diseases Using Shape and Color Features. Int. J. Electr. Electron. Eng.; 2015; 7, pp. 239-250.
9. Sanyal, P.; Patel, S.C. Pattern recognition method to detect two diseases in rice plants. Imaging Sci. J.; 2008; 56, pp. 319-325. [DOI: https://dx.doi.org/10.1179/174313108X319397]
10. Suresh, M.V.; Gopinath, D.; Hemavarthini, M.; Jayanthan, K.; Krishnan, M. Plant Disease Detection using Image Processing. Int. J. Eng. Res. Technol. (IJERT); 2020; 9, pp. 78-82. [DOI: https://dx.doi.org/10.17577/IJERTV9IS030114]
11. Zhou, C.; Zhou, S.; Xing, J.; Song, J. Tomato Leaf Disease Identification by Restructured Deep Residual Dense Network. IEEE Access; 2021; 9, pp. 28822-28831. [DOI: https://dx.doi.org/10.1109/ACCESS.2021.3058947]
12. Liu, J.; Wang, X. Early recognition of tomato gray leaf spot disease based on MobileNetv2-YOLOv3 model. Plant Methods; 2020; 16, pp. 83-99. [DOI: https://dx.doi.org/10.1186/s13007-020-00624-2] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/32523613]
13. Ahmed, S.; Hasan, M.B.; Ahmed, T.; Sony, R.K.; Kabir, M.H. Less is More: Lighter and Faster Deep Neural Architecture for Tomato Leaf Disease Classification. IEEE Access; 2022; 10, pp. 68868-68884. [DOI: https://dx.doi.org/10.1109/ACCESS.2022.3187203]
14. Tan, L.; Lu, J.; Jiang, H. Tomato Leaf Diseases Classification Based on Leaf Images: A Comparison between Classical Machine Learning and Deep Learning Methods. AgriEngineering; 2021; 3, pp. 542-558. [DOI: https://dx.doi.org/10.3390/agriengineering3030035]
15. Mohanty, S.P.; Hughes, D.P.; Salathé, M. Using Deep Learning for Image-Based Plant Disease Detection. Front. Plant Sci.; 2016; 7, 1419. [DOI: https://dx.doi.org/10.3389/fpls.2016.01419]
16. Wu, Q.; Chen, Y.; Meng, J. DCGAN-Based Data Augmentation for Tomato Leaf Disease Identification. IEEE Access; 2020; 8, pp. 98716-98728. [DOI: https://dx.doi.org/10.1109/ACCESS.2020.2997001]
17. Sladojevic, S.; Arsenovic, M.; Anderla, A.; Culibrk, D.; Stefanovic, D. Deep Neural Networks Based Recognition of Plant Diseases by Leaf Image Classification. Comput. Intell. Neurosci.; 2016; 2016, pp. 3289801-3289813. [DOI: https://dx.doi.org/10.1155/2016/3289801]
18. Liu, B.; Zhang, Y.; He, D.; Li, Y. Identification of Apple Leaf Diseases Based on Deep Convolutional Neural Networks. Symmetry; 2018; 10, 11. [DOI: https://dx.doi.org/10.3390/sym10010011]
19. Lu, Y.; Yi, S.; Zeng, N.; Liu, Y.; Zhang, Y. Identification of rice diseases using deep convolutional neural networks. Neurocomputing; 2017; 267, pp. 378-384. [DOI: https://dx.doi.org/10.1016/j.neucom.2017.06.023]
20. Fuentes, A.; Yoon, S.; Kim, S.C.; Park, D.S. A Robust Deep-Learning-Based Detector for Real-Time Tomato Plant Diseases and Pests Recognition. Sensors; 2017; 17, 2022. [DOI: https://dx.doi.org/10.3390/s17092022]
21. Saranya, S.M.; Rajalaxmi, R.R.; Prabavathi, R.; Suganya, T.; Mohanapriya, S.; Tamilselvi, T. Deep Learning Techniques in Tomato Plant—A Review. J. Phys. Conf. Ser.; 2021; 1767, pp. 012010-012024. [DOI: https://dx.doi.org/10.1088/1742-6596/1767/1/012010]
22. Yang, G.; Chen, G.; He, Y.; Yan, Z.; Guo, Y.; Ding, J. Self-Supervised Collaborative Multi-Network for Fine-Grained Visual Categorization of Tomato Diseases. IEEE Access; 2020; 8, pp. 211912-211923. [DOI: https://dx.doi.org/10.1109/ACCESS.2020.3039345]
23. Kaggle: Dataset of diseased plant leaf images and corresponding labels. Tomato Leaf Disease. 2019; Available online: https://www.kaggle.com/emmarex/plantdisease (accessed on 12 July 2022).
24. Gadekallu, T.R.; Rajput, D.S.; Reddy, M.P.K.; Lakshmanna, K.; Bhattacharya, S.; Singh, S.; Jolfaei, A.; Alazab, M. A novel PCA–whale optimization-based deep neural network model for classification of tomato plant diseases using GPU. J. Real-Time Image Process.; 2021; 18, pp. 1383-1396. [DOI: https://dx.doi.org/10.1007/s11554-020-00987-8]
25. Wen, J.; Shi, Y.; Zhou, X.; Xue, Y. Crop Disease Classification on Inadequate Low-Resolution Target Images. Sensors; 2020; 20, 4601. [DOI: https://dx.doi.org/10.3390/s20164601] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/32824352]
26. Aravind, K.R.; Raja, P. Automated disease classification in (Selected) agricultural crops using transfer learning. Automatika; 2020; 61, pp. 260-272. [DOI: https://dx.doi.org/10.1080/00051144.2020.1728911]
27. Moyazzoma, R.; Hossain, M.A.A.; Anuz, M.H.; Sattar, A. Transfer Learning Approach for Plant Leaf Disease Detection Using CNN with Pre-Trained Feature Extraction Method Mobilnetv2. Proceedings of the 2021 2nd International Conference on Robotics, Electrical and Signal Processing Techniques (ICREST); Dhaka, Bangladesh, 5–7 January 2021; pp. 526-529.
28. Agarwal, M.; Gupta, S.K.; Biswas, K.K. Development of Efficient CNN model for Tomato crop disease identification. Sustain. Comput. Inform. Syst.; 2020; 28, pp. 100407-100432. [DOI: https://dx.doi.org/10.1016/j.suscom.2020.100407]
29. Verma, S.; Chug, A.; Singh, A.P. Application of convolutional neural networks for evaluation of disease severity in tomato plant. J. Discret. Math. Sci. Cryptogr.; 2020; 23, pp. 273-282. [DOI: https://dx.doi.org/10.1080/09720529.2020.1721890]
30. Rangarajan, A.K.; Purushothaman, R.; Ramesh, A. Tomato crop disease classification using pre-trained deep learning algorithm. Procedia Comput. Sci.; 2018; 133, pp. 1040-1047. [DOI: https://dx.doi.org/10.1016/j.procs.2018.07.070]
31. Brahimi, M.; Boukhalfa, K.; Moussaoui, A. Deep Learning for Tomato Diseases: Classification and Symptoms Visualization. Appl. Artif. Intell.; 2017; 31, pp. 299-315. [DOI: https://dx.doi.org/10.1080/08839514.2017.1315516]
32. Kaur, R. A Brief Review on Plant Disease Detection using in Image Processing. Int. J. Comput. Sci. Mob. Comput.; 2017; 6, pp. 101-106.
33. Kumar, S.; Kaur, R. Plant Disease Detection using Image Processing—A Review. Int. J. Comput. Appl.; 2015; 124, pp. 6-9. [DOI: https://dx.doi.org/10.5120/ijca2015905789]
34. Halder, M.; Sarkar, A.; Bahar, H. Plant Disease Detection By Image Processing: A Literature Review. J. Food Sci. Technol.; 2019; 3, pp. 534-538. [DOI: https://dx.doi.org/10.25177/JFST.3.6.6]
35. Jagtap, S.B.; Hambarde, S.M. Agricultural Plant Leaf Disease Detection and Diagnosis Using Image Processing Based on Morphological Feature Extraction. IOSR J. VLSI Signal Process.; 2014; 4, pp. 24-30. [DOI: https://dx.doi.org/10.9790/4200-04512430]
36. Kamlapurkar, S.R. Detection of Plant Leaf Disease Using Image Processing Approach. Int. J. Sci. Res. Publ.; 2016; 6, pp. 73-76.
37. Lu, J.; Ehsani, R.; Shi, Y.; de Castro, A.I.; Wang, S. Detection of multi-tomato leaf diseases (late blight, target and bacterial spots) in different stages by using a spectral-based sensor. Sci. Rep.; 2018; 8, pp. 2793-2804. [DOI: https://dx.doi.org/10.1038/s41598-018-21191-6]
38. Hlaing, C.S.; Zaw, S.M.M. Tomato Plant Diseases Classification Using Statistical Texture Feature and Color Feature. Proceedings of the 2018 IEEE/ACIS 17th International Conference on Computer and Information Science (ICIS); Singapore, 6–8 June 2018; pp. 439-444.
39. Barbedo, J.G.A. Digital image processing techniques for detecting, quantifying and classifying plant diseases. SpringerPlus; 2013; 2, pp. 660-672. [DOI: https://dx.doi.org/10.1186/2193-1801-2-660] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/24349961]
40. Almadhor, A.; Rauf, H.T.; Lali, M.I.U.; Damaševičius, R.; Alouffi, B.; Alharbi, A. AI-Driven Framework for Recognition of Guava Plant Diseases through Machine Learning from DSLR Camera Sensor Based High Resolution Imagery. Sensors; 2021; 21, 3830. [DOI: https://dx.doi.org/10.3390/s21113830]
41. Zhang, S.; You, Z.; Wu, X. Plant disease leaf image segmentation based on superpixel clustering and EM algorithm. Neural Comput. Appl.; 2019; 31, pp. 1225-1232. [DOI: https://dx.doi.org/10.1007/s00521-017-3067-8]
42. Singh, V.; Misra, A.K. Detection of plant leaf diseases using image segmentation and soft computing techniques. Inf. Process. Agric.; 2017; 4, pp. 41-49. [DOI: https://dx.doi.org/10.1016/j.inpa.2016.10.005]
43. Zhang, S.; Wang, H.; Huang, W.; You, Z. Plant diseased leaf segmentation and recognition by fusion of superpixel, K-means and PHOG. Optik; 2018; 157, pp. 866-872. [DOI: https://dx.doi.org/10.1016/j.ijleo.2017.11.190]
44. Storey, G.; Meng, Q.; Li, B. Leaf Disease Segmentation and Detection in Apple Orchards for Precise Smart Spraying in Sustainable Agriculture. Sustainability; 2022; 14, 1458. [DOI: https://dx.doi.org/10.3390/su14031458]
45. Hughes, D.P.; Salathé, M. An Open Access Repository of Images on Plant Health to Enable the Development of Mobile Disease Diagnostics. arXiv; 2015; 13. arXiv: 1511.08060
46. Kumar, J.P.; Domnic, S. Image based leaf segmentation and counting in rosette plants. Inf. Process. Agric.; 2019; 6, pp. 233-246.
47. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet Classification with Deep Convolutional Neural Networks. NIPS; 2012; 60, pp. 84-90. [DOI: https://dx.doi.org/10.1145/3065386]
48. Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going deeper with convolutions. Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR); Boston, MA, USA, 7–12 June 2015; pp. 1-9.
49. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. arXiv; 2016; arXiv: 1512.03385. Available online: https://arxiv.org/abs/1512.03385 (accessed on 12 July 2022).
50. Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. Proceedings of the 3rd International Conference on Learning Representations; San Diego, CA, USA, 7–9 May 2015; pp. 1-14.
51. Hai, N.T.; Nguyen, N.T.; Nguyen, M.H.; Livatino, S. Wavelet-Based Kernel Construction for Heart Disease Classification. Adv. Electr. Electron. Eng. J.; 2019; 17, pp. 306-319. [DOI: https://dx.doi.org/10.15598/aeee.v17i3.3270]
52. Wang, Z.; Bovik, A.C.; Sheikh, H.R.; Simoncelli, E.P. Image quality assessment: From error visibility to structural similarity. IEEE Trans. Image Process.; 2004; 13, pp. 600-612. [DOI: https://dx.doi.org/10.1109/TIP.2003.819861]
53. Maeda-Gutiérrez, V.; Galván-Tejada, C.E.; Zanella-Calzada, L.A.; Celaya-Padilla, J.M.; Galván-Tejada, J.I.; Gamboa-Rosales, H.; Luna-García, H.; Magallanes-Quintanar, R.; Guerrero Méndez, C.A.; Olvera-Olvera, C.A. Comparison of Convolutional Neural Network Architectures for Classification of Tomato Plant Diseases. Appl. Sci.; 2020; 10, 1245. [DOI: https://dx.doi.org/10.3390/app10041245]
54. Trivedi, N.K.; Gautam, V.; Anand, A.; Aljahdali, H.M.; Villar, S.G.; Anand, D.; Goyal, N.; Kadry, S. Early Detection and Classification of Tomato Leaf Disease Using High-Performance Deep Neural Network. Sensors; 2021; 21, 7987. [DOI: https://dx.doi.org/10.3390/s21237987]
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
Tomato leaves are affected by a range of diseases that can reduce harvest performance, so accurate classification for early detection and treatment is important. This article proposes a classification model in which 16,010 tomato leaf images from the Plant Village database are segmented before being used to train a deep convolutional neural network (DCNN). Segmenting the images reduces training time compared with training the same model on non-segmented images. In particular, we applied a VGG-19 model with transfer learning, re-training the later layers, and chose the number of epochs and the learning rate to increase classification performance. The leaf images were segmented in the hue, saturation, and value (HSV) color space to extract the leaf regions and set the backgrounds to black, so that all leaf images share a uniform black background. This segmentation shortens DCNN training and also increases classification performance: the approach reaches an accuracy of 99.72% and decreases the training time on the 16,010 tomato leaf images. The results illustrate that the model is effective and can be developed for more complex image datasets.
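The HSV-based segmentation described in the abstract can be illustrated with a small sketch. It assumes a single threshold mask on hue, saturation, and value; the threshold values, the function name, and the tiny 2 × 2 test image are hypothetical, since the abstract does not give the thresholds used. Pixels inside the mask keep their RGB values and all others are set to black.

```python
import colorsys


def segment_leaf(rgb_image, hue_range=(0.05, 0.50), min_saturation=0.25, min_value=0.15):
    """Return (mask, segmented) for an image given as rows of (r, g, b) tuples.

    A pixel is treated as leaf when its hue lies in `hue_range` (on the
    0..1 scale used by colorsys) and it is sufficiently saturated and
    bright. All thresholds are illustrative assumptions.
    """
    mask, segmented = [], []
    for row in rgb_image:
        mask_row, seg_row = [], []
        for r, g, b in row:
            # colorsys expects channel values scaled to the range 0..1.
            h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            is_leaf = (hue_range[0] <= h <= hue_range[1]
                       and s >= min_saturation and v >= min_value)
            mask_row.append(1 if is_leaf else 0)
            # Keep leaf pixels, paint the background black.
            seg_row.append((r, g, b) if is_leaf else (0, 0, 0))
        mask.append(mask_row)
        segmented.append(seg_row)
    return mask, segmented


# Tiny test image: gray background, green leaf, yellowish lesion, white glare.
image = [
    [(128, 128, 128), (40, 160, 50)],
    [(200, 180, 60), (250, 250, 250)],
]
mask, segmented = segment_leaf(image)
print(mask)
print(segmented)
```

In practice the thresholds would need tuning per dataset, because lesions with hues outside the chosen range would otherwise be removed together with the background.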