Introduction
With rapid economic development and the continuous improvement of living standards, the types and quantities of waste generated are increasing rapidly. Improper waste management, including careless disposal, incineration, and landfilling, poses a severe threat to human health and causes significant harm to the environment on which humanity depends for survival [1]. According to a 2018 World Bank report, approximately 242 million tons of plastic waste were produced globally in 2016, accounting for 12% of the world’s total solid waste; by 2050, global waste production is projected to reach 3.4 billion tons annually, a significant increase from the current 2.01 billion tons [2]. These figures show that the total amount of waste generated rises year after year; without effective management measures, the environmental impact will be catastrophic and difficult to reverse [3].
Given this situation, there is an urgent need for effective garbage classification schemes. Correct classification of garbage benefits plants, animals, and human life alike [4], mainly in the following respects: (1) recycling the recyclable fraction of garbage can reduce the amount sent to landfill by nearly 60%, effectively improving land-use efficiency; (2) it reduces pollution from waste, protects the ecological environment, and improves air quality; (3) it enables effective recovery of renewable resources: in daily life, 30% to 40% of waste is recyclable, and recovering it increases the recovery rate and secondary utilization of resources. Therefore, reasonable classification and disposal of waste are crucial to achieving waste minimization, harmlessness, and resource recovery [5].
With the advancement of Internet technology and the rise of artificial intelligence, machine learning [6–9] and deep learning [10–13] have developed rapidly and are widely used in image classification. However, although both methods have achieved concrete results, each still has shortcomings. Machine learning requires manual feature design and selection, and capturing high-level semantic information from images is often challenging, which limits classification performance [14]. In addition, owing to the small size of existing trash image datasets, both machine learning and deep learning approaches to trash image classification generally suffer from low classification accuracy and a lack of robustness, while model size and detection speed also need further optimization. Therefore, research on garbage image classification needs to explore more advanced and practical solutions.
This paper proposes a new garbage image classification model that uses ResNet-50 as the backbone network. In this model, a redundancy-weighted feature fusion module is proposed, combined with depth-separable convolution techniques to optimize ResNet-50. These two techniques effectively reduce the number of model parameters while improving classification efficiency. Additionally, the standard Focal Loss is weighted to mitigate the impact of class imbalance on model performance, enhancing the model’s robustness. Experimental results on the TrashNet dataset show that the proposed model significantly improves classification accuracy and robustness while maintaining fewer parameters and faster detection speed. The main contributions of this paper are as follows:
1. This paper proposes a redundancy-weighted feature fusion module that reduces the computational cost of the network by removing duplicate information from multi-scale features. At the same time, the feature information at different scales is weighted by learnable coefficients, ensuring that the model can fully exploit task-relevant feature information during fusion, thereby improving its expressive ability and classification accuracy.
2. In this paper, the standard 3×3 convolution in ResNet-50 is replaced with depth-separable convolution to reduce the number of parameters and computational complexity of the model. This improvement not only preserves the feature characterization capability of the original convolutional structure but also significantly improves the computational efficiency of the network, making the model more suitable for resource-constrained devices or environments.
3. To cope with class imbalance, we add a weighting factor to the Focal Loss. This design not only deals effectively with hard-to-classify samples but also alleviates the negative impact of class imbalance on model performance while preserving the robustness of the overall model.
Related works
Researchers have recently proposed a series of garbage image classification methods, usually categorized into two main groups: machine learning-based models and convolutional neural network-based models.
Garbage image classification method based on machine learning
The TrashNet dataset, a publicly available garbage classification dataset, has attracted many researchers to apply various machine learning algorithms to optimize classification accuracy. Yang et al. [15] adopted support vector machine (SVM) technology and achieved 63% classification accuracy. Costa et al. [16] used random forest (RF) and extreme gradient boosting (XGBoost) algorithms, achieving accuracies of 62.61% and 70%, respectively. Satvilkar et al. [17] then adopted the K-nearest neighbor (KNN) algorithm, significantly improving classification performance and achieving an accuracy of 88%.
In addition, a series of machine learning-based studies has focused on other waste datasets. For example, Wu et al. [18] manually extracted texture features from garbage images, preprocessed them, and classified them using the nearest neighbor method. Gundupalli et al. [19] used thermal imaging technology to obtain thermal images of electronic waste and then classified them using the Otsu thresholding method. Bonifazi et al. [20] proposed a technique using shortwave infrared hyperspectral images to differentiate low-density polyethylene (LDPE) from high-density polyethylene (HDPE) in mixed plastic waste, thereby increasing the utilization rate of plastic waste. Xiao et al. [21] classified typical construction waste using near-infrared spectroscopy. Aziz et al. [22] introduced a rotation-invariant solid waste classification, recognition, and detection system that determined possible bin orientations using the Hough transform and classified waste levels into three categories, “empty,” “partially filled,” and “filled,” using support vector machines. Riba et al. [23] addressed garbage classification by analyzing the infrared spectra of waste samples with multivariate statistical methods. Huang et al. [24] researched the extraction and classification of colour features of construction waste, identifying it with HSV threshold segmentation and K-means clustering algorithms at an average processing time of 1.17 seconds. Zheng et al. [14] analyzed the volume of construction waste using the SFS algorithm and classified it using support vector machines.
Although machine learning-based garbage image classification models have achieved some success under specific conditions, their reliance on manually designed features makes it challenging to capture the complex spatial structures and semantic information in images. Furthermore, the limitations of domain knowledge may lead to poor model performance when dealing with unknown or complex garbage images. In addition, traditional machine learning methods are sensitive to noise, and issues like noise and image quality under suboptimal shooting conditions further degrade the model’s classification accuracy.
Garbage image classification method based on convolutional neural network
With the continuous improvement of computing power and the rapid development of the internet, deep learning simulates the information processing mechanism of the human brain, using multi-layer neural networks to automatically extract complex and abstract features from raw data, enabling more accurate data analysis and prediction. Deep learning models typically consist of an input layer (responsible for receiving raw data), hidden layers (responsible for extracting data features), and an output layer (which generates the final prediction or classification result). Deep learning has the following advantages:
* End-to-end learning: Deep learning enables end-to-end learning, where raw data is directly input to produce the final decision output without requiring intermediate transformation or processing steps [25].
* Automation of feature learning: Traditional machine learning methods typically require manual feature extraction, whereas deep learning can automatically learn valuable features from data, significantly reducing the amount of preprocessing work [25].
* Ability to handle complex tasks: Deep learning models can capture non-linear relationships within data, making them particularly effective for complex tasks such as image and speech recognition and natural language understanding [26].
* Driving force for research and development: Research in deep learning has driven the rapid growth of algorithms, hardware (such as GPUs and TPUs), and optimization techniques, providing momentum for the overall progress of artificial intelligence.
As one of the critical branches of deep learning, convolutional neural networks have made significant progress in various fields, including computer vision, natural language processing, and speech recognition. Since AlexNet [27] won the ImageNet image classification competition, researchers have proposed a series of advanced algorithms by improving the network architecture, increasing network depth, optimizing internal connections, and introducing new techniques, including classic architectures such as VGGNet [28], DenseNet [29], GoogleNet [30], MobileNetV2 [31], InceptionV2 [32], and ResNet [33], as well as recent models such as iAFPs-Mv-BiTCN [34], AIPs-DeepEnC-GA [35], DeepAVP-TPPred [36], PAtbP-EnC [37], Deepstacked-AVPs [38], and AIPs-SnTCN [39]. The emergence of these models has greatly advanced the development of computer vision and provided a solid theoretical foundation and technical support for applications in other fields.
As a result, convolutional neural networks are widely used in garbage image classification tasks. In 2016, Yang et al. [15] from Stanford University trained image classification models on the TrashNet dataset using support vector machines and CNN algorithms, achieving classification accuracies of 67% and 82%, respectively. Rabano et al. [40] put forward a new model based on the lightweight neural network MobileNet, which achieved an accuracy of 87.2% on the TrashNet dataset. Aral et al. [41] used various networks (including DenseNet121, DenseNet169, MobileNet, Xception, and InceptionResNetV2) to classify the TrashNet dataset and found that DenseNet121 had the highest accuracy. Kennedy et al. [42] applied a transfer learning strategy to introduce the OscarNet model, which achieved a validation accuracy of 88.42% on the TrashNet dataset. Bircanoğlu et al. [43] put forward a novel garbage image classification model (RecycleNet), which shows impressive performance on the TrashNet dataset. Ruiz et al. [44] proposed an automatic garbage image classification algorithm based on ResNet, which achieved an average accuracy of 88.66% on the TrashNet dataset. Adedeji et al. [45] used ResNet101 to extract features from the TrashNet dataset and then used a support vector machine in place of the fully connected layer for classification. Ozkaya et al. [46] combined different neural networks with different classifiers, compared their performance, and concluded that the combination of GoogleNet and an SVM classifier produced the best classification result. Shi et al. [47] achieved effective fusion of feature information by widening network branches and increasing layers, proposing a model called M-b Xception dedicated to garbage image classification. Zhang et al. [48] showed that embedding a self-monitoring module in a residual network can effectively integrate the relevant features of each channel map and compress spatial-dimension information, significantly improving the representation ability of the feature maps. Ma et al. [49] proposed a new ResNet-50-based model for classifying the TrashNet dataset that proved more accurate and robust than existing models. Shi et al. [50] proposed a garbage classification model based on a multi-layer hybrid convolutional neural network, adjusting the number of network modules and channel widths to achieve a classification accuracy of 92.6% on the TrashNet dataset. The recyclable waste classification model proposed by Hossen et al. [51] surpassed several state-of-the-art models with an accuracy of 95.01% on the TrashNet dataset and validated its reliability through class activation mapping.
In addition, some convolutional neural network-based garbage image classification models have been studied on self-built garbage datasets. Alsubaei et al. [52] developed a deep learning model focused on small-scale garbage waste detection and classification to assist intelligent waste management systems; the model combined an improved IRD model with the arithmetic optimization algorithm (AOA) for object detection and used a functional-link neural network for classification. Liu et al. [53] proposed a garbage image recognition model based on transfer learning and model fusion; experimental results on their self-built dataset showed better convergence and accuracy. Li et al. [54] addressed the overfitting and poor convergence of traditional image recognition algorithms with a deep learning-based garbage image recognition algorithm that introduces Dropout to counter overfitting, adjusts the deep neural network’s parameters with the Adagrad adaptive method, and uses the ReLU activation function to solve the vanishing-gradient problem in training.
Although convolutional neural network-based garbage image classification models show great potential, they still have shortcomings. First, garbage image datasets are usually small and imbalanced, which causes the model to overfit during training, thus affecting the model’s generalization ability and classification performance. Second, garbage images have diverse and inconsistent visual features, such as varying lighting conditions, occlusions, or object overlaps, which make the feature extraction process more complex and affect the model’s performance in different scenarios. Additionally, some models focus too much on classification accuracy, leading to excessive model parameters and significantly reducing detection speed.
Proposed methods
This paper presents a detailed comparative analysis of the accuracy, loss, and parameter counts of several popular image classification models on the CIFAR-10 dataset, covering VGGNet, DenseNet, GoogleNet, MobileNet, InceptionNet, and ResNet; the specific results are shown in Table 1. The analysis shows that the ResNet series excels across these metrics, but a model’s parameter count grows with its depth, raising hardware requirements and slowing computation. Balancing classification accuracy against computational efficiency, this paper selects ResNet-50 as the backbone network of the garbage image classification model; its structure is shown in Fig 1.
[Figure omitted. See PDF.]
[Figure omitted. See PDF.]
In addition, it is well known that the performance of deep learning models is closely related to the size of the training dataset. Due to the small size of the garbage image dataset and the uneven distribution of categories, the effect of directly using ResNet-50 to classify it is not ideal. Therefore, this paper improves the original ResNet-50, and the structure and parameters of the improved model are presented in Table 2.
[Figure omitted. See PDF.]
Redundancy-weighted feature fusion
The standard ResNet-50 network extracts features from the original image with a 7×7 convolution (Fig 2(a)). While this approach quickly expands the receptive field and extracts high-level features, it is weak at capturing local image details. In addition, applying such a large convolution kernel directly to the original image at the shallow layers of the network may hinder the gradual construction of complex feature hierarchies. Existing multi-scale feature fusion methods (Fig 2(b)) usually take one of two approaches, add or concatenate: add superimposes features, enhancing the information content of each dimension without increasing dimensionality, whereas concatenate increases dimensionality by splicing features while leaving the information content of each dimension unchanged. However, both approaches may introduce a large amount of redundant information, and since features at different scales have their own characteristics, such processing cannot fully exploit feature information at different scales. Therefore, this paper proposes a redundancy-weighted feature fusion module, shown in Fig 2(c), to improve the initial feature extraction stage of the original ResNet-50 network.
[Figure omitted. See PDF.]
(a) Shallow feature extraction structure of the standard ResNet-50 network. (b) Existing multi-scale feature fusion methods. (c) The specific structure of the redundancy-weighted feature fusion module.
First, this paper employs three parallel convolutional kernels of different sizes (7×7, 5×5, and 3×3) to extract features from the original image. The smaller kernels effectively capture detailed features, while the larger kernels are more suitable for extracting global and macro information. However, these features at different scales often contain a significant amount of redundant information, and using them directly may increase the computational cost of the network. Therefore, in this paper, the redundant information is eliminated before the multi-scale feature fusion, as follows:

\[ \mathrm{Feature}_1 = R\big(\mathrm{Conv}_{7\times 7;2}(x)\big) \tag{1} \]
\[ \mathrm{Feature}_2 = R\big(\mathrm{Conv}_{5\times 5;2}(x)\big) \tag{2} \]
\[ \mathrm{Feature}_3 = R\big(\mathrm{Conv}_{3\times 3;2}(x)\big) \tag{3} \]
Here, x denotes the original image; Conv7×7;2(x), Conv5×5;2(x), and Conv3×3;2(x) denote the feature maps obtained by convolving the original image with kernels of size 7×7, 5×5, and 3×3 at stride 2, respectively; and R(·) denotes the redundancy-removal operation. Since feature maps at different scales carry their own unique information, the above feature maps are weighted with learnable coefficients to exploit this information more fully, as in Eq (4):

\[ \mathrm{Feature} = \alpha\,\mathrm{Feature}_1 + \beta\,\mathrm{Feature}_2 + \gamma\,\mathrm{Feature}_3 \tag{4} \]
Here, Feature denotes the final multiscale feature fusion tensor; Feature1, Feature2, and Feature3 denote the feature tensors at different scales after removing redundant information; and α, β, and γ are their corresponding weight coefficients, collectively known as the weight coefficient combination. The weight coefficients α, β, and γ are learned adaptively through backpropagation and gradient descent during network training. Specifically, the network first randomly initializes these coefficients. Then, in each iteration, the network multiplies Feature1, Feature2, and Feature3 by α, β, and γ, respectively, and adds them together to obtain Feature. Next, the network computes the loss between the predicted results and the ground-truth labels and backpropagates to obtain the loss gradients with respect to the weight coefficients. Using these gradients and the learning rate, the coefficients are updated to minimize the loss function. This process is repeated until the network finds an optimal set of weight coefficients satisfying Eq (5):

\[ (\alpha^{*}, \beta^{*}, \gamma^{*}) = \arg\min_{\alpha,\,\beta,\,\gamma} \mathcal{L}(\alpha, \beta, \gamma) \tag{5} \]

Through the above design, the redundancy-weighted feature fusion module eliminates the repetitive information shared between features of different scales, reduces the computational resource consumption of the network, and at the same time ensures that the information relevant to the current task is thoroughly mined during feature fusion, thus enhancing the effectiveness of the model.
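To make the module concrete, the following is a minimal PyTorch sketch of the fusion stem. The exact form of the redundancy-removal operation R(·) is not fully specified in the text, so it is approximated here by a learnable 1×1 projection per branch; the class and layer names are illustrative, and the softmax normalization of the coefficients is a design choice rather than part of the published method:

```python
import torch
import torch.nn as nn

class RedundancyWeightedFusion(nn.Module):
    """Sketch of the redundancy-weighted feature fusion stem (Eqs 1-5)."""
    def __init__(self, in_channels=3, out_channels=64):
        super().__init__()
        # Three parallel branches with different receptive fields (Eqs 1-3);
        # padding is chosen so all branches share the same spatial size.
        self.branch7 = nn.Conv2d(in_channels, out_channels, 7, stride=2, padding=3)
        self.branch5 = nn.Conv2d(in_channels, out_channels, 5, stride=2, padding=2)
        self.branch3 = nn.Conv2d(in_channels, out_channels, 3, stride=2, padding=1)
        # Hypothetical stand-in for the redundancy-removal operation R(.)
        self.reduce = nn.ModuleList(
            [nn.Conv2d(out_channels, out_channels, 1) for _ in range(3)]
        )
        # Learnable weight coefficients alpha, beta, gamma, updated by
        # backpropagation together with the rest of the network (Eq 4).
        self.weights = nn.Parameter(torch.ones(3) / 3)

    def forward(self, x):
        feats = [self.branch7(x), self.branch5(x), self.branch3(x)]
        feats = [r(f) for r, f in zip(self.reduce, feats)]
        # Softmax keeps the coefficients positive and comparable (a design choice).
        w = torch.softmax(self.weights, dim=0)
        return w[0] * feats[0] + w[1] * feats[1] + w[2] * feats[2]
```

Because all three branches use stride 2 with matching padding, their outputs align spatially (for a 224×224 input, each branch yields 112×112 maps), so the weighted sum in Eq (4) is well defined.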
Depth-separable convolution
Depth-separable convolution (Fig 3(b)) consists of channel-by-channel (depthwise) convolution and point-by-point (pointwise) convolution. Unlike the standard 3×3 convolution (Fig 3(a)), channel-by-channel convolution uses a single-channel convolution kernel that acts independently on each channel of the input feature map, generating an output feature map with the same number of channels as the input. However, channel-by-channel convolution may leave the output feature map with too few channels, limiting the representation of information. To solve this problem, point-by-point convolution operates on the feature map with a 1×1 convolution kernel to adjust the number of channels, thereby enriching the feature information and improving the representation capability of the network. Assuming the input feature map has size H×W×C_in and the output feature map has size H′×W′×C_out, the number of parameters and the amount of computation required by the standard 3×3 convolution in ResNet-50 are given by Eqs (6) and (7):

\[ P_{\mathrm{std}} = 3 \times 3 \times C_{\mathrm{in}} \times C_{\mathrm{out}} \tag{6} \]
\[ F_{\mathrm{std}} = 3 \times 3 \times C_{\mathrm{in}} \times C_{\mathrm{out}} \times H' \times W' \tag{7} \]
[Figure omitted. See PDF.]
(a) Standard 3×3 convolution operation in ResNet-50. (b) Depth-separable convolution operation.
The number of parameters and the amount of computation required by the depth-separable convolution operation are given by Eqs (8) and (9):

\[ P_{\mathrm{ds}} = 3 \times 3 \times C_{\mathrm{in}} + C_{\mathrm{in}} \times C_{\mathrm{out}} \tag{8} \]
\[ F_{\mathrm{ds}} = (3 \times 3 \times C_{\mathrm{in}} + C_{\mathrm{in}} \times C_{\mathrm{out}}) \times H' \times W' \tag{9} \]
Compared with the standard 3×3 convolution, depth-separable convolution significantly reduces the number of parameters and the computation of the model. Therefore, this paper uses depth-separable convolution to replace the standard 3×3 convolution in ResNet-50 to optimize the model structure, reduce the computational burden, and accelerate inference, so that the model performs better in resource-constrained scenarios. In addition, because of the reduced parameter count, depth-separable convolution depends less on the data during training, which lowers the risk of overfitting and further improves the model’s generalization ability.
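As an illustration, such a depth-separable block can be written in a few lines of PyTorch; this is a generic sketch of the technique, not the paper’s exact layer configuration. The trailing comment instantiates Eqs (6) and (8) for C_in = C_out = 256:

```python
import torch.nn as nn

class DepthSeparableConv(nn.Module):
    """Depth-separable replacement for a standard 3x3 convolution."""
    def __init__(self, in_channels, out_channels, stride=1):
        super().__init__()
        # Channel-by-channel (depthwise) convolution: groups=in_channels
        # applies one 3x3 filter independently to each input channel.
        self.depthwise = nn.Conv2d(in_channels, in_channels, 3,
                                   stride=stride, padding=1,
                                   groups=in_channels, bias=False)
        # Point-by-point (1x1) convolution adjusts the channel count.
        self.pointwise = nn.Conv2d(in_channels, out_channels, 1, bias=False)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

# Parameter comparison for C_in = C_out = 256 (cf. Eqs 6 and 8):
#   standard 3x3:    9 * 256 * 256          = 589,824 parameters
#   depth-separable: 9 * 256 + 256 * 256    =  67,840 parameters (~8.7x fewer)
```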
Model loss functions
The vast majority of existing image classification models use the traditional cross-entropy loss function to measure the difference between the true probability distribution of the samples and the predicted one, guiding the model to update its parameters during training so as to reduce this difference and improve prediction accuracy. The cross-entropy loss is defined as follows:

\[ L_{\mathrm{CE}} = -\frac{1}{N} \sum_{i=1}^{N} \sum_{c=1}^{C} y_{i,c} \log(p_{i,c}) \tag{10} \]
In Eq (10), N denotes the number of samples, C the number of categories, and y_{i,c} and p_{i,c} the true label and the predicted probability of sample i for class c, respectively. However, cross-entropy loss assigns the same weight to the loss of every sample, which means that under class imbalance the model may favour the frequently occurring categories and ignore the characteristics of rare ones, weakening its ability to recognize minority categories. To solve this problem, Focal Loss introduces a modulating factor on top of the cross-entropy loss, aiming to strengthen the model’s focus on difficult-to-classify samples and thus improve classification performance in class-imbalanced scenarios. Focal Loss is defined in Eq (11):

\[ L_{\mathrm{FL}} = -\frac{1}{N} \sum_{i=1}^{N} \sum_{c=1}^{C} y_{i,c} (1 - p_{i,c})^{\lambda} \log(p_{i,c}) \tag{11} \]
Here, λ is a hyperparameter, usually set to 2, that controls how much attention is paid to hard-to-classify samples. Although Focal Loss is widely used, it tends to focus on the features of hard-to-classify samples and may pay insufficient attention to the features of minority categories. In addition, the difficulty of classifying a sample changes dynamically as training progresses. For this reason, we weight the Focal Loss to enhance the model’s ability to learn sparse categories, assigning lower weights to categories with many samples and higher weights to categories with few. The model loss function is defined as follows:

\[ L = -\frac{1}{N} \sum_{i=1}^{N} \sum_{c=1}^{C} \delta_{c}\, y_{i,c} (1 - p_{i,c})^{\lambda} \log(p_{i,c}) \tag{12} \]
Here, δ_c is the weighting coefficient for class c; its value is inversely proportional to the number of samples of that class in the imbalanced waste image dataset. By weighting Focal Loss in this way, the model handles difficult-to-classify samples effectively while alleviating the negative impact of class imbalance on performance, ensuring the robustness of the entire model.
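A minimal PyTorch sketch of Eq (12) follows. The exact form of the class weights δ is an assumption (here, inverse class frequency normalized by the number of classes), since the paper states only that they are inversely proportional to the per-class sample counts:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WeightedFocalLoss(nn.Module):
    """Sketch of the weighted Focal Loss in Eq (12)."""
    def __init__(self, class_weights, gamma=2.0):
        super().__init__()
        self.register_buffer("delta", class_weights)  # delta_c per class
        self.gamma = gamma  # lambda in Eqs (11)-(12)

    def forward(self, logits, targets):
        # p_t: predicted probability of the true class of each sample.
        log_p = F.log_softmax(logits, dim=1)
        log_pt = log_p.gather(1, targets.unsqueeze(1)).squeeze(1)
        pt = log_pt.exp()
        # delta_c * (1 - p_t)^lambda * (-log p_t), averaged over the batch.
        loss = -self.delta[targets] * (1 - pt) ** self.gamma * log_pt
        return loss.mean()

# Example with the TrashNet class counts from the dataset section;
# delta_c = N_total / (C * N_c) is one assumed inverse-frequency scheme.
counts = torch.tensor([403., 501., 410., 594., 482., 137.])
delta = counts.sum() / (len(counts) * counts)
criterion = WeightedFocalLoss(class_weights=delta)
```

With this scheme the rare “trash” class (137 images) receives roughly six times the weight of the largest “paper” class (594 images), matching the stated intent of the weighting.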
Experiments
Experimental parameter settings and details
In this paper, the Python programming language and the PyTorch deep learning framework are used to build and train the network models. The experimental equipment includes an AMD Ryzen 7 5800X CPU with 64 GB of RAM and an NVIDIA RTX 3090 GPU with 24 GB of video memory. To fully utilize the GPU, the experimental environment integrates CUDA 11.1 and its deep neural network library (cuDNN). During model training, the initial learning rate was set to 1e-3, and the parameters were updated using the AdamW optimizer. To prevent overfitting, the weight decay factor was set to 1e-9. In addition, the learning rate followed an exponential decay schedule, decaying to 96% of its value every 1.3 epochs. The batch size throughout training was 12, and the total number of training epochs was 250.
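For reference, these settings translate into the following PyTorch setup. A stock ResNet-50 stands in for the paper’s improved model, and expressing “decay to 96% every 1.3 epochs” as a per-epoch exponential factor is our interpretation of the schedule:

```python
import torch
from torchvision.models import resnet50

# Stand-in backbone purely to illustrate the reported training settings;
# the paper's improved model would replace resnet50 here.
model = resnet50(num_classes=6)

# AdamW with initial lr 1e-3 and weight decay 1e-9, as stated above.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1e-9)

# "Decays to 96% of its value every 1.3 epochs" as a per-epoch factor:
# gamma = 0.96 ** (1 / 1.3) ~= 0.969, stepped once per epoch.
scheduler = torch.optim.lr_scheduler.ExponentialLR(
    optimizer, gamma=0.96 ** (1 / 1.3))
```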
TrashNet dataset
All experiments in this paper are conducted on the TrashNet dataset, which consists of six categories of RGB images: cardboard, glass, metal, paper, plastic, and trash. The “trash” category includes items that cannot be classified into the other five categories, such as ceramic fragments and rubber products. The dataset contains a total of 2,527 images, each with a resolution of 512×384 pixels. Specifically, the dataset comprises 403 images of cardboard, 501 of glass, 410 of metal, 594 of paper, 482 of plastic, and 137 of general trash. Fig 4 presents representative images for each of the six categories. In this study, each image is resized to 224×224 pixels, and pixel values are normalized to a range of 0 to 1. The dataset is then split into training and validation sets in an 8:2 ratio.
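The preprocessing and split described above can be expressed as follows in PyTorch/torchvision; the dataset path and folder layout are assumptions based on the public TrashNet release (one subdirectory per class):

```python
import torch
from torchvision import datasets, transforms

# Resize to 224x224; ToTensor scales pixel values into [0, 1].
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

# Path is a placeholder for a local copy of the TrashNet dataset.
dataset = datasets.ImageFolder("trashnet/dataset-resized", transform=transform)

# 8:2 train/validation split (2021 / 506 of the 2,527 images).
n_train = int(0.8 * len(dataset))
train_set, val_set = torch.utils.data.random_split(
    dataset, [n_train, len(dataset) - n_train])
train_loader = torch.utils.data.DataLoader(train_set, batch_size=12, shuffle=True)
```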
[Figure omitted. See PDF.]
Evaluation index
It is well known that a model’s parameters gradually stabilize as the number of training epochs increases, although accuracy and loss still fluctuate within a small range. Therefore, in this paper, the average accuracy and loss over the last ten training epochs are taken as the final accuracy and loss:

\[ \mathrm{Accuracy}_{\mathrm{avg}} = \frac{1}{10} \sum_{e=241}^{250} \mathrm{Accuracy}_{e} \tag{13} \]
\[ \mathrm{Loss}_{\mathrm{avg}} = \frac{1}{10} \sum_{e=241}^{250} \mathrm{Loss}_{e} \tag{14} \]
Here, Accuracy_e and Loss_e denote the classification accuracy and loss of the model on the TrashNet test set after completing the e-th round of training, and e indexes the training rounds.
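Concretely, assuming accuracy_history and loss_history are Python lists appended to once per epoch during training, Eqs (13) and (14) reduce to:

```python
# Final metrics as the mean over the last ten of the 250 epochs.
accuracy_avg = sum(accuracy_history[-10:]) / 10
loss_avg = sum(loss_history[-10:]) / 10
```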
Experimental results analysis
First, this study conducted a comprehensive comparative analysis of several popular image classification models against the model proposed in this paper on the TrashNet dataset. The main comparison metrics include classification accuracy (Accuracy_avg), loss (Loss_avg), single-image validation time, and the number of model parameters, as detailed in Table 3. The data in Table 3 reveal that, compared to MobileNetV3 [31], which has the fewest parameters, our model improves accuracy on the TrashNet test set by 11.84%. Although the number of parameters increases by 27,400,940 and the single-image validation time increases by 43.47 ms, these costs are necessary to achieve higher classification precision. Compared to ResNet-50 [33], the most accurate of the baselines, our model achieves a 9.61% increase in accuracy on the TrashNet test set, while the number of parameters increases by 7,261,898 and the single-image validation time increases by only 9.23 ms. This indicates that our model keeps computational resource consumption reasonable while achieving high accuracy. In summary, the proposed model demonstrates significant improvements in classification performance; although parameters and inference time increase, these costs are worthwhile relative to the accuracy gains.
[Figure omitted. See PDF.]
In addition, we randomly selected 12 images from the TrashNet test set and used the Grad-CAM technique to generate the corresponding heatmaps, which were then overlaid on the original images to visually demonstrate the areas the model focused on during classification, as shown in Fig 5. In the heatmaps, the red and yellow regions indicate areas that received higher attention from the model. The analysis of the results shows that the proposed model effectively focuses on the objects in the images, demonstrating its ability to capture rich image information and thus achieve high classification accuracy.
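For completeness, the following is a minimal, self-contained sketch of how such Grad-CAM heatmaps can be produced with PyTorch hooks. This is a generic implementation of the published Grad-CAM technique, not the paper’s exact code; target_layer would typically be the backbone’s last convolutional block, and the function name is illustrative:

```python
import torch

def grad_cam(model, image, target_layer, class_idx=None):
    """Compute a normalized Grad-CAM heatmap for one CHW image tensor."""
    acts, grads = [], []
    h1 = target_layer.register_forward_hook(lambda m, i, o: acts.append(o))
    h2 = target_layer.register_full_backward_hook(
        lambda m, gi, go: grads.append(go[0]))

    logits = model(image.unsqueeze(0))
    if class_idx is None:
        class_idx = logits.argmax(dim=1).item()  # explain the top prediction
    model.zero_grad()
    logits[0, class_idx].backward()
    h1.remove(); h2.remove()

    # Channel weights = global-average-pooled gradients; CAM = weighted sum.
    weights = grads[0].mean(dim=(2, 3), keepdim=True)
    cam = torch.relu((weights * acts[0]).sum(dim=1)).squeeze(0)
    return cam / (cam.max() + 1e-8)  # normalize for overlay on the image
```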
[Figure omitted. See PDF.]
This paper also compares the classification accuracy of existing machine learning and deep learning-based garbage image classification models with the proposed model on the TrashNet dataset; the results are shown in Table 4. The accuracy of our model is 6.13% higher than that of the best-performing machine learning-based model [17], though slightly lower than that of the best-performing deep learning model [48]. While there is room for improvement relative to some of the top models, it is worth noting that our model omits the pre-training step and has a small overall size. This design simplifies deployment and maintenance and improves operational efficiency and cost-effectiveness while still ensuring high classification performance. In addition, given that ResNet-50-A [49] and ResNet-50-B [49] also use ResNet-50 as the network backbone and are both advanced models proposed in recent years, this paper further conducts an in-depth comparison with them to verify the validity and feasibility of the proposed model.
[Figure omitted. See PDF.]
To ensure a fair comparison, ResNet-50, ResNet-50-A, ResNet-50-B, and the model proposed in this paper, all utilized the same experimental parameters described in the “Experimental Parameter Settings” section. Fig 6 shows the loss and accuracy curves of the four models over 250 training epochs; it is easy to see that these models gradually stabilize after approximately 120 training epochs, suggesting that the training process is converging and performance is improving. We also compared the models’ average accuracy, average loss, the time required for validating a single image, and the number of model parameters with the specific results presented in Table 5. Analyzing Table 5 reveals that the accuracy of our model improved by 2.05% compared to the best-performing ResNet-50-B, the number of model parameters decreased by 38,679,954, and the time required to validate a single image was reduced by 9.22 milliseconds. Given that the classification performance of ResNet-50-B is significantly better than that of ResNet-50-A, we further present the confusion matrices for ResNet-50, ResNet-50-B, and our model in the six-class waste classification task in Fig 7. Observing these confusion matrices shows that our model’s classification accuracy has effectively improved across all six categories, further validating the superiority and effectiveness of our model.
[Figure omitted. See PDF.]
(a) Loss curves of different models on the TrashNet dataset. (b) Accuracy curves of different models on the TrashNet dataset.
[Figure omitted. See PDF.]
(a) ResNet-50. (b) ResNet-50-B. (c) Ours.
[Figure omitted. See PDF.]
Additionally, it is well-known that convolutional neural networks are susceptible to external factors. For instance, when the environment around an object changes or the object is partially occluded, the convolutional neural network may make incorrect predictions. In real-world waste classification scenarios, garbage samples often appear in incomplete or obscured forms, making the robustness of the model a vital consideration, as it directly impacts overall classification performance. For this reason, this study divides images into nine regions and applies occlusions to each region, creating nine new test sets (as shown in Fig 8). Subsequently, performance evaluations were conducted on ResNet-50, ResNet-50-B, and our proposed model using these nine new test sets, as illustrated in Fig 9. The results indicate that our model achieved the highest classification accuracy across all nine new test sets. This outcome highlights the robustness and reliability of our model and further validates its adaptability and practical application potential when facing various complex situations.
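A simple way to build these occluded test sets is sketched below, under the assumption that each occlusion blanks one cell of a 3×3 grid with a constant fill value (the paper does not specify the fill):

```python
import torch

def make_occluded_sets(images, fill=0.0):
    """Create nine occluded variants of a batch by masking one cell of a
    3x3 grid at a time, mirroring the robustness test described above.

    images: tensor of shape (N, C, H, W); returns a list of nine tensors.
    """
    _, _, h, w = images.shape
    hs, ws = h // 3, w // 3
    variants = []
    for row in range(3):
        for col in range(3):
            occluded = images.clone()
            occluded[:, :, row * hs:(row + 1) * hs,
                     col * ws:(col + 1) * ws] = fill
            variants.append(occluded)
    return variants
```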
[Figure omitted. See PDF.]
[Figure omitted. See PDF.]
Discussion
This paper proposes a new garbage image classification model and comprehensively compares it with the existing machine learning and deep learning-based garbage image classification models on the TrashNet dataset, including several metrics such as accuracy, total model parameters, prediction speed, and robustness. In terms of accuracy, the model in this paper improves by 6.13% compared to the best machine learning-based model. Although the accuracy of this paper’s model is slightly less than that of the best-performing deep learning model, it does not require a pre-training process, and the overall model size is small, significantly improving operational efficiency and cost-effectiveness.
In terms of model robustness, to address the susceptibility of convolutional neural networks to external interference, this paper creates nine new test sets by dividing the image into nine regions and masking them one by one to evaluate the model’s performance under complex conditions comprehensively. The results show that the model in this paper achieves the highest classification accuracy on all the new test sets, which further validates its ability to cope with incomplete or occluded samples in real-world garbage classification scenarios and highlights the robustness and reliability of the model.
Although the garbage image classification model proposed in this paper performs well in experiments, it still has some limitations. First, the TrashNet dataset is relatively small and suffers from class imbalance. Although we have mitigated this issue by improving the Focal Loss function, the model may still be influenced by data distribution biases when handling more complex and diverse garbage images in real-world scenarios, leading to overfitting and, consequently, reduced generalization ability. Second, there is a limited number of publicly available garbage image datasets, and this paper only uses the TrashNet dataset for experimentation. As a result, the proposed model may lack sufficient representativeness, and its performance may decline when dealing with different environments or new types of garbage, thus affecting its applicability in real-world situations.
Conclusion
With the continuous improvement of people’s material living standards, the types and quantities of garbage are also increasing rapidly, leading to an increasingly urgent need for garbage classification. Aiming at the problems of low accuracy, insufficient robustness, slow model detection speed, and large scale of existing garbage image classification models, this paper proposes a new garbage image classification model that uses ResNet-50 as the backbone network. Specifically, this paper proposes a redundancy-weighted feature fusion module combined with depthwise separable convolution techniques to optimize ResNet-50. In addition, the standard Focal Loss is weighted to mitigate the impact of class imbalance on model performance. Experimental results on the TrashNet dataset show that the proposed model significantly improves classification accuracy and robustness while maintaining fewer parameters and faster detection speed.
Given the issues of insufficient sample size and class imbalance in garbage image datasets such as TrashNet, future research could explore the use of advanced techniques like Generative Adversarial Networks to synthesize additional training data, particularly for the minority classes in the dataset, in order to further enhance the model’s performance. In addition, future work could consider building and expanding a custom garbage image dataset through field collection or collaborative sharing. Such a dataset should cover a broader range of environmental conditions, garbage types, and shape variations, thus providing a solid foundation for training more robust and accurate garbage image classification models.
References
1. Liu Kang, Tan Quanyin, Yu Jiadong, Wang Mengmeng. A global perspective on e-waste recycling. Circular Economy. 2023;2(1):100028.
2. Kaza Silpa, Yao Lisa, Bhada-Tata Perinaz, Van Woerden Frank. What a waste 2.0: a global snapshot of solid waste management to 2050. World Bank Publications. 2018.
3. Ersahin Mustafa Evren, Cicekalan Busra, Cengiz Ali Izzet, Zhang Xuedong, Ozgun Hale. Nutrient recovery from municipal solid waste leachate in the scope of circular economy: Recent developments and future perspectives. Journal of Environmental Management. 2023;335:117518. pmid:36841005
4. Cheng Baoquan, Huang Jianling, Guo Ziliang, Li Jianchang, Chen Huihua. Towards sustainable construction through better construction and demolition waste management practices: a SWOT analysis of Suzhou, China. International Journal of Construction Management. 2023;23(15):2614–2624.
5. Meena MD, Dotaniya ML, Meena BL, Rai PK, Antil RS, Meena HS, et al. Municipal solid waste: Opportunities, challenges and management policies in India: A review. Waste Management Bulletin. 2023;1(1):4–18.
6. Suthaharan Shan. Support vector machine. In: Machine learning models and algorithms for big data classification: thinking with examples for effective learning. 2016;207–235.
7. Peterson Leif E. K-nearest neighbor. Scholarpedia. 2009;4(2):1883.
8. Suthaharan Shan. Decision tree learning. In: Machine Learning Models and Algorithms for Big Data Classification: Thinking with Examples for Effective Learning. 2016;237–269.
9. Rigatti Steven J. Random forest. Journal of Insurance Medicine. 2017;47(1):31–39. pmid:28836909
10. Li Z, Liu F, Yang W, et al. A survey of convolutional neural networks: analysis, applications, and prospects. IEEE Transactions on Neural Networks and Learning Systems. 2021;33(12):6999–7019.
11. Wu Z, Pan S, Chen F, et al. A comprehensive survey on graph neural networks. IEEE Transactions on Neural Networks and Learning Systems. 2020;32(1):4–24.
12. Gu J, Tresp V, Hu H. Capsule network is not more robust than convolutional network. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021;14309–14317.
13. Bank D, Koenigstein N, Giryes R. Autoencoders. In: Machine learning for data science handbook: data mining and knowledge discovery handbook. 2023;353–374.
14. Zheng Longhai, Yuan Zuqiang, Yin Chenbo, et al. Research on Automatic Classification System of Construction Waste Based on Machine Vision. Journal of Hebei Normal University of Science & Technology. 2019;3:6–12.
15. Yang Mindy, Thung Gary. Classification of trash for recyclability status. CS229 project report. 2016;1(1):3.
16. Costa Bernardo S, Bernardes Aiko CS, Pereira Julia VA, Zampa Vitoria H, Pereira Vitoria A, Matos Guilherme F, et al. Artificial intelligence in automated sorting in trash recycling. Encontro Nacional de Inteligência Artificial e Computacional (ENIAC). 2018;198–205.
17. Satvilkar Mandar. Image based trash classification using machine learning algorithms for recyclability status. Dublin, National College of Ireland. 2018.
18. Wu J, Chen H, Fang W. Research on Waste Garbage Analysis and Recognition Based on Computer Vision. Information Technology & Informatization. 2016;10:3–11.
19. Gundupalli Sathish Paulraj, Hait Subrata, Thakur Atul. Classification of metallic and non-metallic fractions of e-waste using thermal imaging-based technique. Process Safety and Environmental Protection. 2018;118:32–39.
20. Bonifazi Giuseppe, Capobianco Giuseppe, Serranti Silvia. A hierarchical classification approach for recognition of low-density (LDPE) and high-density polyethylene (HDPE) in mixed plastic waste based on short-wave infrared (SWIR) hyperspectral imaging. Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy. 2018;198:115–122. pmid:29525562
21. Xiao Wen, Yang Jianhong, Fang Huaiying, Zhuang Jiangteng, Ku Yuedong. A robust classification algorithm for separation of construction waste using NIR hyperspectral system. Waste Management. 2019;90:1–9. pmid:31088664
22. Aziz Fayeem, Arof Hamzah, Mokhtar Norrima, Mubin Marizan, Talip Mohamad Sofian Abu. Rotation invariant bin detection and solid waste level classification. Measurement. 2015;65:19–28.
23. Riba Jordi-Roger, Cantero Rosa, Canals Trini, Puig Rita. Circular economy of post-consumer textile waste: Classification through infrared spectroscopy. Journal of Cleaner Production. 2020;272:123011.
24. Huang H, Han J, Wu F, et al. Research on Color Feature Extraction and Classification of Construction Waste. Optics and Optoelectronics Technology. 2018;16(1):5–13.
25. Jin Shoufeng, Yang Zixuan, Królczyk Grzegorz, Liu Xinying, Gardoni Paolo, Li Zhixiong. Garbage detection and classification using a new deep learning-based machine vision system as a tool for sustainable waste recycling. Waste Management. 2023;162:123–130. pmid:36989995
26. Yan Tan Hor, Azam Sazuan Nazrah Mohd, Sani Zamani Md, Azizan Azizul. Accuracy study of image classification for reverse vending machine waste segregation using convolutional neural network. International Journal of Electrical and Computer Engineering (IJECE). 2024;14(1):366–374.
27. Krizhevsky Alex, Sutskever Ilya, Hinton Geoffrey E. ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems. 2012;25.
28. Simonyan Karen, Zisserman Andrew. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556. 2014.
29. Huang Gao, Liu Zhuang, Van Der Maaten Laurens, Weinberger Kilian Q. Densely connected convolutional networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017;4700–4708.
30. Szegedy Christian, Liu Wei, Jia Yangqing, Sermanet Pierre, Reed Scott, Anguelov Dragomir, et al. Going deeper with convolutions. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015;1–9.
31. Howard Andrew G, Zhu Menglong, Chen Bo, Kalenichenko Dmitry, Wang Weijun, Weyand Tobias, et al. MobileNets: Efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861. 2017.
32. Szegedy Christian, Vanhoucke Vincent, Ioffe Sergey, Shlens Jonathon, Wojna Zbigniew. Rethinking the Inception architecture for computer vision. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016;2818–2826.
33. He Kaiming, Zhang Xiangyu, Ren Shaoqing, Sun Jian. Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016;770–778.
34. Akbar S, Zou Q, Raza A, et al. iAFPs-Mv-BiTCN: Predicting antifungal peptides using self-attention transformer embedding and transform evolutionary based multi-view features with bidirectional temporal convolutional networks. Artificial Intelligence in Medicine. 2024;151:102860. pmid:38552379
35. Raza A, Uddin J, Zou Q, et al. AIPs-DeepEnC-GA: predicting anti-inflammatory peptides using embedded evolutionary and sequential feature integration with genetic algorithm based deep ensemble model. Chemometrics and Intelligent Laboratory Systems. 2024;254:105239.
36. Ullah M, Akbar S, Raza A, et al. DeepAVP-TPPred: identification of antiviral peptides using transformed image-based localized descriptors and binary tree growth algorithm. Bioinformatics. 2024;40(5):btae305. pmid:38710482
37. Akbar S, Raza A, Al Shloul T, et al. PAtbP-EnC: Identifying anti-tubercular peptides using multi-feature representation and genetic algorithm-based deep ensemble model. IEEE Access. 2023;11:137099–137114.
38. Akbar S, Raza A, Zou Q. Deepstacked-AVPs: predicting antiviral peptides using tri-segment evolutionary profile and word embedding based multi-perspective features with deep stacking model. BMC Bioinformatics. 2024;25(1):102. pmid:38454333
39. Raza A, Uddin J, Almuhaimeed A, et al. AIPs-SnTCN: Predicting anti-inflammatory peptides using fastText and transformer encoder-based hybrid word embedding with self-normalized temporal convolutional networks. Journal of Chemical Information and Modeling. 2023;63(21):6537–6554. pmid:37905969
40. Rabano Stephenn L, Cabatuan Melvin K, Sybingco Edwin, Dadios Elmer P, Calilung Edwin J. Common garbage classification using MobileNet. 2018 IEEE 10th International Conference on Humanoid, Nanotechnology, Information Technology, Communication and Control, Environment and Management (HNICEM). 2018;1–4.
41. Aral Rahmi Arda, Keskin Şeref Recep, Kaya Mahmut, Hacıoğlu Murat. Classification of TrashNet dataset based on deep learning models. 2018 IEEE International Conference on Big Data (Big Data). 2018;2058–2062.
42. Kennedy Tom. OscarNet: Using transfer learning to classify disposable waste. CS230 Report: Deep Learning. Stanford University, CA, Winter. 2018.
43. Bircanoğlu Cenk, Atay Meltem, Besser Fuat, Genc Özgün, Kızrak Merve Ayyuce. RecycleNet: Intelligent waste sorting using deep neural networks. 2018 Innovations in Intelligent Systems and Applications (INISTA). 2018;1–7.
44. Ruiz Victoria, Sánchez Ángel, Vélez José F, Raducanu Bogdan. Automatic image-based waste classification. From Bioinspired Systems and Biomedical Applications to Machine Learning: 8th International Work-Conference on the Interplay Between Natural and Artificial Computation, IWINAC 2019, Almería, Spain, June 3–7, 2019, Proceedings, Part II. 2019;422–431.
45. Adedeji Olugboja, Wang Zenghui. Intelligent waste classification system using deep learning convolutional neural network. Procedia Manufacturing. 2019;35:607–612.
46. Ozkaya Umut, Seyfi Levent. Fine-tuning models comparisons on garbage classification for recyclability. arXiv preprint arXiv:1908.04393. 2019.
47. Shi Cuiping, Xia Ruiyang, Wang Liguo. A novel multi-branch channel expansion network for garbage image classification. IEEE Access. 2020;8:154436–154452.
48. Zhang Qiang, Zhang Xujuan, Mu Xiaojun, Wang Zhihe, Tian Ran, Wang Xiangwen, et al. Recyclable waste image recognition based on deep learning. Resources, Conservation and Recycling. 2021;171:105636.
49. Ma Xiaoxuan, Li Zhiwen, Zhang Lei. An improved ResNet-50 for garbage image classification. Tehnički vjesnik. 2022;29(5):1552–1559.
50. Shi Cuiping, Tan Chao, Wang Tian. A waste classification method based on a multilayer hybrid convolution neural network. Applied Sciences. 2021;11(18):8572.
51. Hossen Mohammad Mahfuzur, Majid Mohammad Elias, Kashem Shama Binte Anwar, et al. A reliable and robust deep learning model for effective recyclable waste classification. IEEE Access. 2024.
52. Alsubaei Faisal S, Al-Wesabi FN, Hilal AM. Deep learning-based small object detection and classification model for garbage waste management in smart cities and IoT environment. Applied Sciences. 2022;12(5):2281.
53. Liu Wei, Ouyang Hui, Liu Qi, et al. Image recognition for garbage classification based on transfer learning and model fusion. Mathematical Problems in Engineering. 2022;2022(1):4793555.
54. Li Yicheng, Liu Wei. Deep learning-based garbage image recognition algorithm. Applied Nanoscience. 2023;13(2):1415–1424.
Citation: Li L, Wang R, Zou M, Guo F, Ren Y (2025) Enhanced ResNet-50 for garbage classification: Feature fusion and depth-separable convolutions. PLoS ONE 20(1): e0317999. https://doi.org/10.1371/journal.pone.0317999
About the Authors:
Lingbo Li
Contributed equally to this work with: Lingbo Li, Runpu Wang
Roles: Conceptualization, Methodology, Validation, Writing – original draft, Writing – review & editing
Affiliation: Library of Information Center, Zhejiang Technical Institute of Economics, Hangzhou, China
Runpu Wang
Contributed equally to this work with: Lingbo Li, Runpu Wang
Roles: Investigation, Software, Validation
Affiliation: School of Computer Science Engineering, University of New South Wales, Canberra, Australia
Miaojie Zou
Roles: Conceptualization, Software, Validation
Affiliation: Faculty of Business and Economics, Monash University, Melbourne, Australia
Fusen Guo
Roles: Data curation, Validation
Affiliation: School of Systems and Computing, University of New South Wales, Canberra, Australia
ORCID: https://orcid.org/0009-0007-9142-697X
Yuheng Ren
Roles: Methodology, Validation, Writing – original draft, Writing – review & editing
E-mail: [email protected]
Affiliation: School of Business Economics, European Union University, Montreux, Switzerland
ORCID: https://orcid.org/0009-0002-9679-9947
40. Rabano, Stephenn L and Cabatuan, Melvin K and Sybingco, Edwin and Dadios, Elmer P and Calilung, Edwin J. Common garbage classification using mobilenet. 2018 IEEE 10th International Conference on Humanoid, Nanotechnology, Information Technology, Communication and Control, Environment and Management (HNICEM). 2018;1–4.
41. Aral, Rahmi Arda and Keskin, Şeref Recep and Kaya, Mahmut and Hacıoğlu, Murat. Classification of trashnet dataset based on deep learning models. 2018 IEEE International Conference on Big Data (Big Data). 2018;2058–2062.
42. Kennedy, Tom. OscarNet: Using transfer learning to classify disposable waste. CS230 Report: Deep Learning. Stanford University, CA, Winter. 2018.
43. Bircanoğlu, Cenk and Atay, Meltem and Besser, Fuat and Genc, Özgün and Kızrak, Merve Ayyuce. RecycleNet: Intelligent waste sorting using deep neural networks. 2018 Innovations in intelligent systems and applications (INISTA). 2018;1–7.
44. Ruiz, Victoria and Sánchez, Ángel and Vélez, José F and Raducanu, Bogdan. Automatic image-based waste classification. From Bioinspired Systems and Biomedical Applications to Machine Learning: 8th International Work-Conference on the Interplay Between Natural and Artificial Computation, IWINAC 2019, Almería, Spain, June 3–7, 2019, Proceedings, Part II 8. 2019;422–431.
45. Adedeji Olugboja and Wang Zenghui. Intelligent waste classification system using deep learning convolutional neural network. Procedia Manufacturing. 2019;35:607–612.
46. Ozkaya, Umut and Seyfi, Levent. Fine-tuning models comparisons on garbage classification for recyclability. arXiv preprint arXiv:1908.04393. 2019.
47. Shi Cuiping and Xia Ruiyang and Wang Liguo. A novel multi-branch channel expansion network for garbage image classification. IEEE access. 2020;8:154436–154452.
48. Zhang Qiang and Zhang Xujuan and Mu Xiaojun and Wang Zhihe and Tian Ran and Wang Xiangwen, et al. Recyclable waste image recognition based on deep learning. Resources, Conservation and Recycling. 2021;171:105636.
49. Ma Xiaoxuan and Li Zhiwen and Zhang Lei. An improved ResNet-50 for garbage image classification. Tehnički vjesnik. 2022;29(5):1552–1559.
50. Shi Cuiping and Tan Chao and Wang Tian. A waste classification method based on a multilayer hybrid convolution neural network. Applied Sciences, 2021; 11(18): 8572.
51. Hossen Mohammad Mahfuzur and Majid Mohammad Elias and Kashem Shama Binte Anwar and others. A reliable and robust deep learning model for effective recyclable waste classification. IEEE Access, 2024.
52. Alsubaei Faisal S and Al-Wesabi F N and Hilal A M. Deep learning-based small object detection and classification model for garbage waste management in smart cities and IoT environment. Applied Sciences, 2022; 12(5): 2281.
53. Liu Wei and Ouyang Hui and Liu Qi and others. Image recognition for garbage classification based on transfer learning and model fusion. Mathematical Problems in Engineering, 2022; 2022(1): 4793555.
54. Li Yicheng and Liu Wei. Deep learning-based garbage image recognition algorithm. Applied Nanoscience, 2023; 13(2): 1415–1424.
© 2025 Li et al. This is an open access article distributed under the terms of the Creative Commons Attribution License: http://creativecommons.org/licenses/by/4.0/ (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Abstract
As people’s material living standards continue to improve, the types and quantities of household garbage they generate are increasing rapidly, so there is an urgent need to develop a reasonable and effective method for garbage classification. Such a method matters for resource recycling and environmental improvement, and contributes to the sustainable development of production and the economy. However, existing deep learning-based garbage image classification models generally suffer from low classification accuracy, insufficient robustness, and slow detection speed caused by large parameter counts. To address these problems, a new garbage image classification model is proposed, with the ResNet-50 network as the core architecture. First, a redundancy-weighted feature fusion module is proposed that enables the model to fully exploit valuable feature information, improving its performance, while filtering redundant information out of multi-scale features to reduce the number of model parameters. Second, the standard 3×3 convolutions in ResNet-50 are replaced with depth-separable convolutions, significantly improving the model’s computational efficiency while preserving the feature extraction capability of the original convolutional structure. Finally, to address class imbalance, a weighting factor is added to the standard Focal Loss, mitigating the negative impact of imbalance on model performance and enhancing the model’s robustness. Experimental results on the TrashNet dataset show that the proposed model effectively reduces the number of parameters, improves detection speed, and achieves an accuracy of 94.13%, surpassing the vast majority of existing deep learning-based waste image classification models and demonstrating solid practical value.
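The paper’s own implementation is not reproduced here, but two of the optimizations summarized above are standard enough to sketch. The following PyTorch-style fragment is a minimal illustration under our own assumptions, not the authors’ code: the names `DepthwiseSeparableConv` and `WeightedFocalLoss`, and the choices of `gamma` and the per-class `alpha` weights, are hypothetical. It shows (a) how a standard 3×3 convolution can be factorized into a depthwise plus pointwise pair, and (b) how a per-class weighting factor can be attached to the standard Focal Loss.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DepthwiseSeparableConv(nn.Module):
    """Illustrative drop-in replacement for a standard 3x3 convolution:
    a 3x3 depthwise convolution (one filter per input channel, via
    groups=in_channels) followed by a 1x1 pointwise convolution that mixes
    channels. Parameters drop from 9*C_in*C_out to 9*C_in + C_in*C_out."""
    def __init__(self, in_channels, out_channels, stride=1):
        super().__init__()
        self.depthwise = nn.Conv2d(in_channels, in_channels, kernel_size=3,
                                   stride=stride, padding=1,
                                   groups=in_channels, bias=False)
        self.pointwise = nn.Conv2d(in_channels, out_channels, kernel_size=1,
                                   bias=False)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

class WeightedFocalLoss(nn.Module):
    """Focal Loss with a per-class weighting factor alpha:
    loss = -alpha[y] * (1 - p_t)^gamma * log(p_t).
    The (1 - p_t)^gamma term down-weights easy examples; alpha counteracts
    class imbalance by up-weighting under-represented classes."""
    def __init__(self, alpha, gamma=2.0):
        super().__init__()
        self.register_buffer("alpha", torch.as_tensor(alpha, dtype=torch.float))
        self.gamma = gamma

    def forward(self, logits, targets):
        # log p_t for the true class of each sample
        log_pt = F.log_softmax(logits, dim=1)
        log_pt = log_pt.gather(1, targets.unsqueeze(1)).squeeze(1)
        pt = log_pt.exp()
        loss = -self.alpha[targets] * (1.0 - pt) ** self.gamma * log_pt
        return loss.mean()

# Hypothetical usage for the six TrashNet classes: alpha set roughly
# inversely proportional to class frequency (values here are made up).
criterion = WeightedFocalLoss(alpha=[1.0, 0.8, 0.9, 1.0, 1.1, 2.0], gamma=2.0)
```

One plausible design choice, consistent with the abstract, is to set `alpha` from the training-set class frequencies so that the scarcer classes contribute proportionally more to the loss; the exact weighting scheme used in the paper would be found in its methods section.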