INTRODUCTION
Myocarditis is a cardiac condition, most commonly caused by viral infection, characterised by inflammation of the myocardium, the middle layer of the heart wall [1]. It remains a significant disease in cardiology, presenting symptoms such as chest pain, fatigue, and shortness of breath [2]. In complex cases, confirming the diagnosis and guiding treatment often require an invasive procedure known as endomyocardial biopsy [3]. Myocarditis contributes to a substantial number of sudden deaths, accounting for close to 20% of such deaths in individuals under 40 years old [4]. Non-invasive diagnostic methods, particularly cardiac magnetic resonance (CMR) imaging, are considered effective in identifying suspected cases of myocarditis, and CMR also plays a crucial role in diagnosing other heart conditions [5, 6]. However, the effectiveness of CMR can be hindered by the clinical manifestation of the disease and the presence of non-specific symptoms such as chest discomfort, heart failure, and arrhythmia [7–9]. Several factors related to imaging criteria, including technical errors, acquisition parameters, pulse sequences, contrast agent dosage, artefacts, and subjective visual interpretation, can influence disease identification and are susceptible to operator bias [10].

To address the challenges in classifying medical images, automated diagnostic systems that utilise various data mining and machine learning methods have shown promise [11–13]. These systems streamline the image screening process, saving physicians' time, reducing errors, and improving diagnostic accuracy [14]. By leveraging techniques such as deep learning, feature extraction, and pattern recognition, such systems can enhance the efficiency and reliability of myocarditis diagnosis: they can analyse large volumes of CMR images, extract relevant features, and classify them accurately, providing valuable assistance to medical professionals.
Advanced models have shown remarkable achievements in various applications [15, 16], including natural language processing [17–19] and medical image analysis [20–26]. Deep learning algorithms learn weight assignments that minimise the gap between predicted and actual outputs. Gradient-based backpropagation (BP) techniques are commonly used for weight learning in deep models. However, these optimisation techniques are highly sensitive to the initial values assigned to the weights and may become stuck in local minima [27–29], a vulnerability that often arises during classification tasks [30]. To address this issue, some researchers have turned to meta-heuristic algorithms [31, 32], which have been shown to overcome these challenges [33]. Differential Evolution (DE) is a powerful algorithm that has demonstrated effectiveness across a wide range of optimisation problems. One of its key advantages for weight initialisation is its ability to handle multimodal optimisation problems where multiple optimal solutions exist. Additionally, the DE algorithm is efficient and has low computational costs, making it suitable for optimising large-scale problems. Compared with other weight initialisation approaches, such as random initialisation, genetic algorithms, and particle swarm optimisation, DE has shown superior convergence speed and accuracy, especially on complex optimisation problems. Moreover, DE has been applied successfully across various domains, including machine learning, image processing, and pattern recognition, highlighting its versatility and efficacy. These advantages make DE a promising approach for weight initialisation in deep learning models, where the quality of the initial weights significantly impacts model performance [33].
The DE algorithm comprises three main stages: mutation, crossover, and selection. In the mutation stage, a new potential solution is generated by adjusting the scale of differences between existing solutions. During the crossover phase, the recently generated mutation vector is combined with the original vector. The selection stage identifies the top solutions to be carried forward to the next iteration. The mutation operator has a vital role in optimising the DE algorithm's performance. Its primary purpose is to enable the algorithm to explore and discover new regions within the search space, thus avoiding the risk of getting stuck in local optima. A well-designed mutation operator can significantly enhance the algorithm's performance by facilitating a more diverse and extensive search. However, developing an effective mutation operator can be a complex and challenging task, as it requires a deep understanding of both the optimisation algorithm and the problem domain [34]. Therefore, careful consideration and analysis are crucial during the design process to create a mutation operator that is optimised to achieve the best possible results [35].
Class imbalance occurs when one category significantly outnumbers the others, negatively impacting the performance of machine learning classification techniques [36–38]. Detecting instances from the minority class can be challenging due to their scarcity and unpredictability: while models may achieve acceptable detection rates for the majority of examples, detecting minority instances remains difficult [39, 40]. Additionally, incompatible samples from the minority class can have detrimental consequences. To address this issue, two families of approaches have been proposed: data-level and algorithmic-level methods [41, 42]. In the data-level approach, class imbalance is rectified by modifying the training data, either by oversampling the underrepresented classes or by undersampling the overrepresented ones [43]. The synthetic minority oversampling technique (SMOTE) [44] generates new instances by linear interpolation between adjacent minority examples, while NearMiss [45] employs the nearest neighbour algorithm to undersample majority examples. However, both oversampling and undersampling can lead to overfitting or the loss of crucial data. At the algorithmic level, the significance of the underrepresented class can be boosted using methods such as ensemble learning, decision threshold adjustment, and cost-sensitive learning [46]. Cost-sensitive learning assigns distinct costs to the loss function for each class, with a higher cost for misclassifying the minority class. Ensemble learning trains multiple sub-classifiers and improves performance through techniques like combining or voting. Threshold adjustment techniques train the classifier on the imbalanced data and modify the decision threshold during testing to enhance classification performance. State-of-the-art techniques based on deep learning have also been proposed for imbalanced data classification [47]. The research article [48] proposed a loss function for deep neural networks that accounts for errors in classifying both underrepresented and overrepresented classes. In ref. [49], an approach was proposed that preserves interclass and intercluster margins while learning distinctive features of an imbalanced dataset.
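To make the cost-sensitive idea concrete, the sketch below weights PyTorch's cross-entropy loss by inverse class frequency; the class counts mirror this dataset, but the snippet is an illustrative sketch rather than the exact configuration used in this study.

```python
import torch
import torch.nn as nn

# Hypothetical class counts mirroring this dataset: 4686 sick vs. 2449 healthy.
counts = torch.tensor([2449.0, 4686.0])          # index 0 = minority, 1 = majority
weights = counts.sum() / (2.0 * counts)          # inverse-frequency class weights

criterion = nn.CrossEntropyLoss(weight=weights)  # minority-class errors cost more

logits = torch.randn(8, 2)                       # a batch of 8 model outputs
labels = torch.randint(0, 2, (8,))               # ground-truth labels
loss = criterion(logits, labels)
```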
To the best of our knowledge, only a limited number of published studies have explored the application of deep learning techniques to diagnosing myocarditis. In a study by Sharifrazi et al. [50], a method combining convolutional neural networks (CNNs) with the k-means algorithm was proposed for image classification. The approach involved several steps: after data preprocessing, the CNN was employed to classify the images and partition them into multiple clusters, with each cluster treated as a distinct class. This process was performed for each cluster separately, and the results were subsequently merged to obtain a final diagnosis. However, an important drawback of the method was its treatment of the image matrix as a vector in the k-means algorithm, which discards the neighbourhood around each pixel and can impair the accuracy of the results. In another relevant article, Moravvej et al. [51] adopted the artificial bee colony (ABC) algorithm to initialise the weights of the CNN. This approach aimed to enhance the optimisation process in training the CNN model for myocarditis diagnosis: by utilising the ABC algorithm, the weights were initialised more effectively, potentially improving the performance and accuracy of the CNN model.
This article presents a novel model for myocarditis detection utilising a combination of a CNN, reinforcement learning (RL), and an enhanced DE algorithm. The proposed model treats the classification problem as a sequential decision-making process. In each iteration, the agent receives an environmental state defined by a training sample and, guided by a policy, performs classification. Successful classification rewards the agent positively, while unsuccessful classification yields a negative reward, and the class with the lower occurrence rate is rewarded more heavily than the class with the higher occurrence rate. Throughout the sequential decision-making phase, the agent aims to maximise accumulated reward and thereby classify samples accurately. To improve weight initialisation in both the CNN and the feed-forward network, a clustering-based enhancement of the DE algorithm is proposed. This enhancement identifies a favourable region of the search space to serve as the starting point for the BP algorithm: the mutation operator selects the optimal initial solution from the highest-performing cluster and applies a fresh updating approach to generate solutions. The effectiveness of our model is evaluated on the widely recognised and extensively studied Z-Alizadeh Sani dataset [51]. This dataset consists of 7135 samples, comprising 4686 sick and 2449 healthy samples. It is meticulously curated and annotated, making it a valuable resource for assessing the performance of machine learning models in diagnosing and predicting medical conditions.
The present study makes four main contributions:

• We handle the classification of heart muscle images as a step-by-step decision-making procedure and propose an RL-based method to tackle the unequal class distribution challenge.

• Instead of random weight initialisation, we utilise an encoding approach based on the improved DE algorithm to select optimal initial values.

• The Z-Alizadeh Sani myocarditis dataset, built from the recently acquired and comprehensively annotated CMR images at Omid Hospital in Tehran, is used as the foundation for this study and is publicly available for download.

• We carry out experiments to assess the efficiency of the proposed model, contrasting it with methods that employ random weight initialisation and grapple with imbalanced classification.
The rest of the paper is organised as follows: Section II provides a high-level overview of the DE algorithm and RL. In Section III, we introduce our method for myocarditis detection. Experimental results are presented in Section IV, and Section V concludes the paper.
BACKGROUND
Differential evolution
It has been widely demonstrated that population-based algorithms such as DE perform effectively across a broad spectrum of optimisation problems [52]. DE starts with an initial population and employs three fundamental operations: mutation, crossover, and selection. In the mutation stage, individuals in the population undergo random perturbations to explore the search space. The crossover operation combines information from different individuals to generate new offspring with potentially improved characteristics. Finally, the selection process determines which individuals are retained for the next generation based on their fitness values, typically through a greedy one-to-one comparison between each trial vector and its target; the individuals that participate in mutation are drawn uniformly at random, allowing diverse exploration of the solution space. This combination enhances the algorithm's ability to escape local optima and discover promising regions. By incorporating these key operations, DE harnesses the collective intelligence of the population to iteratively improve solution quality and converge towards optimal or near-optimal solutions.
A mutant vector is obtained by the mutation operator as follows:

$$v_i^{(t)} = x_{r_1}^{(t)} + F\left(x_{r_2}^{(t)} - x_{r_3}^{(t)}\right),$$

where $r_1$, $r_2$, and $r_3$ are mutually distinct indices drawn at random from the population (all different from the target index $i$) and $F$ is the scaling factor.
The target and mutant vectors are combined during crossover. Binomial crossover is a popular operator that does this as follows:

$$u_{i,j}^{(t)} = \begin{cases} v_{i,j}^{(t)}, & \text{if } \mathrm{rand}_j \le CR \text{ or } j = j_{\mathrm{rand}}, \\ x_{i,j}^{(t)}, & \text{otherwise}, \end{cases}$$

where $CR$ is the crossover probability, $\mathrm{rand}_j$ is a uniform random number in $[0, 1]$ drawn for each dimension $j$, and $j_{\mathrm{rand}}$ is a randomly chosen dimension that guarantees the trial vector inherits at least one component from the mutant.
The superior solution is ultimately chosen from the trial and target vectors by the selection operator.
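For concreteness, the following is a minimal NumPy sketch of the canonical DE/rand/1/bin scheme described above; the sphere objective, bounds, and parameter values are illustrative assumptions, not the settings used later in this paper.

```python
import numpy as np

def differential_evolution(f, bounds, pop_size=30, F=0.5, CR=0.8, iters=200, seed=0):
    """Canonical DE/rand/1/bin: mutation, binomial crossover, greedy selection."""
    rng = np.random.default_rng(seed)
    dim = len(bounds)
    low, high = np.array(bounds, dtype=float).T
    pop = rng.uniform(low, high, (pop_size, dim))
    fit = np.array([f(x) for x in pop])

    for _ in range(iters):
        for i in range(pop_size):
            # Mutation: v = x_r1 + F * (x_r2 - x_r3), with r1, r2, r3 distinct from i
            r1, r2, r3 = rng.choice([j for j in range(pop_size) if j != i], 3, replace=False)
            v = pop[r1] + F * (pop[r2] - pop[r3])
            # Binomial crossover: take v_j with probability CR; one gene is forced
            mask = rng.random(dim) < CR
            mask[rng.integers(dim)] = True
            u = np.where(mask, v, pop[i])
            # Greedy selection between the trial and target vectors
            fu = f(u)
            if fu < fit[i]:
                pop[i], fit[i] = u, fu
    return pop[np.argmin(fit)], fit.min()

# Example: minimise the 10-dimensional sphere function.
best, best_f = differential_evolution(lambda x: float(np.sum(x**2)), bounds=[(-5, 5)] * 10)
```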
Reinforcement learning
RL encompasses a range of techniques that enable effective learning from noisy data, leading to meaningful classification outcomes. Wiering et al. [53] described classification as a sequential decision-making problem in which various components interact with the environment to establish an optimal policy function; however, the complexity of these interactions resulted in exceptionally long runtimes. For noisy text data, Feng et al. [54] proposed an RL-based classification method that employs two components: a sample selector and a relational classifier. The sample selector guides the agent in choosing appropriate phrases from the noisy data, while the relational classifier learns from clean data and provides delayed rewards for evaluating the selector. This approach yields a strong classifier and a high-quality dataset. In the domain of time series data, Martinez et al. [55] utilised RL techniques to specify the reward criteria and the Markovian process precisely. RL has also been applied successfully to learning effective features in various applications: such models reward the agent for selecting more informative features, guiding it towards better feature subsets. While deep learning has garnered significant attention for imbalanced data classification, there remains a need for further progress in the field.
THE PROPOSED APPROACH
The general architecture of our proposed approach is shown in Figure 1. We consider two crucial aspects of the classification problem. First, we gather all the learnable weights of our model into a vector and utilise the enhanced DE algorithm to set their initial values. Second, we employ an RL-based training approach to address the challenge of imbalanced classification. The following sections describe these components in more detail.
[Figure 1: General architecture of the proposed approach. Image omitted; see PDF.]
Pre-training phase
Proper weight initialisation is of paramount importance in ensuring the successful operation of deep networks. Inaccurate initial values can significantly hinder the convergence of the model and impair its overall performance. This article specifically addresses the weight initialisation for two key types of neural networks: CNN and feed-forward neural network. To tackle the weight initialisation challenge, we propose an enhanced DE technique that incorporates a clustering scheme and a novel fitness function. By leveraging the power of clustering, we aim to identify distinct regions within the search space, allowing for a more targeted and effective weight initialisation process. The clustering scheme partitions the population into clusters, each representing a specific section of the search space. This enables us to better explore and exploit different regions, potentially leading to improved convergence and better overall performance. Furthermore, we introduce a novel fitness function that takes into account various factors and metrics to evaluate the quality and suitability of weight initialisation. This comprehensive fitness function aims to guide the DE algorithm towards identifying optimal weight configurations that align with the specific requirements and characteristics of the CNN and feed-forward neural network architectures.
Clustering-based differential evolution
In our improved DE algorithm, we utilise a cluster-based mutation and update mechanism to enhance optimisation performance. The proposed mutation operator, motivated by the work of Mousavirad et al. [56], pinpoints a promising area within the search space. The current population P is partitioned into k groups using the k-means clustering method, with each group representing a separate part of the search space; the number of clusters k is drawn at random from a predefined range. Following the clustering process, the group with the lowest average fitness among its members is deemed the superior cluster.
Building on the existing mechanism of our enhanced DE algorithm, we proceed by focussing on the superior cluster, where the lowest average fitness score indicates the area with the greatest potential for optimisation. This mechanism works by iteratively assessing each cluster's fitness and pinpointing the area in the search space that holds the most promise for improvement. After identifying the superior cluster, the next step is to mutate and update the individuals within this cluster. The mutation operation aims to explore the promising area by creating trial solutions that deviate slightly from the current ones. It is this variation in the cluster that allows for the introduction of potential solutions which might be superior to the current population. The updating scheme is then used to decide which of these trial solutions should be accepted into the population. This decision is typically made based on their fitness: the higher the fitness, the more likely the trial solution will replace a current member of the population. In essence, the clustering-based mutation and update process leads to the population gradually converging towards the most promising region of the search space. This approach not only enhances the optimisation performance but also speeds up the search process as the focus is continually adjusted to the regions that offer the highest potential for improvement.
The proposed clustering-based mutation can be described as follows:

$$v_i = x_{\mathrm{win}} + F\left(x_{r_1} - x_{r_2}\right),$$

where $x_{\mathrm{win}}$ is the best solution within the superior cluster, $x_{r_1}$ and $x_{r_2}$ are distinct randomly selected members of the population, and $F$ is the scaling factor.
Upon generating M new solutions through cluster-based mutation, the existing population is refreshed, employing a standard population-based algorithm (GPBA). This refresh process plays a crucial role in integrating these new solutions into the population and enhancing the overall solution diversity. The GPBA takes into account the quality of both the existing and newly generated solutions when deciding which to include in the updated population. It employs a selection strategy that favours the most promising solutions based on their fitness, thereby encouraging the population to converge towards optimal or near-optimal solutions. Moreover, the GPBA helps maintain diversity within the population. Diversity is crucial in optimisation algorithms to prevent premature convergence on local optima and to ensure that a broad area of the search space is explored. While the selection strategy favours the fittest solutions, it also ensures that a variety of solutions are retained in the population. This refreshing of the population not only allows the integration of new potential solutions but also promotes the search for even better ones in subsequent iterations. The iterative nature of the GPBA, paired with the clustering-based mutation, ensures that the population evolves over time, getting progressively closer to the optimal solution with each iteration.
The steps, sketched in code below, are as follows:

• Selection: generate k individuals randomly as initial seeds for the k-means algorithm.

• Generation: generate M solutions by employing the clustering-based mutation, yielding a set of solutions denoted vclu.

• Replacement: choose M solutions at random from the population and denote them B.

• Update: select the top M solutions from the set vclu ∪ B as B′; the new population is then the union of (P − B) and B′.
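A minimal sketch of this selection-generation-replacement-update cycle follows; the k-means call, the quadratic fitness placeholder, the random range for k, and all parameter values are assumptions for illustration, not the exact settings used in our experiments.

```python
import numpy as np
from sklearn.cluster import KMeans

def clustering_based_update(pop, fitness_fn, M=5, F=0.5, rng=None):
    """One cycle of the clustering-based mutation and GPBA-style update."""
    rng = rng or np.random.default_rng()
    fit = np.array([fitness_fn(x) for x in pop])

    # Selection: cluster the population; k is drawn at random (range is an assumption).
    k = int(rng.integers(2, max(3, int(np.sqrt(len(pop)))) + 1))
    labels = KMeans(n_clusters=k, n_init=10).fit_predict(pop)

    # The superior cluster has the lowest mean fitness (minimisation).
    best_c = min(range(k), key=lambda c: fit[labels == c].mean())
    x_win = pop[labels == best_c][fit[labels == best_c].argmin()]

    # Generation: M mutants around the superior cluster's best member.
    v_clu = []
    for _ in range(M):
        r1, r2 = rng.choice(len(pop), 2, replace=False)
        v_clu.append(x_win + F * (pop[r1] - pop[r2]))
    v_clu = np.array(v_clu)

    # Replacement/Update: pick M random members B; keep the best M of v_clu U B.
    b_idx = rng.choice(len(pop), M, replace=False)
    union = np.vstack([v_clu, pop[b_idx]])
    union_fit = np.array([fitness_fn(x) for x in union])
    b_prime = union[np.argsort(union_fit)[:M]]

    pop[b_idx] = b_prime  # new population = (P - B) U B'
    return pop

pop = np.random.uniform(-1, 1, (30, 20))  # 30 candidate weight vectors
pop = clustering_based_update(pop, lambda x: float(np.sum(x**2)))
```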
Encoding strategy
The encoding approach employed in our research organises the CNN and feed-forward weights into a vector that represents a candidate solution in the improved DE. Assigning precise weights is challenging; nonetheless, after a few trials we devised an encoding approach that is as pertinent as possible. Figure 2 demonstrates an example of the encoding process for a three-layer CNN with three filters per layer and a feed-forward network with three hidden layers. It should be highlighted that each weight matrix is preserved as a row of the vector.
[Figure 2: Example of the encoding process for a three-layer CNN with three filters per layer and a feed-forward network with three hidden layers. Image omitted; see PDF.]
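As an illustration of this flattening, the snippet below uses PyTorch's `parameters_to_vector` and `vector_to_parameters` utilities to move between a model's weights and a flat candidate vector; the tiny stand-in model and the random perturbation are assumptions, not the architecture or update actually used in the experiments.

```python
import torch
import torch.nn as nn
from torch.nn.utils import parameters_to_vector, vector_to_parameters

# Stand-in model: a small CNN followed by a feed-forward head (assumes 8x8 inputs).
model = nn.Sequential(
    nn.Conv2d(1, 4, kernel_size=3, padding=1), nn.ReLU(), nn.Flatten(),
    nn.Linear(4 * 8 * 8, 16), nn.ReLU(), nn.Linear(16, 2),
)

# Encode: every weight matrix and bias becomes one segment of a flat vector.
candidate = parameters_to_vector(model.parameters())
print(candidate.shape)  # one DE individual

# Decode: after DE evolves the vector, copy it back into the network to serve
# as the starting point for back-propagation.
evolved = candidate + 0.01 * torch.randn_like(candidate)  # placeholder perturbation
vector_to_parameters(evolved, model.parameters())
```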
Fitness function
In the enhanced DE algorithm, the effectiveness of a solution is evaluated using a fitness function based on the discrepancy between predicted and target labels, computed as the mean squared error:

$$\mathrm{fitness} = \frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2,$$

where, for the $i$th sample, $y_i$ and $\hat{y}_i$ denote the target and predicted labels, respectively, and $N$ represents the total number of instances.
Classification
We confront an imbalanced classification challenge due to the disparity in the number of samples between our two classes. To tackle this, we formulate a sequential decision problem using RL. In RL, an agent seeks the optimal policy by executing a sequence of actions in the environment with the objective of maximising its performance; this section draws its inspiration from ref. [51]. In our approach, the agent is given an instance of the dataset in every iteration and is required to classify it. The environment then sends the agent an immediate reward: a correct classification earns a positive reward, whilst an incorrect classification incurs a negative one. The agent attains the optimal policy by maximising cumulative reward. Let D = {(x1, y1), (x2, y2), (x3, y3), …, (xN, yN)} denote an imbalanced collection of images with N instances, where xi represents the ith image and yi its corresponding label. The predesignated components are as follows:
• Policy πθ: a policy is a function that maps the set of states S to the set of actions A; πθ(st) denotes taking action at while in state st. The classification model with weights θ serves as πθ.

• State st: the data point xt is mapped to a state st in the dataset D, with the first state s1 corresponding to the initial data point x1. D is shuffled in each iteration so that the model does not memorise a fixed sequence.

• Action at: action at is the predicted label for xt. Because the classification is binary, at ∈ {0, 1}, denoting the minority and majority classes, respectively.

• Reward rt: the reward reflects the outcome of an action. The agent is rewarded positively when it correctly classifies the input sample; otherwise, a negative reward is given. The magnitude of the reward differs between the two classes, and when reward and action are configured properly, this can considerably improve model performance. Following ref. [57], the reward is specified as

$$r_t = \begin{cases} +1, & a_t = y_t \text{ and } x_t \in D_H, \\ -1, & a_t \ne y_t \text{ and } x_t \in D_H, \\ +\lambda, & a_t = y_t \text{ and } x_t \in D_S, \\ -\lambda, & a_t \ne y_t \text{ and } x_t \in D_S. \end{cases}$$

In this context, DH and DS represent the minority and majority classes, respectively, while λ is a value within the range [0, 1]. Since the minority class is critical due to the scarcity of its data, λ is assigned a value less than 1, so that the minority-class rewards of ±1 outweigh the majority-class rewards of ±λ and the minority class is given greater priority. We examine the significance of λ in the results section.

• Terminal E: multiple terminal states may be reached during each training episode, signifying its completion. The sequence of state-action-reward transitions from an initial state to a terminal state within an episode is represented as (s1, a1, r1), (s2, a2, r2), (s3, a3, r3), …, (st, at, rt). An episode terminates under two conditions: either all training data are classified correctly or an instance from the minority class is misclassified.

• Transition probability P: the agent's transition from state st to state st+1 is determined by the order in which the data are read, with the probability of this transition denoted p(st+1|st, at).
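The reward scheme and termination rule translate directly into code; the sketch below assumes label 0 marks the minority class, uses a placeholder policy, and sets λ to the value that performs best in the sweep reported later.

```python
import random

LAMBDA = 0.4  # majority-class reward scale; 0.4 performed best in our later sweep

def reward(action, label, minority_label=0):
    """+/-1 for the minority class, +/-lambda for the majority class."""
    scale = 1.0 if label == minority_label else LAMBDA
    return scale if action == label else -scale

def run_episode(data, policy, minority_label=0):
    """One training episode: ends on a minority mistake or after all samples."""
    random.shuffle(data)              # shuffle so the agent cannot memorise order
    total = 0.0
    for x, y in data:
        a = policy(x)                 # the classifier acts as the policy
        r = reward(a, y, minority_label)
        total += r
        if r < 0 and y == minority_label:
            break                     # terminal: a minority sample was misclassified
    return total

data = [(i, 0 if i % 3 == 0 else 1) for i in range(30)]  # toy imbalanced data
print(run_episode(data, policy=lambda x: x % 2))         # placeholder policy
```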
EMPIRICAL EVALUATION
Dataset
A study on myocarditis was carried out in Tehran using CMR, spanning one year from September 2016 to September 2017. Throughout the project, patients who showed clinical indications of myocarditis but had not been conclusively diagnosed underwent CMR scans whenever the medical practitioner determined that the CMR results would affect their clinical management. The research protocol obtained ethical approval from the local committee, and a 1.5 T system was used for the CMR investigation [58, 59].
To identify patients with positive evidence of myocarditis, CMR scans of 586 patients were utilised, each potentially indicating one or more disease regions. A total of 307 healthy people were also included as a control group. For the analysis, eight CMR images were acquired from each control subject or patient: one short-axis image and one long-axis image for each of four CMR sequences, namely late gadolinium enhancement, perfusion, T2-weighted, and steady-state free precession. The finalised dataset consists of 4686 sick and 2449 healthy samples. Figure 3 illustrates images from this dataset. It is noteworthy that the analysis is performed at the image level rather than the patient level: regardless of how many images are available for a patient, each prediction is based on a single image.
[Figure 3: Sample CMR images from the Z-Alizadeh Sani myocarditis dataset. Image omitted; see PDF.]
Details of model
Python and the PyTorch framework were utilised in this project, with the code written in Jupyter notebooks. The CNN comprises five two-dimensional convolutional layers with 256, 128, 64, 32, and 16 filters, respectively. For both dimensions, each layer's padding, stride, and kernel size are 3, 2, and 1, respectively, and each convolutional layer is followed by a 2 × 2 max-pooling layer. The network head consists of three fully connected layers with 256, 128, and 64 hidden units. To avoid overfitting, early stopping and dropout with a probability of 0.5 are utilised. The batch size is fixed at 64 in every epoch. The dataset comprises grayscale images with pixel intensities normalised to [0, 1]; images of various sizes were resized to 100 × 100 for analysis.
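A rough PyTorch sketch of this backbone follows. Because the padding/stride/kernel combination above is ambiguous as typeset, the sketch assumes 3 × 3 kernels with stride 1 and padding 1 so that five 2 × 2 pooling stages fit the 100 × 100 input; activation functions and layer ordering are likewise assumptions.

```python
import torch
import torch.nn as nn

class MyocarditisCNN(nn.Module):
    """Five conv blocks (256/128/64/32/16 filters) + three FC layers (256/128/64)."""
    def __init__(self, n_classes=2):
        super().__init__()
        blocks, in_ch = [], 1  # grayscale input
        for out_ch in (256, 128, 64, 32, 16):
            blocks += [
                nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=1, padding=1),
                nn.ReLU(),
                nn.MaxPool2d(2),  # 100 -> 50 -> 25 -> 12 -> 6 -> 3
            ]
            in_ch = out_ch
        self.features = nn.Sequential(*blocks)
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(16 * 3 * 3, 256), nn.ReLU(), nn.Dropout(0.5),
            nn.Linear(256, 128), nn.ReLU(), nn.Dropout(0.5),
            nn.Linear(128, 64), nn.ReLU(),
            nn.Linear(64, n_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))

model = MyocarditisCNN()
out = model(torch.randn(64, 1, 100, 100))  # a batch of 64 resized grayscale images
```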
Experimental results
To appraise the proposed strategy in the classification procedure, six standard performance metrics are employed: Accuracy, Recall, Precision, F-measure, Specificity, and G-means [60]. These metrics act as numerical indicators of the efficiency and resilience of our classification model.
To evaluate the performance and robustness of our models, we employed stratified k-fold (k = 5) cross-validation in all experiments. The dataset is divided into k equal-sized, representative folds; in each iteration, one fold is held out for testing while the remaining k − 1 folds are used for training, and the process is repeated k times so that each fold serves as the test set once. This ensures that every sample is used for both training and testing and that performance metrics are averaged over multiple iterations, yielding a reliable estimate of model performance. Stratification is particularly important when the dataset exhibits class imbalance, as is common in medical datasets where rare conditions or minority classes are of critical importance: each fold contains a representative proportion of samples from each class, minimising the bias that an imbalanced split could introduce. By evaluating on multiple diverse subsets of the data, we can assess the generalisation capability of the models, identify potential overfitting or underfitting, and reduce the impact of the random variation that a single train-test split would incur.
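The protocol maps to a few lines of scikit-learn, as sketched below; the random features and the logistic-regression stand-in are placeholders for the actual images and model.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold

# Placeholder data with this dataset's class ratio: 4686 sick (1) vs. 2449 healthy (0).
X = np.random.rand(7135, 32)           # stand-in features, not real CMR images
y = np.array([1] * 4686 + [0] * 2449)

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
scores = []
for train_idx, test_idx in skf.split(X, y):
    # Each fold preserves the ~66/34 class ratio in both partitions.
    clf = LogisticRegression(max_iter=200).fit(X[train_idx], y[train_idx])
    scores.append(clf.score(X[test_idx], y[test_idx]))

print(f"accuracy: {np.mean(scores):.3f} +/- {np.std(scores):.3f}")
```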
First, we compared the two published studies in this field, referred to as CNN-KCL and RLMD-PA, with our proposed method. To assess the two separate components of our model, improved DE and RL, a baseline without either component (MD + random weight) is compared with MD + IDE and MD + RL, which employ improved DE and RL, respectively, for training. Table 1 presents the results of our proposed approach and the models mentioned above on the Z-Alizadeh Sani myocarditis dataset. The proposed model decreases the error by more than 32%. It outperforms the CNN-KCL method and the MD + random weight, MD + IDE, and MD + RL combinations of its components in terms of the mean of all performance metrics. Across all measured criteria, both improved DE and RL outperform the basic CNN network, which supports combining optimised initial weights with RL. The best model was found after 122 iterations, taking 0.5 h, whereas CNN-KCL and RLMD-PA found their best models after 283 and 157 iterations, taking 1.5 and 2 h, respectively.
TABLE 1 Results of deep learning and traditional algorithms on the Z-Alizadeh Sani myocarditis dataset.
Method | Accuracy | Recall | Precision | F-measure | Specificity | G-means
CNN-KCL | 0.804 ± 0.015 | 0.744 ± 0.033 | 0.735 ± 0.036 | 0.740 ± 0.030 | 0.838 ± 0.027 | 0.792 ± 0.028 |
RLMD-PA | 0.878 ± 0.019 | 0.845 ± 0.016 | 0.829 ± 0.035 | 0.837 ± 0.017 | 0.896 ± 0.026 | 0.869 ± 0.015 |
SVM | 0.727 ± 0.023 | 0.786 ± 0.029 | 0.606 ± 0.024 | 0.684 ± 0.026 | 0.692 ± 0.02 | 0.737 ± 0.023 |
KNN | 0.718 ± 0.100 | 0.706 ± 0.020 | 0.631 ± 0.151 | 0.660 ± 0.090 | 0.724 ± 0.150 | 0.713 ± 0.082 |
Naïve Bayes | 0.699 ± 0.017 | 0.782 ± 0.039 | 0.572 ± 0.016 | 0.661 ± 0.023 | 0.649 ± 0.013 | 0.712 ± 0.020 |
Logistic Regression | 0.666 ± 0.016 | 0.679 ± 0.015 | 0.545 ± 0.018 | 0.605 ± 0.017 | 0.659 ± 0.018 | 0.669 ± 0.016 |
Random forests | 0.582 ± 0.022 | 0.665 ± 0.033 | 0.461 ± 0.020 | 0.544 ± 0.024 | 0.531 ± 0.022 | 0.594 ± 0.022 |
MD + random weight | 0.762 ± 0.035 | 0.697 ± 0.045 | 0.679 ± 0.047 | 0.687 ± 0.046 | 0.801 ± 0.031 | 0.747 ± 0.038 |
MD + IDE | 0.871 ± 0.032 | 0.864 ± 0.044 | 0.832 ± 0.040 | 0.838 ± 0.031 | 0.885 ± 0.016 | 0.876 ± 0.030 |
MD + RL | 0.893 ± 0.025 | 0.870 ± 0.042 | 0.851 ± 0.049 | 0.860 ± 0.032 | 0.910 ± 0.019 | 0.896 ± 0.034 |
Proposed (IDE + RL) | 0.910 ± 0.019 | 0.889 ± 0.038 | 0.875 ± 0.043 | 0.883 ± 0.029 | 0.926 ± 0.021 | 0.919 ± 0.027 |
For our project, we employed a powerful computing setup consisting of a 64-bit Windows operating system with 64 GB of RAM and a GPU, enabling us to efficiently train and evaluate our models. During the training phase, we observed that the best performing model for CNN-KCL, RLMD-PA, and the proposed model was achieved after 262, 230, and 185 epochs, respectively. This indicates that the models underwent an iterative learning process, gradually improving their performance over multiple training iterations. In terms of computational time, the training process for CNN-KCL, RLMD-PA, and the proposed model took approximately 2, 3.5, and 2.5 h, respectively.
Conventional machine learning methods have limitations when it comes to recognising medical images due to their approach of viewing images as one-dimensional vectors and separating neighbouring pixels. To address this challenge, we conducted an evaluation of five different classification methods, including support vector machine (SVM) [61], k-nearest neighbour [62], naive Bayes [63], logistic regression [64], and random forests [65]. The purpose of this evaluation was to determine the effectiveness of these methods in classifying the CMR images and compare them against the performance of our deep model. After careful analysis, we found that while SVM outperformed the other approaches, the deep model still exhibited superior performance in classifying the medical images. The deep model utilised advanced techniques in deep learning, allowing it to capture complex patterns and dependencies present in the CMR images. By leveraging its multiple layers of interconnected neurons, the deep model could automatically learn hierarchical representations that capture intricate features at different levels of abstraction. This ability to extract and analyse intricate features played a crucial role in achieving superior classification performance. In contrast, the standard machine learning classifiers treated the images as one-dimensional vectors, disregarding the spatial relationships between neighbouring pixels. As a result, they were unable to capture the intricate details and contextual information crucial for accurate classification of medical images. Table 1 provides a summary of our observations from the evaluation, highlighting the performance of each method. Although SVM demonstrated relatively better results compared to the other methods, it still fell short when compared to the deep model's classification accuracy. These findings underscore the significance of leveraging deep learning approaches in medical image recognition tasks. The ability of deep models to learn complex representations and capture spatial dependencies within images makes them well-suited for analysing and classifying medical data. The continuous advancements in deep learning techniques and the availability of large-scale medical image datasets hold great potential for further improving the accuracy and reliability of medical image analysis and diagnosis.
Explore other metaheuristic algorithms
The proposed model incorporates an enhanced DE approach for weight initialisation, which has a significant impact on the subsequent BP process. To evaluate its effectiveness, we compared it against five conventional algorithms: gradient descent with momentum BP (GDM) [66], gradient descent with adaptive learning rate BP (GDA) [67], gradient descent with momentum and adaptive learning rate BP (GDMA) [68], one-step secant BP (OSS) [69], and Bayesian regularisation BP (BR) [70]. Additionally, we included four meta-heuristic algorithms in the comparison: Grey Wolf Optimisation (GWO) [71], the bat algorithm (BAT) [72], the Cuckoo Optimisation Algorithm (COA) [73], and the original DE [74]. The additional parameter settings are presented in Table 2, and the performance measures used for comparison are reported in Table 3. The meta-heuristic algorithms consistently outperformed the conventional algorithms in terms of accuracy, recall, and F-measure. Notably, the enhanced DE algorithm exhibited superior performance compared with both the traditional and the meta-heuristic algorithms, reducing recall and F-measure errors by over 30% and 27%, respectively. These results highlight the effectiveness of the proposed enhanced DE approach in improving classification performance for the given task.
TABLE 2 Parameter setting for research.
Algorithm | Parameter | Value |
GWO | None (parameter-free) | —
BAT | Constant for loudness update | 0.50
BAT | Constant for pulse emission rate update | 0.50
BAT | Initial pulse emission rate | 0.001
COA | Discovery rate of alien solutions | 0.25
DE | Scaling factor | 0.5
DE | Crossover probability | 0.8
TABLE 3 Results of various optimisation methods on the Z-Alizadeh Sani myocarditis dataset.
Method | Accuracy | Recall | Precision | F-measure | Specificity | G-means
GDM | 0.829 ± 0.021 | 0.804 ± 0.019 | 0.793 ± 0.039 | 0.804 ± 0.036 | 0.878 ± 0.029 | 0.841 ± 0.024 |
GDA | 0.841 ± 0.019 | 0.811 ± 0.024 | 0.781 ± 0.043 | 0.794 ± 0.021 | 0.864 ± 0.031 | 0.839 ± 0.017 |
GDMA | 0.857 ± 0.014 | 0.810 ± 0.032 | 0.804 ± 0.030 | 0.809 ± 0.035 | 0.879 ± 0.028 | 0.849 ± 0.028 |
OSS | 0.845 ± 0.011 | 0.800 ± 0.023 | 0.794 ± 0.021 | 0.797 ± 0.026 | 0.876 ± 0.013 | 0.839 ± 0.024 |
BR | 0.833 ± 0.010 | 0.783 ± 0.010 | 0.780 ± 0.042 | 0.786 ± 0.017 | 0.869 ± 0.036 | 0.827 ± 0.003 |
GWO | 0.856 ± 0.019 | 0.802 ± 0.023 | 0.799 ± 0.024 | 0.801 ± 0.024 | 0.879 ± 0.016 | 0.842 ± 0.019 |
BAT | 0.855 ± 0.021 | 0.798 ± 0.029 | 0.810 ± 0.019 | 0.801 ± 0.018 | 0.885 ± 0.012 | 0.833 ± 0.018 |
COA | 0.849 ± 0.010 | 0.806 ± 0.006 | 0.789 ± 0.032 | 0.790 ± 0.045 | 0.862 ± 0.039 | 0.837 ± 0.031 |
DE | 0.893 ± 0.014 | 0.871 ± 0.020 | 0.852 ± 0.031 | 0.870 ± 0.019 | 0.903 ± 0.026 | 0.901 ± 0.017 |
Explore the reward function
Rewards of ±1 and ±λ for correct and incorrect classifications are assigned to the minority and majority classes, respectively. The choice of λ is intrinsically tied to the ratio of majority to minority samples: as this ratio increases, the optimal λ value is expected to decrease. To examine the impact of λ, we evaluated the proposed model for λ values spanning from 0 to 1 in increments of 0.1, while keeping the minority-class reward fixed. The results are shown in Figure 4. When λ is fixed at 0, the influence of the majority class is virtually nullified, whereas at λ = 1 the two classes are weighted equally. As Figure 4 shows, the model's performance peaks at λ = 0.4 across all evaluated metrics, indicating that the most advantageous λ lies neither at 0 nor at 1 but in between. It is vital to note that, although reducing λ mitigates the influence of the majority class, an excessively low value impinges on the overall efficacy of the model. These observations emphasise that the choice of λ markedly influences performance; its optimal value depends on the relative proportions of majority and minority samples and must be selected carefully to achieve the most favourable outcomes.
[Figure 4: Performance of the proposed model across λ values from 0 to 1 on the Z-Alizadeh Sani myocarditis dataset. Image omitted; see PDF.]
Impact of loss function
In addressing skewed data distributions within this classification task, conventional remedies such as data augmentation and the adaptation of the loss function are commonly employed. Of these, the selection of an apt loss function is particularly significant, as it can effectively accentuate the importance of the minority class. In this study, we conducted a rigorous evaluation of five distinct loss functions in relation to our proposed model: Weighted Cross-Entropy (WCE) [75], Balanced Cross-Entropy (BCE) [76], Focal Loss (FL) [77], Dice Loss (DL) [78], and Tversky Loss (TL) [79]. Notably, the BCE and WCE loss functions ensure unbiased consideration by weighting positive and negative samples. Among these functions, FL has proven an effective tool for applications wrestling with imbalanced data: it attributes lower weight to simpler examples and higher weight to complex ones, emphasising the challenging, often underrepresented cases without downplaying the simpler ones. As evidenced in Table 4, FL yields a lower error rate than TL, with reductions of between 15% and 29% across the accuracy and F-measure metrics, demonstrating FL's competence in managing imbalanced data. Notwithstanding these promising outcomes, the performance of FL still falls short of RL by a significant 35% margin, implying that while FL is a feasible option for handling imbalanced data, more sophisticated techniques such as RL can offer more precise results.
TABLE 4 Performance metrics of various loss functions for myocarditis diagnosis on the Z-Alizadeh Sani myocarditis dataset.
Loss | Accuracy | Recall | Precision | F-measure | Specificity | G-means
WCE | 0.845 ± 0.003 | 0.778 ± 0.013 | 0.803 ± 0.021 | 0.790 ± 0.005 | 0.885 ± 0.021 | 0.830 ± 0.015 |
BCE | 0.822 ± 0.001 | 0.824 ± 0.024 | 0.745 ± 0.000 | 0.783 ± 0.031 | 0.821 ± 0.035 | 0.822 ± 0.006 |
FL | 0.868 ± 0.026 | 0.833 ± 0.027 | 0.819 ± 0.006 | 0.826 ± 0.027 | 0.889 ± 0.036 | 0.861 ± 0.026 |
DL | 0.811 ± 0.024 | 0.837 ± 0.026 | 0.711 ± 0.009 | 0.769 ± 0.031 | 0.795 ± 0.001 | 0.816 ± 0.033 |
TL | 0.827 ± 0.013 | 0.816 ± 0.020 | 0.747 ± 0.007 | 0.780 ± 0.010 | 0.834 ± 0.026 | 0.825 ± 0.024 |
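Of the losses compared above, FL in particular admits a compact implementation; the sketch below follows the standard binary formulation of Lin et al. [77], with the α and γ values being common defaults rather than settings taken from this study.

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, alpha=0.25, gamma=2.0):
    """Binary focal loss: down-weights easy examples via the (1 - p_t)^gamma factor."""
    ce = F.binary_cross_entropy_with_logits(logits, targets.float(), reduction="none")
    p = torch.sigmoid(logits)
    p_t = p * targets + (1 - p) * (1 - targets)              # probability of the true class
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)  # class-balancing factor
    return (alpha_t * (1 - p_t) ** gamma * ce).mean()

loss = focal_loss(torch.randn(8), torch.randint(0, 2, (8,)))
```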
Discussion
This article focused on developing an approach for myocarditis detection from CMR images using a CNN, an improved DE algorithm for pre-training, and an RL-based model for training. One significant challenge was the imbalanced classification of the Z-Alizadeh Sani myocarditis dataset. To address it, the training process was designed as a sequential decision-making process in which the agent receives rewards or penalties for accurately or inaccurately classifying the minority or majority class, enhancing the model's ability to handle imbalanced data and achieve higher accuracy. Additionally, the article proposed an enhanced DE algorithm to initialise the BP process, effectively resolving the initialisation sensitivity problem commonly faced by gradient-based methods during training. Experimental results based on standard performance metrics demonstrated the effectiveness of the proposed model in diagnosing myocarditis.
To address the potential limitations of the suggested model, further investigations and improvements can be made in several key areas. Firstly, the model's reliance on a single dataset, the Z-Alizadeh Sani myocarditis CMR dataset, may limit its generalisability to other datasets with distinct features. To ensure broader applicability, it is crucial to evaluate the model's performance on independent datasets representing different populations, imaging protocols, and myocarditis characteristics. Such a comprehensive evaluation would provide insights into the model's robustness and its ability to adapt to diverse scenarios, enhancing confidence in its effectiveness across various settings.

Moreover, the model was developed using a retrospective study design, which has inherent limitations and potential biases. Prospective studies, in which data are collected under a clearly established protocol, would be better suited to validating the model's effectiveness in myocarditis diagnosis, as they offer a stronger basis for establishing cause-effect relationships and minimise the biases associated with retrospective analysis.

On the technical side, the quality of the input CMR images can influence the model's performance. Factors such as imaging techniques, scanner parameters, and patient-specific characteristics introduce variability in image quality, so techniques to standardise or enhance image quality during training and testing are needed. By addressing image quality issues, the model can detect and classify myocarditis accurately regardless of variations in image acquisition.

Additionally, the quantity and characteristics of myocarditis lesions vary across patients, posing a challenge for the model in handling lesions of different sizes, shapes, and locations. Training on a diverse range of myocarditis cases, together with augmentation techniques such as data augmentation or the incorporation of additional annotated datasets, can expand the diversity of training samples and improve the model's ability to handle lesion variability.

Beyond these considerations, future research can explore several further areas. Firstly, the generalisability and transferability of the model should be investigated by evaluating its effectiveness on a wider array of datasets, particularly those with a lower incidence of myocarditis, where distinguishing myocarditis from other cardiac abnormalities or healthy cases may prove more difficult. Subjecting the model to diverse datasets would allow its robustness, versatility, and potential limitations to be examined thoroughly, enhancing our understanding of its real-world applicability across varying prevalence rates. Secondly, future work can prioritise advanced deep learning segmentation techniques that not only detect the presence of myocarditis but also precisely delineate the specific location and intensity of the condition on CMR images.
By delving into the realm of segmentation, the model can provide detailed insights into the spatial extent and severity of myocarditis, empowering medical professionals with invaluable information for more precise diagnosis and treatment planning. This necessitates the creation of sophisticated segmentation algorithms that effectively exploit the rich information embedded within CMR images, enabling accurate localisation and quantification of myocarditis lesions. Such advancements would significantly contribute to the realisation of more precise and comprehensive cardiac healthcare practices, improving patient outcomes and guiding targeted therapeutic interventions.
CONCLUSION AND FUTURE DIRECTIONS
Myocarditis is a serious cardiovascular condition that can have significant consequences if not detected early and treated promptly. The proposed method in this article leverages advanced techniques, including CNN, an improved DE algorithm, and an RL-based algorithm, to address the challenges associated with identifying myocarditis using CMR images. CNNs are a type of neural network specifically designed to extract important features from images, making them well-suited for image recognition tasks. The improved DE algorithm is utilised to pre-train the CNNs, enhancing their performance prior to the application of the deep RL-based algorithm. One of the main challenges encountered in developing this approach is the imbalanced classification of the Z-Alizadeh Sani myocarditis dataset, where the majority of cases are normal and only a small fraction are abnormal. To tackle this issue, the proposed method adopts a sequential decision-making process where the agent receives rewards or penalties based on the accurate or inaccurate classification of the minority or majority class. This approach helps overcome the inherent imbalances commonly found in medical image analysis datasets. Additionally, a novel DE algorithm is introduced, incorporating a clustering-based mutation operator to initiate the back-propagation process and address the sensitivity of initialisation in gradient-based methods. The effectiveness of the proposed model in detecting myocarditis is demonstrated through experimental results using standard performance measures. The findings indicate that the proposed method outperforms other techniques in accurately categorising myocarditis images.
To further enhance the efficacy of the proposed method in detecting myocarditis, several potential future directions can be explored. One avenue is to investigate the use of other types of neural networks, such as recurrent neural networks (RNNs) or CNNs with attention mechanisms, to improve the accuracy of the model. Another area of research could involve exploring transfer learning techniques to adapt the pre-trained CNNs to new datasets with different characteristics. This would be particularly valuable in scenarios where obtaining large labelled datasets is challenging, such as in rare or specialised medical conditions. Furthermore, other optimisation methods could be explored to enhance the pre-training of the CNNs, potentially leading to improved feature extraction and higher classification accuracy. Additionally, alternative reward functions, such as entropy-based rewards, could be investigated for the RL-based algorithm to better address the challenges posed by imbalanced datasets. Lastly, the utilisation of multimodal imaging data, such as combining CMR images with other modalities like computed tomography (CT) or ultrasound, could be explored to enable a more comprehensive and accurate diagnosis of myocarditis.
CONFLICT OF INTEREST STATEMENT
The authors declare no conflicts of interest.
DATA AVAILABILITY STATEMENT
The dataset used to support the findings of this study is available on GitHub: .
Danaei, S., et al.: Myocarditis diagnosis: a method using mutual learning-based ABC and reinforcement learning. In: 2022 IEEE 22nd International Symposium on Computational Intelligence and Informatics and 8th IEEE International Conference on Recent Achievements in Mechatronics, Automation, Computer Science and Robotics (CINTI-MACRo), pp. 265–270. IEEE (2022)
Zhou, L., et al.: Usefulness of enzyme-free and enzyme-resistant detection of complement component 5 to evaluate acute myocardial infarction. Sensor. Actuator. B Chem. 369, 132315 (2022). https://doi.org/10.1016/j.snb.2022.132315
Xie, S., Tu, Z.: Holistically-nested edge detection. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1395–1403 (2015)
Lin, T., et al.: Focal loss for dense object detection. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2980–2988 (2017)
Sudre, C., et al.: Generalised Dice overlap as a deep learning loss function for highly unbalanced segmentations. In: Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support, pp. 240–248 (2017)
Salehi, S.S.M., Erdogmus, D., Gholipour, A.: Tversky loss function for image segmentation using 3D fully convolutional deep networks. arXiv e-prints, arXiv:1706 (2017)
© 2024. This work is published under the Creative Commons Attribution 4.0 licence (http://creativecommons.org/licenses/by/4.0/).
Abstract
Myocarditis is a serious cardiovascular ailment that can lead to severe consequences if not promptly treated. It is triggered by viral infections and presents symptoms such as chest pain and heart dysfunction. Early detection is crucial for successful treatment, and cardiac magnetic resonance imaging (CMR) is a valuable tool for identifying this condition. However, the detection of myocarditis using CMR images can be challenging due to low contrast, variable noise, and the presence of multiple CMR slices per patient. To overcome these challenges, the proposed approach incorporates advanced techniques such as convolutional neural networks (CNNs), an improved differential evolution (DE) algorithm for pre-training, and a reinforcement learning (RL)-based model for training. Developing this method presented a significant challenge due to the imbalanced classification of the Z-Alizadeh Sani myocarditis dataset from Omid Hospital in Tehran. To address this, the training process is framed as a sequential decision-making process, where the agent receives higher rewards/penalties for correctly/incorrectly classifying the minority/majority class. Additionally, the authors suggest an enhanced DE algorithm to initiate the backpropagation (BP) process, overcoming the initialisation sensitivity issue of gradient-based methods during the training phase. The effectiveness of the proposed model in diagnosing myocarditis is demonstrated through experimental results based on standard performance metrics. Overall, this method shows promise in expediting the triage of CMR images for automatic screening, facilitating early detection and successful treatment of myocarditis.
AUTHOR AFFILIATIONS
1 Department of Computer System and Technology, Faculty of Computer Science and Information Technology, Universiti Malaya, Kuala Lumpur, Malaysia
2 Centre for Artificial Intelligence Research (CAIR), Department of Information and Communication Technology, University of Agder, Grimstad, Norway
3 Department of Creative Technologies, Air University, Islamabad, Pakistan
4 School of Information and Communication Engineering, Hainan University, Haikou, Hainan, China
5 Institute for Intelligent Systems Research and Innovation (IISRI), Deakin University, Waurn Ponds, Victoria, Australia
6 Data Science and Computational Intelligence Institute, University of Granada, Granada, Spain