Correspondence to Dr May Y Choi; [email protected]
WHAT IS ALREADY KNOWN ON THIS TOPIC
Most machine learning models developed for SLE to date have been directed towards elucidating disease pathogenesis, improving diagnosis, and predicting disease-related outcomes.
WHAT THIS STUDY ADDS
This study provides an overview of machine learning techniques to discuss current gaps, challenges, and opportunities for SLE research.
Most SLE machine learning studies under-report key details of the model development and/or have not been externally validated to ensure they are effective, reliable, and safe to adopt into clinical practice.
The application of more advanced machine learning algorithms, such as deep learning, and the utilisation of complex, alternative datasets, including images, are increasing among SLE studies.
HOW THIS STUDY MIGHT AFFECT RESEARCH, PRACTICE OR POLICY
As machine learning continues to provide unprecedented opportunities to deliver transformative discoveries in SLE research and practice, researchers need to stay informed of the ethical, governance, and regulatory considerations around their use.
Introduction
Tremendous progress in our understanding of SLE pathogenesis, diagnosis and management has been made over the past 75 years, with most studies relying on traditional statistical techniques to evaluate and test hypotheses. While these approaches are still widely used, many researchers are turning to machine learning (ML) as a complementary method for assessing patterns that are not readily tested using traditional statistics. In the last 5 years alone, there has been an explosion of studies that have leveraged the power of ML to study SLE patient identification, risk prediction, diagnosis, disease subtype classification, progression, outcomes, monitoring and management. While it may seem that ML is the new shiny toy of the 21st century, the term ‘artificial intelligence’ (AI) was first described in 1955, the same year that antimalarial drugs were approved by the US Food and Drug Administration. AI has recently re-emerged as a valuable approach in medicine because of the enormous growth in computing power and the increasing availability of extensive and comprehensive ‘big data’ for analysis. As SLE researchers continue to amass more data on SLE, a complex, multifactorial and heterogeneous disease, traditional statistical techniques may no longer be the most effective or efficient methods, particularly in this era focused on precision medicine. In this review, we will provide an overview of ML and its current and future potential applications to SLE research.
Why ML in SLE?
Although ML and AI are often used interchangeably, ML is a subset of AI (figure 1). AI is the development of machines and systems that can imitate tasks that normally require intelligent human behaviour. ML algorithms allow computers to perform specific tasks by learning from the data rather than being explicitly programmed with instructions such as traditional statistical tests. Some other important differences between ML and traditional statistics are described in table 1. Understanding the advantages and disadvantages of both approaches may help inform one’s decision on which methods to use. In general, if the purpose of a project is to create an algorithm that can make predictions for a particular outcome and a large dataset is available, an ML approach may be a better option. If the purpose is to examine a relationship between variables or make inferences from a smaller dataset, then a traditional statistical model may be the better approach.
Figure 1. Categories of machine learning. Machine learning is a type of artificial intelligence. Within machine learning, there are three main categories: supervised, unsupervised and reinforcement learning. Deep learning is a subtype of machine learning that can involve supervised, unsupervised or reinforcement learning. Within each category, there are many different types of machine learning algorithms. Many factors can influence the choice of a specific algorithm. These include amount and type of data (eg, if images or videos are included in the data, a neural network will probably be preferred); how important interpretability is to your context (eg, decision trees or regression models are typically more interpretable, although this is an active area of research); and any computer memory or computational restriction. As no particular model consistently performs better than the others, it is typical to develop several models using multiple algorithms and then compare their performance using different metrics. CNN, convolutional neural network; CVD, cardiovascular disease; LASSO, least absolute shrinkage and selection operator; RNN, recurrent neural network.
Table 1. Key differences between machine learning and traditional statistical approaches
Machine learning | Traditional statistics |
Large dataset. | Small to mid-sized dataset. |
Low interpretability. | High interpretability. |
Can include 10s–1000s of variables in a single model. | Limited in number of variables included in a single model (<10). |
Certain models have considerable flexibility around data distribution and can model many different types of non-linear relationships. | Often assumes normality of data and/or linear relationships. |
Can tailor which performance parameter to maximise (accuracy, sensitivity, etc). | Maximises accuracy. |
Uses a portion of the data to develop the model. | Uses all available data to develop the model. |
Can capture patterns across multiple variables. | Limited ability to capture patterns across multiple variables (interaction terms). |
Can adjust the generalisability/penalty for the coefficients to prevent overfitting. | Unable to adjust the generalisability/penalty for the coefficients to prevent overfitting. |
In this technological age, researchers have greater access to large datasets of different types of information on patients with SLE. Types of datasets in SLE include demographic, clinical, histological, genetic and immune-related biomarkers (eg, autoantibodies, immune cell types, cytokines) in biological fluids, electronic medical records (EMR), images (eg, MRI, ultrasound) and other ‘omics’ (eg, proteomics, metabolomics). While this presents an important opportunity to study a remarkably heterogeneous and complex disease like SLE, the volume and density of data can also make it challenging to draw statistical inferences from large datasets, especially given the potential to identify false positive associations. Hence, ML can be a more efficient and accurate approach to understanding the patterns in complex datasets.
The more technical aspects of ML as they apply to systemic autoimmune rheumatic diseases are reviewed in greater detail elsewhere.1–5 In brief, the ML categories that are often applied to study medical data are supervised and unsupervised. In supervised ML, or a task-driven approach, a ‘training dataset’ is used to develop an algorithm to recognise patterns that are associated with ‘labels’. This algorithm is then tested in a ‘test dataset’ to see how well it performs. In unsupervised ML or a data-driven approach, the training data are ‘unlabeled’, and the algorithm attempts to identify patterns within the dataset. In addition to supervised and unsupervised ML, another less commonly applied type of ML is called reinforcement learning. This type of ML is based on trial and error, with ‘reward’ or ‘punishment’ driving the learning process and skills acquisition. Within the three ML categories, a variety of ML algorithms exist, such as deep learning algorithms based on artificial neural networks (ANN), a modality that involves multiple layers of connected data, which can recognise complex patterns across different types of data, including images, video, and acoustic data. As we will discuss later, most ML studies in SLE employ both supervised and unsupervised models.
To determine which ML model to use, researchers consider several important factors including the characteristics of the input data (labelled vs unlabelled), the desired outcome (predicting a category or quantity), the modality of the data (eg, text, image) and volume of input data (figure 1). It is common to employ several algorithms and then compare their performance using different metrics to select the best model. For supervised models, it is ideal to assess the sensitivity, specificity, positive predictive value, negative predictive value, accuracy, F-score and area under the receiver operating characteristic curve (AUC), although particular emphasis may be placed on a subset of metrics depending on the context. The F-score is a single metric that combines the sensitivity and the positive predictive value of a model, and a high F-score requires good performance by both of those metrics. In traditional statistics, this is often referred to as the accuracy or line of best fit, but with ML, the F-score may be better suited to assess the training of a model. For unsupervised clusters, several techniques exist to ensure the number of identified clusters accurately reflects the data. These include the elbow method,6 the Bayesian information criterion7 or a gap statistic.8 Once satisfied, statistical differences between clusters can be assessed using traditional methods, such as χ2 tests or analysis of variance.
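The supervised metrics listed above can all be derived from the four cells of a confusion matrix. The following is a minimal sketch only: the function name and the counts are hypothetical, the F-score shown is the common F1 variant (the harmonic mean of sensitivity and positive predictive value), and all denominators are assumed to be non-zero.

```python
def classification_metrics(tp, fp, tn, fn):
    """Common supervised-model metrics from confusion-matrix counts.

    Assumes every denominator is non-zero (illustrative sketch only).
    """
    sensitivity = tp / (tp + fn)   # proportion of true cases detected
    specificity = tn / (tn + fp)   # proportion of controls correctly excluded
    ppv = tp / (tp + fp)           # positive predictive value
    npv = tn / (tn + fn)           # negative predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # F1: harmonic mean of sensitivity and PPV; high only if both are high
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical counts: 40 SLE cases and 160 controls
metrics = classification_metrics(tp=30, fp=20, tn=140, fn=10)
print(metrics)  # accuracy is 0.85, whereas F1 is about 0.67
```

In this made-up imbalanced example, accuracy looks reassuring while the F1 score is noticeably lower, which illustrates why emphasis on a single metric can mislead.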
Building and evaluating the ML models occur as the final steps of an established ML pipeline (figure 2). After the data are collected, they are preprocessed (data cleaning, filling in missing data, etc), followed by data splitting, feature importance evaluation and selection, and then finally the ML models are built and evaluated. Feature selection is a process that allows the researcher to identify the best set of features that will help build optimised ML models (reviewed in ref 9). Feature selection is typically used with supervised algorithms, while dimensionality reduction is used in unsupervised clustering. Reports often use multiple supervised and unsupervised feature selection methods together. Examples of feature selection include recursive feature elimination,10 least absolute shrinkage and selection operator (LASSO)11 and support vector machines (SVM).12 These methods help identify covariables that are of greatest clinical and statistical importance.
Figure 2. Machine learning pipeline with consideration of the ethical, governance and regulation issues at every stage before clinical adoption of the model.
ML reports in SLE
A scoping review was performed to summarise the major ML reports of SLE to date. A PubMed search of ‘lupus’ and ‘machine learning’ Medical Subject Heading terms was performed on 24 November 2023 (figure 3). One hundred and ninety-one publications from 1992 to 2023 were identified, of which 133 were original reports. The remaining publications were review articles or unrelated topics (eg, not SLE, non-human, not ML). Over the last 31 years, there has been an exponential increase in the number of ML and SLE-related publications, similar to trends reported in other autoimmune rheumatic diseases.1 5 13 As this was not a systematic review, we acknowledge that we may have omitted some studies related to ML and SLE. However, we believe that we have captured most publications allowing for an accurate representation of the field and an in-depth discussion in our paper.
Figure 3. Number of SLE-related studies using machine learning methods. There has been an exponential growth of reports over the past 31 years based on PubMed database of publications when we searched ‘machine learning’ and ‘lupus’. The majority of reports were related to diagnosis (including neuropsychiatric and dermatological manifestations), followed by disease activity (including renal flares, extrarenal flares and treatment response), complications, pathogenesis, and mixed reports.
As ML research becomes increasingly recognised and valued in SLE, it is imperative that it is conducted in a methodologically rigorous manner to yield meaningful and useful results to relevant stakeholders and end users. Since ML methods are relatively new to the field, assessing the quality or technical aspects of these reports may be challenging to most non-ML researchers. A recent systematic review by Munguía-Realpozo et al14 assessed 45 SLE reports that used ML to build diagnostic and/or predictive algorithms and determined whether they adhered to the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) reporting standards.15 The review concluded that most reports were deficient in multiple domains of the TRIPOD recommendations, often under-reporting relevant details about their data preprocessing, model-building process, model specification and model performance.
In this scoping review, we will discuss ML approaches used in SLE reports following the outline of an ML pipeline (figure 2). While the aim of the study was not to systematically evaluate the reporting adherences of these reports, in general, we found similar limitations identified by Munguía-Realpozo et al.14 This highlights that there is a need to improve transparency and reporting of prediction models in future ML SLE studies.
Data collection
Given that SLE is an uncommon disease, it was not unexpected that the sample sizes for most reports (median 158 patients with SLE (IQR 61–681)) were relatively small. Overfitting and inappropriate generalisation from a small training dataset are important limitations of ML.16 Twenty-five (18.7%) reports evaluated greater than 1000 patients and seven (5.2%) reports assessed greater than 5000 patients. Most of these larger reports used EMRs and administrative databases to identify patients with SLE, recognising that these types of data may be limited by diagnostic misclassification.17–21 Many reports experienced ‘class imbalance’, where the SLE group sample size was considerably smaller compared with healthy controls, potentially biasing ML in favour of the more prevalent class. To address this, some reports used generative adversarial networks22 23 and Synthetic Minority Oversampling TEchnique (SMOTE)20 24 to generate synthetic data.
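The core idea behind SMOTE can be sketched in a few lines: each synthetic point is placed at a random position on the line segment between a minority-class sample and one of its nearest minority-class neighbours. The function below is a simplified illustration under that assumption, not the reference implementation used by any of the reviewed reports; the name `smote_sketch`, the parameter defaults and the toy data are hypothetical.

```python
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic minority-class points by interpolation."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by squared distance to the base point
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])   # one of the k nearest neighbours
        gap = rng.random()                   # position along the segment
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


# Hypothetical minority class described by two numeric features
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote_sketch(minority, n_new=6)
print(new_points)
```

Because every synthetic point is an interpolation of observed samples, it stays within the range of the original minority data; oversampling of this kind is usually applied to the training split only.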
The data density for most SLE reports does not derive from the patient cohort size but from the large number of variables on each patient considered for the ML models. Types of data used included demographic (n=43 reports) and clinical (n=51) data from cohort registries and several using EMRs (n=13). Data from biopsies included renal (n=6) and lymph node tissue (n=1). Biomarker data included autoantibodies (n=37), immune cell subtypes (eg, CD4+ and CD8+ T cells) (n=26) and other immune markers (eg, complement levels, platelet counts) (n=24), cytokines (n=8), genetics and transcriptomics (n=47), urinary markers (n=9), proteomics (n=5) and lipidomics/metabolomics (n=11). The application of ML to genetics and transcriptomics (eg, RNA sequencing (RNA-seq)) is particularly popular, largely due to the flexibility of ML for managing the vast amount of data obtained from each patient. The feature selection and dimensionality reduction techniques of ML offer a means to handle the large number of potentially relevant covariables. Alternative datasets included images (brain MRI for neuropsychiatric SLE (NPSLE) (n=9), clinical images of cutaneous lupus erythematosus (LE) (n=1) and funduscopic images for lupus retinopathy (n=1)), EKG abnormalities (n=1) and meteorological/environmental indicators (eg, air humidity, air pressure, sulfur dioxide, nitrogen dioxide, particle pollution from fine particulates) (n=2).
Data preprocessing and splitting
As identified by Munguía-Realpozo et al,14 handling of missing data was a major limitation of SLE reports. Median imputation and removal of data to use complete cases were common methods. Four reports used multiple imputation by chained equations,25 a more advanced imputation methodology, and six reports used SMOTE24 to address class imbalance with respect to missing data. Some ML models such as extreme gradient boosting (XGBoost)26 are able to address missing data due to built-in imputation functions.
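Median imputation, one of the simpler approaches mentioned above, replaces each missing value with the median of the observed values for that variable. A minimal sketch follows, assuming missing values are encoded as `None`; the function name, column meanings and values are hypothetical.

```python
from statistics import median


def impute_median(rows):
    """Replace None in each column with that column's observed median."""
    n_cols = len(rows[0])
    medians = []
    for c in range(n_cols):
        observed = [r[c] for r in rows if r[c] is not None]
        medians.append(median(observed))
    return [
        [medians[c] if r[c] is None else r[c] for c in range(n_cols)]
        for r in rows
    ]


# Hypothetical columns: [age in years, complement C3 in g/L]
rows = [[34, 0.9], [51, None], [None, 1.1], [28, 0.7]]
imputed = impute_median(rows)
print(imputed)  # [[34, 0.9], [51, 0.9], [34, 1.1], [28, 0.7]]
```

When data are split for model development, the medians would normally be computed on the training portion alone and then applied to the other portions, so that no information leaks from the test data.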
Accurate data labelling is particularly important for diseases that are heterogeneous with a fluctuating and variable disease course, such as SLE. Identification of SLE cases using EMRs may be inaccurate and inefficient as it relies on coding systems such as the International Classification of Diseases (ICD), which historically has poor diagnostic specificity.27 Similarly, identification of SLE-related manifestations may be challenging given the wide range of features and lack of specific administrative codes for different phenotypes of presentation. Some manifestations are difficult to distinguish between primary and secondary features of SLE, for example, NPSLE versus secondary to other conditions (eg, infections or metabolic disturbances). Regardless of whether a model was developed through traditional means or by ML, any errors in data labelling in the preprocessing stage that are then used to train the model will continue to mislabel future cases. To overcome this, one ML SLE study used a technique called ‘noisy labeling’, where the training labels were created using EMR data based on a threshold of multiple ICD-9 codes, followed by model testing against expert clinician-labelled data with good performance metrics.28
Most reports split a single dataset into three groups: training, validation, and a testing set. While this is an acceptable approach to internally validate a model, an external validation dataset with an independent cohort of patients is needed to ensure replicability and generalisability of the model before clinical adoption and to assess the degree of potential model overfitting.29 We discuss external validation separately below.
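A three-way split of a single dataset can be sketched as follows. The 60/20/20 proportions, the seed and the function name are assumptions chosen for illustration; the reviewed reports used a range of proportions.

```python
import random


def three_way_split(items, train=0.6, val=0.2, seed=42):
    """Shuffle once, then cut into training, validation and test sets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * val)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],          # remainder forms the test set
    )


# Hypothetical patient identifiers
patient_ids = list(range(100))
train_ids, val_ids, test_ids = three_way_split(patient_ids)
print(len(train_ids), len(val_ids), len(test_ids))  # 60 20 20
```

Splitting by patient identifier, rather than by visit or sample, keeps all records from one patient in the same subset. All three subsets still come from the same cohort, so this remains internal validation.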
Feature selection and dimension reduction
Feature selection methods were primarily random forest (RF) (n=41), followed by LASSO (n=21), and SVM (n=16). Several reports also used filter methods such as relief-based feature selection (n=7) and mutual information (n=2), which were often performed in reports that used genetic datasets. Dimensionality reduction techniques were applied (n=32), which included principal component analysis (PCA) (n=19), t-distributed stochastic neighbour embedding30 (n=9), and Uniform Manifold Approximation and Projection31 (n=7).
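As an illustration of a filter method, mutual information scores each feature by how much knowing its value reduces uncertainty about the label, independently of any downstream model. The sketch below handles discrete features only and uses made-up data; the function and variable names are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(feature, labels):
    """Mutual information (in bits) between a discrete feature and the label."""
    n = len(labels)
    joint = Counter(zip(feature, labels))
    p_feature = Counter(feature)
    p_label = Counter(labels)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        mi += p_xy * log2(p_xy / ((p_feature[x] / n) * (p_label[y] / n)))
    return mi


# Hypothetical binary data: 1 = SLE, 0 = control
labels = [1, 1, 1, 1, 0, 0, 0, 0]
informative = [1, 1, 1, 1, 0, 0, 0, 0]   # tracks the label perfectly
noise = [1, 0, 1, 0, 1, 0, 1, 0]         # independent of the label

mi_informative = mutual_information(informative, labels)
mi_noise = mutual_information(noise, labels)
print(mi_informative, mi_noise)  # 1.0 0.0
```

Features would then be ranked by score and the highest-ranked retained; continuous features (eg, gene expression values) would need to be binned or handled with a density-based estimator first.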
Model development
Most reports (n=102) developed one or more prediction algorithms. The remaining reports (n=31) focused only on the identification of SLE clusters or features, for example, biomarkers. For supervised models, the most common technique was RF (n=49), followed by SVM (n=42), logistic regression (LR) (n=42), ANNs32 (n=24), XGBoost (n=20), LASSO (n=17), decision trees (n=16), Naïve Bayes33 (n=14), and k-nearest neighbour (n=13). A few reports used a gradient-boosted tree,34 classification and regression tree35 and light gradient-boosting machine.36 For unsupervised models, primarily clustering and dimensionality reduction were performed, for example, hierarchical (n=9) and k-means clustering (n=9).
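To make k-means clustering and the elbow method concrete, the sketch below runs a basic version of the algorithm on one-dimensional toy data for several values of k and records the within-cluster sum of squares (inertia). It is a simplified illustration with a deterministic initialisation chosen for reproducibility; the function name and the data are hypothetical, and real analyses would use multivariable data and an established library.

```python
def kmeans_1d(values, k, iters=50):
    """Basic k-means on 1-D data; returns (centroids, inertia)."""
    data = sorted(values)
    # Deterministic start: spread the initial centroids across the sorted data
    centroids = [data[(2 * i + 1) * len(data) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda j: (v - centroids[j]) ** 2)
            clusters[nearest].append(v)
        # Recompute each centroid as the mean of its assigned points
        updated = [
            sum(c) / len(c) if c else centroids[j]
            for j, c in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    inertia = sum(min((v - c) ** 2 for c in centroids) for v in data)
    return centroids, inertia


# Hypothetical biomarker values forming three visible groups
values = [1.0, 1.2, 0.8, 5.0, 5.3, 4.7, 9.0, 9.1, 8.9]
inertias = {k: kmeans_1d(values, k)[1] for k in range(1, 6)}
for k, inertia in inertias.items():
    print(k, round(inertia, 2))
```

In this toy example the inertia falls steeply up to k=3 and changes little afterwards; that bend is the 'elbow' taken to suggest three clusters.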
There was an increasing number of SLE reports using deep learning methods over time; in this review, 34 such reports were identified. Most of the reports (n=23) included a simple neural network with one or two hidden layers as a comparison between other techniques. As little hyperparameter optimisation was done, these ANNs often were outperformed by models such as RF, SVM and XGBoost. Even with tasks such as natural language processing which commonly use deep learning models like recurrent neural networks (RNN),32 one study found that RF outperformed deep learning models when proper preprocessing and feature selection were performed.37
RNN and its derivatives (long short-term memory (LSTM)38 and gated recurrent unit)39 are typically used in natural language processing, time series data and large image data. In terms of SLE reports, these models were used to analyse EMR data for hospitalisation risk20 40–42 and image data for SLE diagnosis.43 44 However, no report in our review used large language models and attention to text data was uncommon, highlighting the need for more complex models in analysing text data from electronic health records.
Five reports used convolutional neural networks (CNN)45 on image data with topics ranging from NPSLE diagnosis from MRI images,46 diagnosis of SLE retinopathy from funduscopic images,23 diagnosis of cutaneous lupus from lesion images,47 segmentation of staining from lupus nephritis (LN) pathology images48 and segmentation of glomeruli on LN biopsy images.43 Three of the reports used a deep learning technique called Grad-CAM49 that identifies the region of an image that will contribute the most to the final model. As SLE imaging data can be challenging to obtain with large enough numbers for robust ML reports, an ML technique called transfer learning was used to create powerful discriminative models, even with sparse data. Four reports in this review used this method to work with the smaller datasets.23 43 47 48 Liu et al23 posited that transfer learning using diabetic retinopathy funduscopic images could serve as a strong base model for lupus retinopathy prediction as the base model would be more ‘accustomed’ to pathological fundus images. This principle could be applied to other areas of SLE research such as the diagnosis of cutaneous LE through images of other skin lesions from similar but more common diseases including psoriasis.
Model evaluation
The approach taken by most reports was to develop multiple ML models and then select the best model, usually based on the AUC. Other performance metrics including the F-score were not always used or even reported. This is similar to the findings by Munguía-Realpozo et al, where only 21 (46.7%) reported AUC as their main performance metric, seven (15.6%) reported accuracy as their performance metric and the remaining used a combination of performance statistics.14 Five (11.1%) reports in their review did not report any performance metrics.
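Because the AUC is so often the deciding metric, it helps to recall what it measures: the probability that a randomly chosen case receives a higher predicted score than a randomly chosen control, with ties counted as one half. The sketch below computes it directly from that definition using made-up scores; the function name is hypothetical.

```python
def auc_from_scores(scores, labels):
    """AUC as the fraction of case-control pairs ranked correctly."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5      # ties count as half a correct ranking
    return wins / (len(cases) * len(controls))


# Hypothetical predicted probabilities: 1 = SLE, 0 = control
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
labels = [1, 1, 1, 0, 0, 0]
auc = auc_from_scores(scores, labels)
print(round(auc, 3))  # 0.889 (8 of 9 case-control pairs ranked correctly)
```

The AUC does not depend on any single classification threshold, which is one reason to report threshold-dependent metrics such as the F-score alongside it.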
In our review, RF (n=25), SVM (n=16), XGBoost (n=10), LR (n=10) and LASSO (n=7) models were often reported as the best performing models, compared with more complex models like ANNs (n=4), LSTM (n=3) and CNN (n=2). As many of the datasets of the reports included in this review have features on the scale of 100s, we expect that simpler models would perform better compared with gradient boosting and neural networks that require larger datasets, and where performance is enhanced with multiple layers of data. Additionally, models such as RF and LASSO have capabilities for feature importance, which helps with explainability such as identifying important clinical and genetic biomarkers for future research.
External validation
Overfitting of ML models to the training datasets should be evaluated using optimism-adjusted measures. Although these can be approximated using internal validation (eg, data splitting), they are more robustly assessed using external validation from a separate data cohort. This step ensures that the developed ML model is generalisable beyond the collected data alone. Only 15 reports in our review specified that they evaluated their model using an external cohort. External validation is particularly relevant for complex ‘black box’ ML models such as deep learning. In deep learning, the internal processes of the model are usually unknown or ‘hidden’. This makes it difficult to assess whether certain model features could be subject to selection or other biases that may affect the generalisability of the model.50 51
Key SLE findings by ML reports
In our scoping review, ML models were used to elucidate disease pathogenesis (n=31),48 52–81 predict SLE diagnosis and identify cases (n=61),23 28 37 43 44 46 47 53 63 70 73 79 82–130 disease activity and treatment response (n=33),56 59 63 66 69 74 77 78 101 106 113 131–152 complications (n=22)18 21 40 53 83 147 153–168 and healthcare utilisation (n=6)17 20 41 42 142 169 (table 2). Refer to online supplemental table 1 for a glossary of key terms.
Table 2. Key SLE findings in machine learning studies
Type of study | Focus and references | Type of dataset | Types of machine learning models |
Pathogenesis | Endotypes, immune dysregulation, genetic risk48 52–81 | Autoantibodies, clinical, cytokines, demographic, EMR, genetics, images (renal histopathology slides), immune cell types, urinary biomarkers | ANN, CART, CNN, decision tree, ElasticNet, GLM, gradient tree boosting, hierarchical clustering, k-means, KNN, LASSO, LR, Naïve Bayes, PCA, RF, RFE, ridge regression, SHAP, SMOTE, SVM, t-SNE, UMAP, XGBoost, other |
Diagnosis | Differentiate between healthy controls, other autoimmune diseases (rheumatoid arthritis and systemic sclerosis)23 28 37 43 44 47 53 63 70 79 82–121 | Autoantibodies, biopsy, clinical, demographics, EMR, genetics, other immune markers and cell subtypes, other omics, images (MRI, funduscopic images), protein, urinary biomarkers | AdaBoost, ANN, CART, CNN, decision tree, GLM, gradient tree boosting, hierarchical clustering, KNN, k-means, LASSO, LDA, LGB, LR, Naïve Bayes, natural language, noisy labelling, PCA, RF, RFE, ridge regression, RNN, SHAP, SMOTE, SVM, XGBoost, UMAP, other |
Diagnosis | Lupus nephritis85 107 112 117 118 122 | Autoantibodies, biopsy, clinical, cytokines, demographics, EMR, genetics, immune cell types, lipidomics, metabolomics, urinary biomarkers | ANN, CART, decision tree, hierarchical clustering, KNN, LASSO, LDA, LR, Naïve Bayes, RF, RFE, SVM, other |
Diagnosis | Neuropsychiatric SLE including anxiety, depression, cognitive impairment44 46 100 102 109–111 121 123–130 | Autoantibodies, clinical, demographic, immune cells, 1H-MRS images, metabolites, MRI images, proteins | AdaBoost, ANN, CNN, decision tree, ELM, GLM, gradient descent, gradient tree boosting, GRU, hierarchical clustering, k-means, KNN, LASSO, LR, LSTM, Naïve Bayes, natural language, PCA, RF, RFE, ridge regression, RNN, SHAP, SVM, XGBoost, other |
Diagnosis | Cutaneous47 73 | Cytokines, images, immune cells, genetics | CART, CNN, gradient descent, hierarchical clustering, LR, Naïve Bayes, RF, SMOTE, SVM, other |
Disease activity | Renal flares131–136 | Autoantibodies, biopsy features, clinical characteristics, cytokines, demographics, genetics, renal ultrasonic radiomics, urinary biomarkers | ANN, CART, LASSO, LR, RF, RFE, SVM, XGBoost |
Disease activity | Extrarenal flares, SLEDAI score59 63 66 101 106 113 137–148 | Autoantibodies, biometrics data, clinical, demographics, EMR, genetics, immune cell subtypes by PBMC scRNA-seq, patient-reported outcomes, meteorological data, other omics, proteomics, quality of life | Adaptive boosting, ANN, Bayesian network, CART, consensus clustering, decision tree, ElasticNet, GLM, gradient tree boosting, hierarchical clustering, k-means, KNN, LDA, LGB, LR, multivariable ordinal regression, Naïve Bayes, NLP, PCA, ReliefF model, RF, ridge regression, SHAP, SMOTE, SVM, t-SNE, UMAP, XGBoost, other |
Disease activity | Treatment response56 69 74 77 78 113 131 132 134 139 142 149–152 | Autoantibodies, biopsy, clinical, cytokines, demographic, genetics, immune cells, urine | ANN, CART, consensus clustering, decision tree, GLM, hierarchical clustering, k-means, LASSO, LDA, LR, Naïve Bayes, PCA, PLS-DA, ReliefF, SMOTE, SVM, t-SNE, XGBoost |
Disease complications | Organ damage40 | Autoantibodies, clinical, demographics, EMR | ANN, LR, RNN |
Disease complications | Atherosclerosis, cardiovascular events, arrhythmia, heart failure18 153–159 168 | Carotid intima thickness, clinical, demographics, EKG, EMR, genetics, lipids/metabolites | ANN, consensus clustering, decision tree, hierarchical clustering, KNN, LASSO, LDA, LR, multivariate adaptive regression spline, PCA, PLS-DA, RF, SMOTE, SVM |
Disease complications | Antiphospholipid syndrome/thrombosis21 53 | Autoantibodies, clinical, demographic, genetic | Hierarchical clustering, KNN, LASSO, LR, Naïve Bayes, RF |
Disease complications | Adverse pregnancy outcome147 160–162 | Autoantibodies, clinical, demographic, genetic, metabolites | ANN, decision trees, ElasticNet, gradient descent, hierarchical clustering, KNN, LASSO, LDA, LR, PCA, PLS-DA, RFE, RF, super learner, SVM |
Disease complications | Renal transplant163 | Autoantibodies, clinical, demographic, genetic, images | ANN, ‘bestfire’ feature selection, decision tree, ‘genetic search’ feature selection, LR |
Disease complications | Other: hypothyroidism, herpes, breast cancer, joint erosion83 164–167 | Clinical, demographic, immune cells, autoantibodies | ANN, decision tree, hierarchical clustering, KNN, LASSO, LR, RF, RFE, SVM, UMAP, XGBoost, other |
Healthcare resource utilisation | Hospitalisation and costs17 20 41 42 142 169 | Administrative database, clinical, demographics, EMR | ANN, decision tree, ElasticNet, GRU, hierarchical clustering, k-means, KNN, LASSO, LR, LSTM, Naïve Bayes, natural language processing, RF, RFE, RNN, SHAP, SMOTE, XGBoost, other |
ANN, artificial neural network; CART, classification and regression tree; CNN, convolutional neural network; ELM, extreme learning machine; EMR, electronic medical record; GLM, generalised linear model; GRU, gated recurrent units; 1H-MRS, proton magnetic resonance spectroscopy; KNN, k-nearest neighbour; LASSO, least absolute shrinkage and selection operator; LDA, linear discriminant analysis; LGB, light gradient-boosting machine; LR, logistic regression; LSTM, long short-term memory; NLP, natural language processing; PBMC, peripheral blood mononuclear cell; PCA, principal component analysis; PLS-DA, partial least squares discriminant analysis; RF, random forest; RFE, recursive feature elimination; RNN, recurrent neural network; scRNA-seq, single-cell RNA sequencing; SHAP, SHapley Additive exPlanations; SLEDAI, Systemic Lupus Erythematosus Disease Activity Index; SMOTE, Synthetic Minority Oversampling TEchnique; SVM, support vector machine; t-SNE, t-distributed stochastic neighbour embedding; UMAP, Uniform Manifold Approximation and Projection; XGBoost, extreme gradient boosting.
Pathogenesis
Among the reports that examined SLE pathogenesis, many used genetic and RNA-seq datasets. Novel markers identified by these reports include ST8SIA4,57 CMTM4,57 C2CD4B,57 LCK,69 cuproptosis-related genes,72 TNFSF13B,79 OAS1,79 ABCB1,81 CD247,81 DSC1,81 KIR2DL381 and MX2.81 Immune-related biomarkers including autoantibodies, immune cell subtypes and cytokines were also analysed. These were often combined with other clinical features to reveal unique SLE endotypes via cluster analysis.62 63 70 74 75 80 Important immune pathways were identified including extrafollicular B cell involvement,54 DNA methylation,65 expansion of major helper T cell subsets and unique proliferating (Ki-67+) immune cell subsets,76 and signalling lymphocytic activation molecule family receptors on peripheral blood mononuclear cells.64
Diagnostic models
SLE diagnostic models were used to distinguish patients with SLE from healthy controls and from patients with other autoimmune rheumatic diseases (eg, rheumatoid arthritis, Sjögren disease, systemic sclerosis, multiple sclerosis), Kikuchi disease and, for LN reports, other forms of nephropathy23 28 37 43 44 47 53 63 70 79 82–121 (AUC 0.70–0.99). A validated diagnostic algorithm called the SLE Risk Probability Index (SLERPI) was developed using LASSO-LR based on 14 SLE clinical and serological features.84 A SLERPI score of greater than 7 was highly accurate (94.2%) and sensitive for detecting early disease (93.8%) and severe manifestations including kidney (97.9%) and neuropsychiatric involvement (91.8%). There were also specific diagnostic algorithms for LN,85 107 112 117 118 122 NPSLE44 46 100 102 109–111 121 123–130 and cutaneous LE.47 73 Cases of SLE28 98 114 and births from mothers with SLE87 could also be derived from EMR using ML.
Reports using genomic and genetic expression datasets identified several important biomarkers for LN, including C1QA, C1QB, MX1, RORC, CD177, DEFA4 and HERC5.118 For non-renal SLE, FOXP3,88 MX2,106 HLA-DQA1,90 HLA-DQB1,90 HLA-DRB1,90 neutrophil extracellular trap-related genes (HMGB1, ITGB2 and CREB5),70 ABCB1,120 IFI27120 and PLSCR1120 have been reported. Other types of biomarkers included proteomics (IFIT3, MX1, TOMM40, STAT1, STAT2 and OAS3),101 metabolomics,91 lipidomics94 and microRNA profiles.108
For the detection of LN, novel serum biomarkers as a form of ‘liquid biopsy’ included circulating cell-free methylated DNA.117 For NPSLE, different T and B cell subsets predicted depression in patients with SLE.127 128 Proteomics using cerebrospinal fluid demonstrated that CST6, L-selectin, Trappin-2, KLK5 and TCN2 could distinguish NPSLE from SLE controls (non-NPSLE).124 Other reports using single-cell RNA sequencing data compared biomarkers for NPSLE to multiple sclerosis103 and vascular dementia.83 To differentiate cutaneous LE from other dermatological disorders such as psoriasis, eczema, atopic dermatitis and systemic sclerosis (RF model, AUC 0.774–0.990), interferon gene signature, tumour necrosis factor, interleukin-23 (IL-23), interferon (IFN), IL-12, and immune cell-related genetic signatures were selected as important biomarkers.73
A variety of images were analysed using ML including brain MRI (functional MRI, cerebral perfusion, multivoxel proton magnetic resonance spectroscopy) for the detection of NPSLE,44 46 100 102 109–111 121 129 130 funduscopic images for lupus retinopathy23 and clinical images of the skin for acute cutaneous LE, subcutaneous LE and discoid LE.47
Disease activity and treatment response
For predicting renal flares,131–135 152 the best performing models contained both traditional clinical data and novel urine biomarkers, including cytokines, chemokines and/or markers of kidney damage. The best models for predicting renal flares from these studies included XGBoost and ANN (AUCs 0.70–0.94). Quantitative data extracted from renal ultrasound based on features such as texture, shape and wavelength could also detect LN activity.133 Novel biomarkers for LN activity include renal IFI16135 and V-set immunoglobulin domain-containing protein 4.136
For extrarenal flares,59 63 66 101 106 113 137–148 approximately half of the reports used genetic or genetic expression datasets. Novel biomarkers predicting SLE flares or increased disease activity include MX2106 and M1143 gene expression and a nine-protein combination (PHACTR2, GOT2, L-selectin, CMC4, MAP2K1, CMPK2, ECPAS, SRA1 and STAT2).101 The AUC of the best models in these reports ranged from 0.70 to 0.99. One study also demonstrated that Systemic Lupus Erythematosus Disease Activity Index score can be estimated from unstructured clinical notes.137
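Because AUC is the performance measure quoted throughout these reports, a brief illustration of what it measures may help. The sketch below computes AUC as the probability that a randomly chosen patient who flared received a higher predicted risk than a randomly chosen patient who did not, with ties counted as one half. The function name and toy values are assumptions for illustration only and are not taken from any cited study.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC via pairwise comparison of positive (1) and negative (0) cases."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0  # positive case ranked above negative case
            elif p == n:
                wins += 0.5  # tie counts as half
    return wins / (len(positives) * len(negatives))


# Toy predicted flare risks (assumption): 1 = flared, 0 = did not flare.
toy_risk = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
toy_flare = [1, 1, 1, 0, 0, 0]
auc = roc_auc(toy_risk, toy_flare)
print(round(auc, 3))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so the reported range of 0.70 to 0.99 spans modest to near-perfect discrimination within each study's own dataset.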
Treatment response was predicted with a high degree of accuracy in some reports, with renal flares being the most commonly evaluated outcome.69 131 132 134 149 152 Clinical factors identified using feature importance ML models included C3, C4, age, race, sex, anti-dsDNA, baseline estimated glomerular filtration rate, urine protein-to-creatinine ratio as well as cytokine/protein factors such as CXCL8, pentraxin, adiponectin, MCP1, IL-8, IL-1α, IL-12, IL-6, IFNα2 and IFNγ.131 152 The top performing predictive models for treatment response used a simple neural network (AUC 0.9735)134 and an RF model (AUC 0.92).131 Predictors of disease remission (SVM, AUC 0.713)139 and response to B cell therapies (RF, AUC 0.88)77 were examined as well. Lastly, cluster analysis by k-means and consensus clustering to identify different SLE endotypes based on treatment response revealed a wide range of results; for example, the number of reported clusters ranged from 3 to 39.56 74 78 113 142 150 In our own study of 805 patients with SLE from the Systemic Lupus International Collaborating Clinics (SLICC) cohort, k-means clustering on PCA-transformed longitudinal autoantibody profiles over the first 5 years of disease revealed four distinct endotypes that were predictive of long-term disease activity, organ involvement, treatment requirements, and mortality risk.56
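To illustrate the clustering step in general terms, the following is a minimal k-means sketch in plain Python. It is not the pipeline used in any cited study: the PCA step is omitted, the two-dimensional points stand in for hypothetical component scores per patient, and the initial centroids are fixed by hand so that the result is reproducible.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(
    points: Sequence[Point], initial_centroids: Sequence[Point], max_iter: int = 100
) -> Tuple[List[int], List[Point]]:
    """Alternate assignment and centroid update until assignments stop changing."""
    centroids = [tuple(c) for c in initial_centroids]
    assignments: List[int] = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_assignments = [
            min(range(len(centroids)), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # converged
        assignments = new_assignments
        # Update step: each centroid moves to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:
                centroids[j] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return assignments, centroids


# Toy component scores (assumption): two well-separated groups of patients.
toy_points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
cluster_ids, centres = k_means(toy_points, [toy_points[0], toy_points[3]])
print(cluster_ids)
```

The number of clusters is an input to k-means rather than an output, which is one reason the number of endotypes reported across studies can vary so widely.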
Prognostic models
For the prediction of SLE outcomes, ML has been used to predict disease damage (RNN, AUC 0.77).40 Prediction of cardiovascular disease (atherosclerosis, cardiovascular events, arrhythmia and heart failure), a major cause of mortality in SLE, has been evaluated.153–159 168 Novel lipoprotein metabolites and deficiency in vitamin D were associated with atherosclerosis.153 157 158 Several candidate hub genes (SPI1, MMP9, C1QA, CX3CR1, MNDA) could predict the risk of atherosclerosis in SLE, and expression of the CCR7, RNASE2, RNASE3 and CXCL10 genes could predict the risk of heart failure. The AUCs ranged from 0.81 to 0.98 for the various models.154 A prediction score called SLE-venous thromboembolism (VTE) could predict VTE risk in patients with SLE (LR, AUC 0.808) based on 11 variables: sex, age, body mass index, hyperlipidaemia, hypoalbuminaemia, C reactive protein, anti-β2-glycoprotein I antibodies, lupus anticoagulant, renal involvement, nervous system involvement and hydroxychloroquine use.21 A prediction model for 3-year allograft survival in kidney transplant recipients with SLE has also been developed (LR and ANN, AUC 0.73) using recipient age, race, maintenance regimen including prednisone, maintenance regimen, predominant renal replacement modality in the pretransplant period, and whether dialysis was required during the first post-transplant week.163
Adverse pregnancy outcomes in patients with SLE were examined using different datasets. SLE activity was predicted in pregnant women (ElasticNet, AUC 0.978) using serum metabolites (glucose, alanine, acetoacetic acid and alpha-ketoisovalerate levels).147 Potential genetic biomarkers identified with ML for predicting adverse pregnancy outcomes during early and mid-pregnancy in patients with SLE are SEZ6, NRAD1 and LPAR4.160 There were also prediction models that used only routinely available clinical variables (eg, levels of alanine transaminase, alkaline phosphatase, lactate dehydrogenase, gamma-glutamyl transferase, erythrocytes, C3, C4, autoantibodies as well as maternal age, smoking status, hydroxychloroquine use and disease duration) (super learning, AUC 0.78; RF, AUC 0.917).161 162
Other outcomes for patients with SLE included reduced risk of breast cancer with the presence of prognostic genetic biomarkers (ie, IRF7, IFI35 and EIF2AK2 gene expression) identified with LASSO.165 Models for the prediction of joint erosions (LR, AUC 0.806),164 herpes infection (RF, AUC 0.942)167 and hypothyroidism (RF, AUC 0.772)166 have also been developed using clinical and serological data. Among the selected features for these models, autoantibodies were found to be important predictors, for example, anti-carbamylated protein and anti-citrullinated protein antibodies for joint erosion164 and anti-dsDNA and anti-SSB/La for hypothyroidism.167 ML models showed promise in predicting the risk of hospitalisation and length of stay from EMR data (best performing models LSTM and XGBoost, AUC 0.88)20 41 42 142 169 and associated healthcare costs from administrative databases.17 142
Future considerations
AI applications have become ubiquitous in medicine, and their impact on SLE care and research is no exception.170 The range of AI applications and utilisation in SLE is expected to grow. Thus far, considerable work in SLE has been focused on developing ML models to predict disease, diagnosis and prognosis. Other applications of AI in SLE including drug discovery, clinical trial design and interpretation,171 diagnostic imaging analysis, personalised medicine and medical devices and technologies are just beginning. Increased availability and access to other types of data in the future will provide even more opportunities for SLE research. ML approaches in SLE may even make use of health data collected from mobile phones, wearable devices, social media and environmental datasets, which are becoming more popular in health research. Integration of more advanced ML methods in future reports will also allow for more efficient analysis of increasingly large and complex datasets. As discussed, there is already evidence of this trend with increased utilisation of deep learning and natural language processing approaches in SLE.
While AI facilitates discoveries that may improve patient outcomes and processes in the healthcare system, researchers should also be aware of the ethical, governance and regulatory considerations, including patient consent, confidentiality, transparency and privacy172 173 (figure 2). In 2019, the European League Against Rheumatism published recommendations that guide researchers on the collection, analysis, interpretation and implementation of big data through AI/ML.174 While these are not discussed in detail in this review, we emphasise that these issues can arise at any step of the ML pipeline. For instance, during data collection, diverse data sources increasingly used by ML approaches (eg, EMR, administrative databases, social media, genetic or other multi-omics datasets, clinical trials and microbiome) are prone to potential sampling biases. These can exacerbate existing disparities in marginalised and underserved populations and violate the bioethical principles of justice. SLE is a disease that disproportionately affects racial and ethnic minorities and is therefore more sensitive to these issues (reviewed in ref 175). The lack of representation by minority populations in clinical research, genetic reports176 and clinical trials177 is a real concern. More work is needed to study how we can address these issues, minimise harm and promote ethical ML models in the future.
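One practical way to surface the effect of sampling bias is to report model performance separately for each population subgroup rather than only in aggregate. The sketch below computes sensitivity per subgroup from a list of records; the function name, group labels and toy records are hypothetical and serve only to show the calculation.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Each record: (subgroup label, truly has condition, model predicted positive)
Record = Tuple[str, bool, bool]


def sensitivity_by_group(records: Iterable[Record]) -> Dict[str, float]:
    """Fraction of true cases detected by the model within each subgroup."""
    true_pos: Dict[str, int] = defaultdict(int)
    condition_pos: Dict[str, int] = defaultdict(int)
    for group, truth, predicted in records:
        if truth:
            condition_pos[group] += 1
            if predicted:
                true_pos[group] += 1
    return {group: true_pos[group] / n for group, n in condition_pos.items()}


# Toy records (assumption): group B is under-detected relative to group A.
toy_records = [
    ("A", True, True), ("A", True, True), ("A", True, True), ("A", True, False),
    ("B", True, True), ("B", True, False), ("B", True, False), ("B", True, False),
    ("A", False, False), ("B", False, False),
]
per_group = sensitivity_by_group(toy_records)
print(per_group)
```

A large gap between subgroups, as in this toy example, would suggest that the training data under-represent one population and that the model should not be deployed without further validation in that group.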
Ethics statements
Patient consent for publication
Not applicable.
Correction notice This article has been corrected since it was published. The provenance and peer review statement in the paper has been corrected.
Contributors All authors were involved in the concept and design, data analysis and interpretation, and editing for intellectual content. KZ, KAB, IYC, MJF and MYC were involved in manuscript drafting.
Funding Support for this study also came from the Lupus Foundation of America.
Competing interests MJF is a consultant to and has received honoraria and/or travel support from Werfen (Barcelona, Spain; San Diego, California). MJF is also Medical Director of Mitogen Diagnostics. MYC has received consulting fees from Celltrion, Mallinckrodt Pharmaceuticals, Werfen, Organon, AstraZeneca, and MitogenDx.
Provenance and peer review Commissioned; externally peer reviewed.
1 Kingsmore KM, Puglisi CE, Grammer AC, et al. An introduction to machine learning and analysis of its use in rheumatic diseases. Nat Rev Rheumatol 2021; 17: 710–30. doi:10.1038/s41584-021-00708-w
2 Hügle M, Omoumi P, van Laar JM, et al. Applied machine learning and artificial intelligence in rheumatology. Rheumatol Adv Pract 2020; 4: rkaa005. doi:10.1093/rap/rkaa005
3 Kim K-J, Tagkopoulos I. Application of machine learning in rheumatic disease research. Korean J Intern Med 2019; 34: 708–22. doi:10.3904/kjim.2018.349
4 Stoel B. Use of artificial intelligence in imaging in rheumatology–current status and future perspectives. RMD Open 2020; 6: e001063. doi:10.1136/rmdopen-2019-001063
5 Jiang M, Li Y, Jiang C, et al. Machine learning in rheumatic diseases. Clin Rev Allergy Immunol 2021; 60: 96–110. doi:10.1007/s12016-020-08805-6
6 Thorndike RL. Who belongs in the family? Psychometrika 1953; 18: 267–76. doi:10.1007/BF02289263
7 Pelleg D, Moore AW. X-means: extending K-means with efficient estimation of the number of clusters. ICML 2000: 727–34.
8 Tibshirani R, Walther G, Hastie T. Estimating the number of clusters in a data set via the gap statistic. J R Stat Soc Series B Stat Methodol 2001; 63: 411–23. doi:10.1111/1467-9868.00293
9 Li J, Cheng K, Wang S, et al. Feature selection: a data perspective. ACM Comput Surv 2017; 50: 1–45. doi:10.1145/3136625
10 Díaz-Uriarte R, Alvarez de Andrés S. Gene selection and classification of microarray data using random forest. BMC Bioinformatics 2006; 7: 3. doi:10.1186/1471-2105-7-3
11 Tibshirani R. Regression shrinkage and selection via the lasso. J R Stat Soc Series B Methodol 1996; 58: 267–88. doi:10.1111/j.2517-6161.1996.tb02080.x Available: https://rss.onlinelibrary.wiley.com/toc/25176161/58/1
12 Burges CJC. A tutorial on support vector machines for pattern recognition. Data Min Knowl Discov 1998; 2: 121–67. doi:10.1023/A:1009715923555
13 Stafford IS, Kellermann M, Mossotto E, et al. A systematic review of the applications of artificial intelligence and machine learning in autoimmune diseases. NPJ Digit Med 2020; 3: 30. doi:10.1038/s41746-020-0229-3
14 Munguía-Realpozo P, Etchegaray-Morales I, Mendoza-Pinto C, et al. Current state and completeness of reporting clinical prediction models using machine learning in systemic lupus erythematosus: a systematic review. Autoimmun Rev 2023; 22: 103294. doi:10.1016/j.autrev.2023.103294
15 Collins GS, Reitsma JB, Altman DG, et al. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement. Circulation 2015; 131: 211–9. doi:10.1161/CIRCULATIONAHA.114.014508
16 Choi MY, Ma C. Making a big impact with small datasets using machine-learning approaches. Lancet Rheumatol 2020; 2: e451–2. doi:10.1016/S2665-9913(20)30217-4
17 Castro-Villarreal S, Beltran-Ostos A, Valencia CF. Estimation of prevalence and incremental costs of systemic lupus erythematosus in a middle-income country using machine learning on administrative health data. Value Health Reg Issues 2021; 26: 98–104. doi:10.1016/j.vhri.2021.04.005
18 Cohen IV, Makunts T, Moumedjian T, et al. Cardiac adverse events associated with chloroquine and hydroxychloroquine exposure in 20 years of drug safety surveillance reports. Sci Rep 2020; 10: 19199. doi:10.1038/s41598-020-76258-0
19 Grovu R, Huo Y, Nguyen A, et al. Machine learning: predicting hospital length of stay in patients admitted for lupus flares. Lupus 2023; 32: 1418–29. doi:10.1177/09612033231206830
20 Reddy BK, Delen D. Predicting hospital readmission for lupus patients: an RNN-LSTM-based deep-learning methodology. Comput Biol Med 2018; 101: 199–209. doi:10.1016/j.compbiomed.2018.08.029
21 You H, Zhao J, Zhang M, et al. Development and external validation of a prediction model for venous thromboembolism in systemic lupus erythematosus. RMD Open 2023; 9: e003568. doi:10.1136/rmdopen-2023-003568
22 Goodfellow I, Pouget-Abadie J, Mirza M, et al. Generative adversarial nets. Adv Neural Inf Process Syst 2014: 27.
23 Liu R, Wang T, Li H, et al. TMM-nets: transferred multi- to mono-modal generation for lupus retinopathy diagnosis. IEEE Trans Med Imaging 2023; 42: 1083–94. doi:10.1109/TMI.2022.3223683
24 Chawla NV, Bowyer KW, Hall LO, et al. SMOTE: synthetic minority over-sampling technique. J Artif Intell Res 2002; 16: 321–57. doi:10.1613/jair.953
25 Azur MJ, Stuart EA, Frangakis C, et al. Multiple imputation by chained equations: what is it and how does it work? Int J Methods Psychiatr Res 2011; 20: 40–9. doi:10.1002/mpr.329
26 Chen T, Guestrin C. XGBoost: a scalable tree boosting system. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; 2016: 785–94. doi:10.1145/2939672.2939785
27 Moores KG, Sathe NA. A systematic review of validated methods for identifying systemic lupus erythematosus (SLE) using administrative or claims data. Vaccine 2013: K62–73. doi:10.1016/j.vaccine.2013.06.104
28 Murray SG, Avati A, Schmajuk G, et al. Automated and flexible identification of complex disease: building a model for systemic lupus erythematosus using noisy labeling. J Am Med Inform Assoc 2019; 26: 61–5. doi:10.1093/jamia/ocy154
29 Austin PC, Steyerberg EW. Events per variable (EPV) and the relative performance of different strategies for estimating the out-of-sample validity of logistic regression models. Stat Methods Med Res 2017; 26: 796–808. doi:10.1177/0962280214558972
30 van der Maaten L, Hinton G. Visualizing data using t-SNE. J Mach Learn Res 2008; 9: 2579–605.
31 McInnes L, Healy J, Saul N, et al. UMAP: uniform manifold approximation and projection for dimension reduction. J Open Source Softw 2018; 3: 861. doi:10.21105/joss.00861
32 Rumelhart DE, Hinton GE, Williams RJ. Learning representations by back-propagating errors. Nature 1986; 323: 533–6. doi:10.1038/323533a0
33 Hart PE, Stork DG, Duda RO. Pattern classification. Hoboken: Wiley, 2000.
34 Friedman JH. Greedy function approximation: a gradient boosting machine. Ann Statist 2001; 29: 1189–232. doi:10.1214/aos/1013203451
35 Breiman L, Friedman JH, Olshen RA, et al. Classification and regression trees. Belmont, CA: Wadsworth International Group, 1984.
36 Ke G, Meng Q, Finley T, et al. LightGBM: a highly efficient gradient boosting decision tree. Adv Neural Inf Process Syst 2017: 30.
37 Ma W, Lau YL, Yang W, et al. Random forests algorithm boosts genetic risk prediction of systemic lupus erythematosus. Front Genet 2022; 13: 902793. doi:10.3389/fgene.2022.902793
38 Hochreiter S, Schmidhuber J. Long short-term memory. Neural Comput 1997; 9: 1735–80. doi:10.1162/neco.1997.9.8.1735
39 Cho K, van Merrienboer B, Gulcehre C, et al. Learning phrase representations using RNN encoder–decoder for statistical machine translation. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP); Doha, Qatar. Stroudsburg, PA, USA: Association for Computational Linguistics, 2014. doi:10.3115/v1/D14-1179
40 Ceccarelli F, Sciandrone M, Perricone C, et al. Prediction of chronic damage in systemic lupus erythematosus by using machine-learning models. PLoS One 2017; 12: e0174200. doi:10.1371/journal.pone.0174200
41 Zhao Y, Smith D, Jorge A. Comparing two machine learning approaches in predicting lupus hospitalization using longitudinal data. Sci Rep 2022; 12: 16424. doi:10.1038/s41598-022-20845-w
42 Jorge AM, Smith D, Wu Z, et al. Exploration of machine learning methods to predict systemic lupus erythematosus hospitalizations. Lupus 2022; 31: 1296–305. doi:10.1177/09612033221114805
43 Yang C-K, Lee C-Y, Wang H-S, et al. Glomerular disease classification and lesion identification by machine learning. Biomed J 2022; 45: 675–85. doi:10.1016/j.bj.2021.08.011
44 Yuan Y, Quan T, Song Y, et al. Noise-immune extreme ensemble learning for early diagnosis of neuropsychiatric systemic lupus erythematosus. IEEE J Biomed Health Inform 2022; 26: 3495–506. doi:10.1109/JBHI.2022.3164937
45 Krizhevsky A, Sutskever I, Hinton GE. ImageNet classification with deep convolutional neural networks. Commun ACM 2017; 60: 84–90. doi:10.1145/3065386
46 Inglese F, Kim M, Steup-Beekman GM, et al. MRI-based classification of neuropsychiatric systemic lupus erythematosus patients with self-supervised contrastive learning. Front Neurosci 2022; 16: 695888. doi:10.3389/fnins.2022.695888
47 Wu H, Yin H, Chen H, et al. A deep learning-based smartphone platform for cutaneous lupus erythematosus classification assistance: simplifying the diagnosis of complicated diseases. J Am Acad Dermatol 2021; 85: 792–3. doi:10.1016/j.jaad.2021.02.043
48 Kinloch AJ, Asano Y, Mohsin A, et al. Machine learning to quantify in situ humoral selection in human lupus tubulointerstitial inflammation. Front Immunol 2020; 11: 593177. doi:10.3389/fimmu.2020.593177
49 Selvaraju RR, Cogswell M, Das A, et al. Grad-CAM: visual explanations from deep networks via gradient-based localization. 2017 IEEE International Conference on Computer Vision (ICCV); Venice, 2017: 618–26. doi:10.1109/ICCV.2017.74
50 Savage N. Breaking into the black box of artificial intelligence. Nature March 29, 2022. doi:10.1038/d41586-022-00858-1
51 Yu AC, Eng J. One algorithm may not fit all: how selection bias affects machine learning performance. Radiographics 2020; 40: 1932–7. doi:10.1148/rg.2020200040
52 Arabnejad M, Montgomery CG, Gaffney PM, et al. Nearest-neighbor projected distance regression for epistasis detection in GWAS with population structure correction. Front Genet 2020; 11: 784. doi:10.3389/fgene.2020.00784
53 Armañanzas R, Calvo B, Inza I, et al. Microarray analysis of autoimmune diseases by machine learning procedures. IEEE Trans Inf Technol Biomed 2009; 13: 341–50. doi:10.1109/TITB.2008.2011984
54 Baxter RM, Wang CS, Garcia-Perez JE, et al. Expansion of extrafollicular B and T cell subsets in childhood-onset systemic lupus erythematosus. Front Immunol 2023; 14: 1208282. doi:10.3389/fimmu.2023.1208282
55 Catalina MD, Bachali P, Yeo AE, et al. Patient ancestry significantly contributes to molecular heterogeneity of systemic lupus erythematosus. JCI Insight 2020; 5: e140380. doi:10.1172/jci.insight.140380
56 Choi MY, Chen I, Clarke AE, et al. Machine learning identifies clusters of longitudinal autoantibody profiles predictive of systemic lupus erythematosus disease outcomes. Ann Rheum Dis 2023; 82: 927–36. doi:10.1136/ard-2022-223808
57 Davis NA, Lareau CA, White BC, et al. Encore: genetic association interaction network centrality pipeline and application to SLE exome data. Genet Epidemiol 2013; 37: 614–21. doi:10.1002/gepi.21739
58 Devaprasad A, Radstake TRDJ, Pandit A. Integration of immunome with disease-gene network reveals common cellular mechanisms between IMIDs and drug repurposing strategies. Front Immunol 2021; 12: 669400. doi:10.3389/fimmu.2021.669400
59 Falk I, Zhao M, Nait Saada J, et al. Learning the kernel for rare variant genetic association test. Front Genet 2023; 14: 1245238. doi:10.3389/fgene.2023.1245238
60 Figgett WA, Monaghan K, Ng M, et al. Machine learning applied to whole-blood RNA-sequencing data uncovers distinct subsets of patients with systemic lupus erythematosus. Clin Transl Immunology 2019; 8: e01093. doi:10.1002/cti2.1093
61 Foulquier N, Le Dantec C, Bettacchioli E, et al. Machine learning for the identification of a common signature for anti-SSA/Ro 60 antibody expression across autoimmune diseases. Arthritis Rheumatol 2022; 74: 1706–19. doi:10.1002/art.42243
62 Guthridge JM, Lu R, Tran LT-H, et al. Adults with systemic lupus exhibit distinct molecular phenotypes in a cross-sectional study. EClinicalMedicine 2020; 20: 100291. doi:10.1016/j.eclinm.2020.100291
63 Hubbard EL, Bachali P, Kingsmore KM, et al. Analysis of transcriptomic features reveals molecular endotypes of SLE with clinical implications. Genome Med 2023; 15: 84. doi:10.1186/s13073-023-01237-9
64 Humbel M, Bellanger F, Horisberger A, et al. SLAMF receptor expression identifies an immune signature that characterizes systemic lupus erythematosus. Front Immunol 2022; 13: 843059. doi:10.3389/fimmu.2022.843059
65 Imgenberg-Kreuz J, Almlöf JC, Leonard D, et al. Shared and unique patterns of DNA methylation in systemic lupus erythematosus and primary Sjögren’s syndrome. Front Immunol 2019; 10: 1686. doi:10.3389/fimmu.2019.01686
66 Kegerreis B, Catalina MD, Bachali P, et al. Machine learning approaches to predict lupus disease activity from gene expression data. Sci Rep 2019; 9: 9617. doi:10.1038/s41598-019-45989-0
67 Kimura T, Ikeuchi H, Yoshino M, et al. Profiling of kidney involvement in systemic lupus erythematosus by deep learning using the national database of designated incurable diseases of Japan. Clin Exp Nephrol 2023; 27: 519–27. doi:10.1007/s10157-023-02337-x
68 Le TT, Blackwood NO, Taroni JN, et al. Integrated machine learning pipeline for aberrant biomarker enrichment (I-mAB): characterizing clusters of differentiation within a compendium of systemic lupus erythematosus patients. AMIA Annu Symp Proc 2018; 2018: 1358–67.
69 Lee DJ, Tsai PH, Chen CC, et al. Incorporating knowledge of disease-defining hub genes and regulatory network into a machine learning-based model for predicting treatment response in lupus nephritis after the first renal flare. J Transl Med 2023; 21: 76. doi:10.1186/s12967-023-03931-z
70 Li H, Zhang X, Shang J, et al. Identification of NETs-related biomarkers and molecular clusters in systemic lupus erythematosus. Front Immunol 2023; 14: 1150828. doi:10.3389/fimmu.2023.1150828
71 Li H, Zhou J, Zhou L, et al. Identification of the shared gene signatures and molecular pathways in systemic lupus erythematosus and diffuse large B-cell lymphoma. J Gene Med 2023; 25: e3558. doi:10.1002/jgm.3558
72 Li W, Guan X, Wang Y, et al. Cuproptosis-related gene identification and immune infiltration analysis in systemic lupus erythematosus. Front Immunol 2023; 14: 1157196. doi:10.3389/fimmu.2023.1157196
73 Martínez BA, Shrotri S, Kingsmore KM, et al. Machine learning reveals distinct gene signature profiles in lesional and nonlesional regions of inflammatory skin diseases. Sci Adv 2022; 8: eabn4776. doi:10.1126/sciadv.abn4776
74 Qiao J, Zhang S-X, Chang M-J, et al. Deep stratification by transcriptome molecular characters for precision treatment of patients with systemic lupus erythematosus. Rheumatology (Oxford) 2023; 62: 2574–84. doi:10.1093/rheumatology/keac625
75 Robinson GA, Peng J, Dönnes P, et al. Disease-associated and patient-specific immune cell signatures in juvenile-onset systemic lupus erythematosus: patient stratification using a machine-learning approach. Lancet Rheumatol 2020; 2: e485–96. doi:10.1016/S2665-9913(20)30168-5
76 Sasaki T, Bracero S, Keegan J, et al. Longitudinal immune cell profiling in patients with early systemic lupus erythematosus. Arthritis Rheumatol 2022; 74: 1808–21. doi:10.1002/art.42248
77 Shipa M, Santos LR, Nguyen DX, et al. Identification of biomarkers to stratify response to B-cell-targeted therapies in systemic lupus erythematosus: an exploratory analysis of a randomised controlled trial. Lancet Rheumatol 2023; 5: e24–35. doi:10.1016/S2665-9913(22)00332-0
78 Toro-Domínguez D, Lopez-Domínguez R, García Moreno A, et al. Differential treatments based on drug-induced gene expression signatures and longitudinal systemic lupus erythematosus stratification. Sci Rep 2019; 9: 15502. doi:10.1038/s41598-019-51616-9
79 Wang Y, Huang Z, Xiao Y, et al. The shared biomarkers and pathways of systemic lupus erythematosus and metabolic syndrome analyzed by bioinformatics combining machine learning algorithm and single-cell sequencing analysis. Front Immunol 2022; 13: 1015882. doi:10.3389/fimmu.2022.1015882
80 Yones SA, Annett A, Stoll P, et al. Interpretable machine learning identifies paediatric systemic lupus erythematosus subtypes based on gene expression data. Sci Rep 2022; 12: 7433. doi:10.1038/s41598-022-10853-1
81 Zhao X, Duan L, Cui D, et al. Exploration of biomarkers for systemic lupus erythematosus by machine-learning analysis. BMC Immunol 2023; 24: 44. doi:10.1186/s12865-023-00581-0
82 Du J, Huang H, Pang L, et al. A machine learning model for identifying systemic lupus erythematosus through laboratory information system and electronic medical record. Clin Exp Rheumatol November 15, 2023. doi:10.55563/clinexprheumatol/jvdrpc
83 Chen J, Zhao X, Huang C, et al. Novel insights into molecular signatures and pathogenic cell populations shared by systemic lupus erythematosus and vascular dementia. Funct Integr Genomics 2023; 23: 337. doi:10.1007/s10142-023-01270-2
84 Adamichou C, Genitsaridi I, Nikolopoulos D, et al. Lupus or not? SLE risk probability index (SLERPI): a simple, clinician-friendly machine learning-based model to assist the diagnosis of systemic lupus erythematosus. Ann Rheum Dis 2021; 80: 758–66. doi:10.1136/annrheumdis-2020-219069
85 Agar JW, Webb GI. Application of machine learning to a renal biopsy database. Nephrol Dial Transplant 1992; 7: 472–8.
86 AlShareedah A, Zidoum H, Al-Sawafi S, et al. Machine learning approach for predicting systemic lupus erythematosus in an Oman-based cohort. Sultan Qaboos Univ Med J 2023; 23: 328–35. doi:10.18295/squmj.12.2022.069
87 Barnado A, Eudy AM, Blaske A, et al. Developing and validating methods to assemble systemic lupus erythematosus births in the electronic health record. Arthritis Care Res (Hoboken) 2022; 74: 849–57. doi:10.1002/acr.24522
88 Birjan Z, Khashei Varnamkhasti K, Parhoudeh S, et al. Crucial role of Foxp3 gene expression and mutation in systemic lupus erythematosus inferred from computational and experimental approaches. Diagnostics 2023; 13: 3442. doi:10.3390/diagnostics13223442
89 Ceccarelli F, Lapucci M, Olivieri G, et al. Can machine learning models support physicians in systemic lupus erythematosus diagnosis? Results from a monocentric cohort. Joint Bone Spine 2022; 89: 105292. doi:10.1016/j.jbspin.2021.105292
90 Chung C-W, Hsiao T-H, Huang C-J, et al. Machine learning approaches for the genomic prediction of rheumatoid arthritis and systemic lupus erythematosus. BioData Min 2021; 14: 52. doi:10.1186/s13040-021-00284-5
91 Du Q, Wang X, Chen J, et al. Machine learning encodes urine and serum metabolic patterns for autoimmune disease discrimination, classification and metabolic dysregulation analysis. Analyst 2023; 148: 4318–30. doi:10.1039/d3an01051a
92 Guy RT, Santago P, Langefeld CD. Bootstrap aggregating of alternating decision trees to detect sets of SNPs that associate with disease. Genet Epidemiol 2012; 36: 99–106. doi:10.1002/gepi.21608
93 Han Y, Jin Z, Ma L, et al. Development of clinical decision models for the prediction of systemic lupus erythematosus and Sjögren’s syndrome overlap. J Clin Med 2023; 12: 535. doi:10.3390/jcm12020535
94 He J, Ma C, Tang D, et al. Absolute quantification and characterization of oxylipins in lupus nephritis and systemic lupus erythematosus. Front Immunol 2022; 13: 964901. doi:10.3389/fimmu.2022.964901
95 Huang Z, Shi Y, Cai B, et al. MALDI-TOF MS combined with magnetic beads for detecting serum protein biomarkers and establishment of boosting decision tree model for diagnosis of systemic lupus erythematosus. Rheumatology (Oxford) 2009; 48: 626–31. doi:10.1093/rheumatology/kep058
96 Huang Z, Shi Y, Cai B, et al. Promising diagnostic model for systemic lupus erythematosus using proteomic fingerprint technology. Sichuan Da Xue Xue Bao Yi Xue Ban 2009; 40: 499–503.
97 Jiang Z, Shao M, Dai X, et al. Identification of diagnostic biomarkers in systemic lupus erythematosus based on bioinformatics analysis and machine learning. Front Genet 2022; 13: 865559. doi:10.3389/fgene.2022.865559
98 Jorge A, Castro VM, Barnado A, et al. Identifying lupus patients in electronic health records: development and validation of machine learning algorithms and application of rule-based algorithms. Semin Arthritis Rheum 2019; 49: 84–90. doi:10.1016/j.semarthrit.2019.01.002
99 Leventhal EL, Daamen AR, Grammer AC, et al. An interpretable machine learning pipeline based on transcriptomics predicts phenotypes of lupus patients. iScience 2023; 26: 108042. doi:10.1016/j.isci.2023.108042
100 Li Y, Ge Z, Zhang Z, et al. Broad learning enhanced 1H-MRS for early diagnosis of neuropsychiatric systemic lupus erythematosus. Comput Math Methods Med 2020; 2020: 1–13. doi:10.1155/2020/8874521
101 Li Y, Ma C, Liao S, et al. Combined proteomics and single cell RNA-sequencing analysis to identify biomarkers of disease diagnosis and disease exacerbation for systemic lupus erythematosus. Front Immunol 2022; 13: 969509. doi:10.3389/fimmu.2022.969509
102 Luo X, Piao S, Li H, et al. Multi-lesion radiomics model for discrimination of relapsing-remitting multiple sclerosis and neuropsychiatric systemic lupus erythematosus. Eur Radiol 2022; 32: 5700–10. doi:10.1007/s00330-022-08653-2
103 Ma Y, Chen J, Wang T, et al. Accurate machine learning model to diagnose chronic autoimmune diseases utilizing information from B cells and monocytes. Front Immunol 2022; 13: 870531. doi:10.3389/fimmu.2022.870531
104 Martin-Gutierrez L, Peng J, Thompson NL, et al. Stratification of patients with Sjogren’s syndrome and patients with systemic lupus erythematosus according to two shared immune cell signatures, with potential therapeutic implications. Arthritis Rheumatol 2021; 73: 1626–37. doi:10.1002/art.41708
105 Martorell-Marugán J, Chierici M, Jurman G, et al. Differential diagnosis of systemic lupus erythematosus and Sjogren’s syndrome using machine learning and multi-omics data. Comput Biol Med 2023; 152: 106373. doi:10.1016/j.compbiomed.2022.106373
106 Meng XW, Cheng ZL, Lu ZY, et al. MX2: identification and systematic mechanistic analysis of a novel immune-related biomarker for systemic lupus erythematosus. Front Immunol 2022; 13: 978851. doi:10.3389/fimmu.2022.978851
107 Mondal S, Singh MP, Kumar A, et al. Rapid molecular evaluation of human kidney tissue sections by in situ mass spectrometry and machine learning to classify the nephrotic syndrome. J Proteome Res 2023; 22: 967–76. doi:10.1021/acs.jproteome.2c00768
108 Ormseth MJ, Solus JF, Sheng Q, et al. Development and validation of a microRNA panel to differentiate between patients with rheumatoid arthritis or systemic lupus erythematosus and controls. J Rheumatol 2020; 47: 188–96. doi:10.3899/jrheum.181029
109 Scully M, Anderson B, Lane T, et al. An automated method for segmenting white matter lesions through multi-level morphometric feature classification with application to lupus. Front Hum Neurosci 2010; 4: 27. doi:10.3389/fnhum.2010.00027
110 Simos NJ, Dimitriadis SI, Kavroulakis E, et al. Quantitative identification of functional connectivity disturbances in neuropsychiatric lupus based on resting-state fMRI: A robust machine learning approach. Brain Sci 2020; 10: 777. doi:10.3390/brainsci10110777
111 Tan G, Huang B, Cui Z, et al. A noise-immune reinforcement learning method for early diagnosis of neuropsychiatric systemic lupus erythematosus. Math Biosci Eng 2022; 19: 2219–39. doi:10.3934/mbe.2022104
112 Tang Y, Zhang W, Zhu M, et al. Lupus nephritis pathology prediction with clinical indices. Sci Rep 2018; 8: 10231. doi:10.1038/s41598-018-28611-7
113 Toro-Domínguez D, Martorell-Marugán J, Martinez-Bueno M, et al. Scoring personalized molecular portraits identify systemic lupus erythematosus subtypes and predict individualized drug responses, symptomatology and disease progression. Brief Bioinform 2022; 23: bbac332. doi:10.1093/bib/bbac332
114 Turner CA, Jacobs AD, Marques CK, et al. Word2Vec inversion and traditional text classifiers for phenotyping lupus. BMC Med Inform Decis Mak 2017; 17: 126. doi:10.1186/s12911-017-0518-1
115 Usategui I, Barbado J, Torres AM, et al. Machine learning, a new tool for the detection of immunodeficiency patterns in systemic lupus erythematosus. J Investig Med 2023; 71: 742–52. doi:10.1177/10815589231171404
116 Wang D-C, Xu W-D, Wang S-N, et al. Lupus nephritis or not? A simple and clinically friendly machine learning pipeline to help diagnosis of lupus nephritis. Inflamm Res 2023; 72: 1315–24. doi:10.1007/s00011-023-01755-7
117 Wang F, Miao H, Pei Z, et al. Serological, fragmentomic, and epigenetic characteristics of cell-free DNA in patients with lupus nephritis. Front Immunol 2022; 13: 1001690. doi:10.3389/fimmu.2022.1001690
118 Wang L, Yang Z, Yu H, et al. Predicting diagnostic gene expression profiles associated with immune infiltration in patients with lupus nephritis. Front Immunol 2022; 13: 839197. doi:10.3389/fimmu.2022.839197
119 Yu S-C, Chang K-C, Wang H, et al. Distinguishing lupus lymphadenitis from Kikuchi disease based on clinicopathological features and C4d immunohistochemistry. Rheumatology (Oxford) 2021; 60: 1543–52. doi:10.1093/rheumatology/keaa524
120 Zhong Y, Zhang W, Hong X, et al. Screening biomarkers for systemic lupus erythematosus based on machine learning and exploring their expression correlations with the ratios of various immune cells. Front Immunol 2022; 13: 873787. doi:10.3389/fimmu.2022.873787
121 Zhuo Z, Su L, Duan Y, et al. Different patterns of cerebral perfusion in SLE patients with and without neuropsychiatric manifestations. Hum Brain Mapp 2020; 41: 755–66. doi:10.1002/hbm.24837
122 Wang M, Liang Y, Hu Z, et al. Lupus nephritis diagnosis using enhanced moth flame algorithm with support vector machines. Computers in Biology and Medicine 2022; 145: 105435. doi:10.1016/j.compbiomed.2022.105435
123 Heming M, Müller-Miny L, Rolfes L, et al. Supporting the differential diagnosis of connective tissue diseases with neurological involvement by blood and cerebrospinal fluid flow cytometry. J Neuroinflammation 2023; 20: 46. doi:10.1186/s12974-023-02733-w
124 Ni J, Chen C, Wang S, et al. Novel CSF biomarkers for diagnosis and integrated analysis of neuropsychiatric systemic lupus erythematosus: based on antibody profiling. Arthritis Res Ther 2023; 25: 165. doi:10.1186/s13075-023-03146-z
125 Raghunath S, Glikmann-Johnston Y, Vincent FB, et al. Patterns and prevalence of cognitive dysfunction in systemic lupus erythematosus. J Int Neuropsychol Soc 2023; 29: 421–30. doi:10.1017/S1355617722000418
126 Barraclough M, Erdman L, Diaz-Martinez JP, et al. Systemic lupus erythematosus phenotypes formed from machine learning with a specific focus on cognitive impairment. Rheumatology (Oxford) 2023; 62: 3610–8. doi:10.1093/rheumatology/keac653
127 Dong C, Yang N, Zhao R, et al. SVM-based model combining patients' reported outcomes and lymphocyte phenotypes of depression in systemic lupus erythematosus. Biomolecules 2023; 13: 723. doi:10.3390/biom13050723
128 Gu X-X, Jin Y, Fu T, et al. Relevant characteristics analysis using natural language processing and machine learning based on phenotypes and T-cell subsets in systemic lupus erythematosus patients with anxiety. Front Psychiatry 2021; 12: 793505. doi:10.3389/fpsyt.2021.793505
129 Tay SH, Stephenson MC, Allameen NA, et al. Combining multimodal magnetic resonance brain imaging and machine learning to unravel neurocognitive function in non-neuropsychiatric systemic lupus erythematosus. Rheumatology 2024; 63: 414–22. doi:10.1093/rheumatology/kead221
130 Rumetshofer T, Inglese F, de Bresser J, et al. Tract-based white matter hyperintensity patterns in patients with systemic lupus erythematosus using an unsupervised machine learning approach. Sci Rep 2022; 12: 21376. doi:10.1038/s41598-022-25990-w
131 Ayoub I, Wolf BJ, Geng L, et al. Prediction models of treatment response in lupus nephritis. Kidney Int 2022; 101: 379–89. doi:10.1016/j.kint.2021.11.014
132 Chen Y, Huang S, Chen T, et al. Machine learning for prediction and risk stratification of lupus nephritis renal flare. Am J Nephrol 2021; 52: 152–60. doi:10.1159/000513566
133 Qin X, Xia L, Zhu C, et al. Noninvasive evaluation of lupus nephritis activity using a radiomics machine learning model based on ultrasound. J Inflamm Res 2023; 16: 433–41. doi:10.2147/JIR.S398399
134 Stojanowski J, Konieczny A, Rydzyńska K, et al. Artificial neural network - an effective tool for predicting the lupus nephritis outcome. BMC Nephrol 2022; 23: 381. doi:10.1186/s12882-022-02978-2
135 Wang X, Fu S, Yu J, et al. Renal interferon-inducible protein 16 expression is associated with disease activity and prognosis in lupus nephritis. Arthritis Res Ther 2023; 25: 112. doi:10.1186/s13075-023-03094-8
136 Tang C, Zhang S, Teymur A, et al. V-set immunoglobulin domain-containing protein 4 as a novel serum biomarker of lupus nephritis and renal pathology activity. Arthritis Rheumatol 2023; 75: 1573–85. doi:10.1002/art.42545
137 Alves P, Bandaria J, Leavy MB, et al. Validation of a machine learning approach to estimate systemic lupus erythematosus disease activity index score categories and application in a real-world dataset. RMD Open 2021; 7: e001586. doi:10.1136/rmdopen-2021-001586
138 Andreoletti G, Lanata CM, Trupin L, et al. Transcriptomic analysis of immune cells in a multi-ethnic cohort of systemic lupus erythematosus patients identifies ethnicity- and disease-specific expression signatures. Commun Biol 2021; 4: 488. doi:10.1038/s42003-021-02000-9
139 Ceccarelli F, Olivieri G, Sortino A, et al. Comprehensive disease control in systemic lupus erythematosus. Semin Arthritis Rheum 2021; 51: 404–8. doi:10.1016/j.semarthrit.2021.02.005
140 Han J, Zhou Z, Zhang R, et al. Fucosylation of anti-dsDNA IgG1 correlates with disease activity of treatment-naive systemic lupus erythematosus patients. EBioMedicine 2022; 77: 103883. doi:10.1016/j.ebiom.2022.103883
141 Jupe ER, Lushington GH, Purushothaman M, et al. Tracking of systemic lupus erythematosus (SLE) longitudinally using biosensor and patient-reported data: A report on the fully decentralized mobile study to measure and predict lupus disease activity using digital signals-the OASIS study. BioTech 2023; 12: 62. doi:10.3390/biotech12040062
142 Kan H, Nagar S, Patel J, et al. Longitudinal treatment patterns and associated outcomes in patients with newly diagnosed systemic lupus erythematosus. Clin Ther 2016; 38: 610–24. doi:10.1016/j.clinthera.2016.01.016
143 Labonte AC, Kegerreis B, Geraci NS, et al. Identification of alterations in macrophage activation associated with disease activity in systemic lupus erythematosus. PLoS One 2018; 13: e0208132. doi:10.1371/journal.pone.0208132
144 Maffi M, Tani C, Cascarano G, et al. Which extra-renal flare is "difficult to treat" in systemic lupus erythematosus? A one-year longitudinal study comparing traditional and machine learning approaches. Rheumatology (Oxford) 2024; 63: 376–84. doi:10.1093/rheumatology/kead166
145 Rector I, Owen KA, Bachali P, et al. Differential regulation of the interferon response in systemic lupus erythematosus distinguishes patients of Asian ancestry. RMD Open 2023; 9: e003475. doi:10.1136/rmdopen-2023-003475
146 Wang D-C, Xu W-D, Qin Z, et al. Systemic lupus erythematosus with high disease activity identification based on machine learning. Inflamm Res 2023; 72: 1909–18. doi:10.1007/s00011-023-01793-1
147 Wang Y, Shu W, Lin S, et al. Hollow cobalt oxide/carbon hybrids aid metabolic encoding for active systemic lupus erythematosus during pregnancy. Small 2022; 18: e2106412. doi:10.1002/smll.202106412 Available: https://onlinelibrary.wiley.com/toc/16136829/18/11
148 Hoi A, Nim HT, Koelmeyer R, et al. Algorithm for calculating high disease activity in SLE. Rheumatology 2021; 60: 4291–7. doi:10.1093/rheumatology/keab003
149 Helget LN, Dillon DJ, Wolf B, et al. Development of a lupus nephritis suboptimal response prediction tool using renal histopathological and clinical laboratory variables at the time of diagnosis. Lupus Sci Med 2021; 8: e000489. doi:10.1136/lupus-2021-000489
150 Maeda S, Hashimoto H, Maeda T, et al. High-dimensional analysis of T-cell profiling variations following belimumab treatment in systemic lupus erythematosus. Lupus Sci Med 2023; 10: e000976. doi:10.1136/lupus-2023-000976
151 Wang DD, Li YF, Zhang C, et al. Predicting the effect of sirolimus on disease activity in patients with systemic lupus erythematosus using machine learning. J Clin Pharm Ther 2022; 47: 1845–50. doi:10.1111/jcpt.13778
152 Wolf BJ, Spainhour JC, Arthur JM, et al. Development of biomarker models to predict outcomes in lupus nephritis. Arthritis & Rheumatology 2016; 68: 1955–63. doi:10.1002/art.39623 Available: https://acrjournals.onlinelibrary.wiley.com/toc/23265205/68/8
153 Coelewij L, Waddington KE, Robinson GA, et al. Serum metabolomic signatures can predict subclinical atherosclerosis in patients with systemic lupus erythematosus. Arterioscler Thromb Vasc Biol 2021; 41: 1446–58. doi:10.1161/ATVBAHA.120.315321
154 Liu C, Zhou Y, Zhou Y, et al. Identification of crucial genes for predicting the risk of atherosclerosis with system lupus erythematosus based on comprehensive bioinformatics analysis and machine learning. Comput Biol Med 2023; 152: 106388. doi:10.1016/j.compbiomed.2022.106388
155 Luo Z, Lu G, Yang Q, et al. Identification of shared immune cells and immune-related co-disease genes in chronic heart failure and systemic lupus erythematosus based on transcriptome sequencing
156 Matthiesen R, Lauber C, Sampaio JL, et al. Shotgun mass spectrometry-based lipid profiling identifies and distinguishes between chronic inflammatory diseases. EBioMedicine 2021; 70: 103504. doi:10.1016/j.ebiom.2021.103504
157 Peng J, Dönnes P, Ardoin SP, et al. Atherosclerosis progression in the APPLE trial can be predicted in young people with juvenile-onset systemic lupus erythematosus using a novel lipid metabolomic signature. Arthritis & Rheumatology 2023. doi:10.1002/art.42722
158 Ravenell RL, Kamen DL, Fleury TJ, et al. Premature atherosclerosis is associated with hypovitaminosis D and angiotensin-converting enzyme inhibitor non-use in lupus patients. The American Journal of the Medical Sciences 2012; 344: 268–73. doi:10.1097/MAJ.0b013e31823fa7d9
159 Hu Z, Wu L, Lin Z, et al. Prevalence and associated factors of electrocardiogram abnormalities in patients with systemic lupus erythematosus: A machine learning study. Arthritis Care & Research 2022; 74: 1640–8. doi:10.1002/acr.24612 Available: https://acrjournals.onlinelibrary.wiley.com/toc/21514658/74/10
160 Deng Y, Zhou Y, Shi J, et al. Potential genetic biomarkers predict adverse pregnancy outcome during early and mid-pregnancy in women with systemic lupus erythematosus. Front Endocrinol 2022; 13: 957010. doi:10.3389/fendo.2022.957010
161 Fazzari MJ, Guerra MM, Salmon J, et al. Adverse pregnancy outcomes in women with systemic lupus erythematosus: can we improve predictions with machine learning? Lupus Sci Med 2022; 9: e000769. doi:10.1136/lupus-2022-000769
162 Hao X, Zheng D, Khan M, et al. Machine learning models for predicting adverse pregnancy outcomes in pregnant women with systemic lupus erythematosus. Diagnostics (Basel) 2023; 13: 612. doi:10.3390/diagnostics13040612
163 Tang H, Poynton MR, Hurdle JF, et al. Predicting three-year kidney graft survival in recipients with systemic lupus erythematosus. ASAIO J 2011; 57: 300–9. doi:10.1097/MAT.0b013e318222db30
164 Ceccarelli F, Sciandrone M, Perricone C, et al. Biomarkers of erosive arthritis in systemic lupus erythematosus: application of machine learning models. PLoS One 2018; 13: e0207926. doi:10.1371/journal.pone.0207926
165 Liang X, Peng Z, Lin Z, et al. Identification of prognostic genes for breast cancer related to systemic lupus erythematosus by integrated analysis and machine learning. Immunobiology 2023; 228. doi:10.1016/j.imbio.2023.152730
166 Huang T, Liu S, Huang J, et al. Prediction and associated factors of hypothyroidism in systemic lupus erythematosus: a cross-sectional study based on multiple machine learning algorithms. Curr Med Res Opin 2022; 38: 229–35. doi:10.1080/03007995.2021.2015156
167 Wang DC, Tang YY, He CS, et al. Exploring machine learning methods for predicting systemic lupus erythematosus with herpes. Int J Rheum Dis 2023; 26: 2047–54. doi:10.1111/1756-185X.14869
168 Robinson GA, Peng J, Pineda-Torra I, et al. Metabolomics defines complex patterns of dyslipidaemia in juvenile-SLE patients associated with inflammation and potential cardiovascular disease risk. Metabolites 2021; 12: 3. doi:10.3390/metabo12010003
169 Grovu R, Huo Y, Nguyen A, et al. Machine learning: predicting hospital length of stay in patients admitted for lupus flares. Lupus 2023; 32: 1418–29. doi:10.1177/09612033231206830
170 Beam AL, Drazen JM, Kohane IS, et al. Artificial intelligence in medicine. N Engl J Med 2023; 388: 1220. doi:10.1056/NEJMe2206291
171 Kingsmore KM, Lipsky PE. Recent advances in the use of machine learning and artificial intelligence to improve diagnosis, predict flares, and enrich clinical trials in lupus. Curr Opin Rheumatol 2022; 34: 374–81. doi:10.1097/BOR.0000000000000902
172 Chen IY, Pierson E, Rose S, et al. Ethical machine learning in healthcare. Annu Rev Biomed Data Sci 2021; 4: 123–44. doi:10.1146/annurev-biodatasci-092820-114757
173 Kaye J. The tension between data sharing and the protection of privacy in genomics research. Annu Rev Genomics Hum Genet 2012; 13: 415–31. doi:10.1146/annurev-genom-082410-101454
174 Gossec L, Kedra J, Servy H, et al. EULAR points to consider for the use of big data in rheumatic and musculoskeletal diseases. Ann Rheum Dis 2020; 79: 69–76. doi:10.1136/annrheumdis-2019-215694
175 Buie J, McMillan E, Kirby J, et al. Disparities in lupus and the role of social determinants of health: Current state of knowledge and directions for future research. ACR Open Rheumatol 2023; 5: 454–64. doi:10.1002/acr2.11590
176 Scofield RH, Sharma R, Aberle T, et al. Impact of race and ethnicity on family participation in systemic lupus erythematosus genetic studies. Front Lupus 2023; 1: 1100534. doi:10.3389/flupu.2023.1100534
177 Sheikh SZ, Wanty NI, Stephens J, et al. The state of lupus clinical trials: minority participation needed. J Clin Med 2019; 8: 1245. doi:10.3390/jcm8081245
© 2024 Author(s) (or their employer(s)) 2024. Re-use permitted under CC BY-NC. No commercial re-use. See rights and permissions. Published by BMJ. http://creativecommons.org/licenses/by-nc/4.0/ This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/ . Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
Artificial intelligence and machine learning applications are emerging as transformative technologies in medicine. With greater access to a diverse range of big datasets, researchers are turning to these powerful techniques for data analysis. Machine learning can reveal patterns and interactions between variables in large and complex datasets more accurately and efficiently than traditional statistical methods. Machine learning approaches open new possibilities for studying SLE, a multifactorial, highly heterogeneous and complex disease. Here, we discuss how machine learning methods are rapidly being integrated into the field of SLE research. Recent reports have focused on building prediction models and/or identifying novel biomarkers using both supervised and unsupervised techniques for understanding disease pathogenesis, early diagnosis and prognosis of disease. In this review, we provide an overview of machine learning techniques and discuss current gaps, challenges and opportunities for SLE studies. External validation of most prediction models is still needed before clinical adoption. Utilisation of deep learning models, access to alternative sources of health data and increased awareness of the ethics, governance and regulations surrounding the use of artificial intelligence in medicine will help propel this exciting field forward.
Details

1 University of Calgary Cumming School of Medicine, Calgary, Alberta, Canada
2 Computational Precision Health, University of California Berkeley and University of California San Francisco, Berkeley, California, USA; Electrical Engineering and Computer Science, University of California Berkeley, Berkeley, California, USA
3 University of Calgary Cumming School of Medicine, Calgary, Alberta, Canada; McCaig Institute for Bone and Joint Health, Calgary, Alberta, Canada