Systemic lupus in the era of machine learning

Correspondence to Dr May Y Choi

WHAT IS ALREADY KNOWN ON THIS TOPIC

Most machine learning models developed for SLE to date have been directed towards elucidating disease pathogenesis, improving diagnosis, and predicting disease-related outcomes.

WHAT THIS STUDY ADDS

This study provides an overview of machine learning techniques to discuss current gaps, challenges, and opportunities for SLE research.
Most SLE machine learning studies under-report key details of the model development and/or have not been externally validated to ensure they are effective, reliable, and safe to adopt into clinical practice.
The application of more advanced machine learning algorithms such as deep learning and the utilisation of complex, alternative datasets including images, are increasing among SLE studies.

HOW THIS STUDY MIGHT AFFECT RESEARCH, PRACTICE OR POLICY

As machine learning continues to provide unprecedented opportunities to deliver transformative discoveries in SLE research and practice, researchers need to stay informed of the ethical, governance, and regulatory considerations around their use.

Introduction

Tremendous progress in our understanding of SLE pathogenesis, diagnosis and management has been made over the past 75 years, with most studies relying on traditional statistical techniques to evaluate and test hypotheses. While these approaches are still widely used, many researchers are turning to machine learning (ML) as a complementary method for assessing patterns that are not readily tested using traditional statistics. In the last 5 years alone, there has been an explosion of studies that have leveraged the power of ML to study SLE patient identification, risk prediction, diagnosis, disease subtype classification, progression, outcomes, monitoring and management. While it may seem that ML is the new shiny toy of the 21st century, the term ‘artificial intelligence’ (AI) was first described in 1955, the same year that antimalarial drugs were approved by the US Food and Drug Administration. AI has recently re-emerged as a valuable approach in medicine because of the enormous growth in computing power and the increasing availability of extensive and comprehensive ‘big data’ for analysis. As SLE researchers continue to amass more data on SLE, a complex, multifactorial and heterogeneous disease, traditional statistical techniques may no longer be the most effective or efficient methods, particularly in this era focused on precision medicine. In this review, we will provide an overview of ML and its current and future potential applications to SLE research.

Why ML in SLE?

Although ML and AI are often used interchangeably, ML is a subset of AI (figure 1). AI is the development of machines and systems that can imitate tasks that normally require intelligent human behaviour. ML algorithms allow computers to perform specific tasks by learning from the data rather than being explicitly programmed with instructions such as traditional statistical tests. Some other important differences between ML and traditional statistics are described in table 1. Understanding the advantages and disadvantages of both approaches may help inform one’s decision on which methods to use. In general, if the purpose of a project is to create an algorithm that can make predictions for a particular outcome and a large dataset is available, an ML approach may be a better option. If the purpose is to examine a relationship between variables or make inferences from a smaller dataset, then a traditional statistical model may be the better approach.

Figure 1. Categories of machine learning. Machine learning is a type of artificial intelligence. Within machine learning, there are three main categories: supervised, unsupervised and reinforcement learning. Deep learning is a subtype of machine learning that can involve supervised, unsupervised or reinforcement learning. Within each category, there are many different types of machine learning algorithms. Many factors can influence the choice of a specific algorithm. These include amount and type of data (eg, if images or videos are included in the data, a neural network will probably be preferred); how important interpretability is to your context (eg, decision trees or regression models are typically more interpretable, although this is an active area of research); and any computer memory or computational restrictions. As no particular model consistently performs better than the others, it is typical to develop several models using multiple algorithms and then compare their performance using different metrics. CNN, convolutional neural network; CVD, cardiovascular disease; LASSO, least absolute shrinkage and selection operator; RNN, recurrent neural network.

Table 1

Key differences in machine learning and traditional statistical approaches

Machine learning	Traditional statistics
Large dataset.	Small to mid-sized dataset.
Low interpretability.	High interpretability.
Can include 10s–1000s of variables in a single model.	Limited in number of variables included in a single model (<10).
Certain models have considerable flexibility around data distribution and can model many different types of non-linear relationships.	Often assumes normality of data and/or linear relationships.
Can tailor which performance parameter to maximise (accuracy, sensitivity, etc).	Maximises accuracy.
Uses a portion of the data to develop the model.	Uses all available data to develop the model.
Can capture patterns across multiple variables.	Limited ability to capture patterns across multiple variables (interaction terms).
Can adjust the generalisability/penalty for the coefficients to prevent overfitting.	Unable to adjust the generalisability/penalty for the coefficients to prevent overfitting.

In this technological age, researchers have greater access to large datasets of different types of information on patients with SLE. Types of datasets in SLE include demographic, clinical, histological, genetic and immune-related biomarkers (eg, autoantibodies, immune cell types, cytokines) in biological fluids, electronic medical records (EMR), images (eg, MRI, ultrasound) and other ‘omics’ (eg, proteomics, metabolomics). While this presents an important opportunity to study a remarkably heterogeneous and complex disease like SLE, the volume and density of data can also make it challenging to draw statistical inferences from large datasets, especially given the potential to identify false positive associations. Hence, ML can be a more efficient and accurate approach to understanding the patterns in complex datasets.

The more technical aspects of ML as they apply to systemic autoimmune rheumatic diseases are reviewed in greater detail elsewhere.^1–5 In brief, the ML categories that are often applied to study medical data are supervised and unsupervised. In supervised ML, or a task-driven approach, a ‘training dataset’ is used to develop an algorithm to recognise patterns that are associated with ‘labels’. This algorithm is then tested in a ‘test dataset’ to see how well it performs. In unsupervised ML or a data-driven approach, the training data are ‘unlabeled’, and the algorithm attempts to identify patterns within the dataset. In addition to supervised and unsupervised ML, another less commonly applied type of ML is called reinforcement learning. This type of ML is based on trial and error, with ‘reward’ or ‘punishment’ driving the learning process and skills acquisition. Within the three ML categories, a variety of ML algorithms exist, such as deep learning algorithms based on artificial neural networks (ANN), a modality that involves multiple layers of connected data, which can recognise complex patterns across different types of data, including images, video, and acoustic data. As we will discuss later, most ML studies in SLE employ both supervised and unsupervised models.

To determine which ML model to use, researchers consider several important factors including the characteristics of the input data (labelled vs unlabelled), the desired outcome (predicting a category or quantity), the modality of the data (eg, text, image) and volume of input data (figure 1). It is common to employ several algorithms and then compare their performance using different metrics to select the best model. For supervised models, it is ideal to assess the sensitivity, specificity, positive predictive value, negative predictive value, accuracy, F-score and area under the receiver operating characteristic curve (AUC), although particular emphasis may be placed on a subset of metrics depending on the context. The F-score is a single metric that combines the sensitivity and the positive predictive value of a model, and a high F-score requires good performance by both of those metrics. In traditional statistics, this is often referred to as the accuracy or line of best fit, but with ML, the F-score may be better suited to assess the training of a model. For unsupervised clusters, several techniques exist to ensure the number of identified clusters accurately reflects the data. These include the elbow method,⁶ the Bayes information criterion⁷ or a gap statistic.⁸ Once satisfied, statistical differences between clusters can be assessed using traditional methods, such as χ² tests or analysis of variance.
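The relationship between these supervised metrics can be sketched in a few lines of Python. This is a minimal illustration rather than a method used by any report in this review: the function name `classification_metrics` and the counts in the example are hypothetical, and the F-score shown is the common F1 form (the harmonic mean of sensitivity and positive predictive value).

```python
import math


def _ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def classification_metrics(tp, fp, tn, fn):
    """Compute common metrics from the four cells of a binary confusion matrix."""
    sensitivity = _ratio(tp, tp + fn)   # true positive rate (recall)
    specificity = _ratio(tn, tn + fp)   # true negative rate
    ppv = _ratio(tp, tp + fp)           # positive predictive value (precision)
    npv = _ratio(tn, tn + fn)           # negative predictive value
    accuracy = _ratio(tp + tn, tp + fp + tn + fn)
    # F1 is the harmonic mean of sensitivity and PPV, so both must be high.
    f1 = _ratio(2 * ppv * sensitivity, ppv + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical, class-imbalanced example: 60 cases among 1000 individuals.
metrics = classification_metrics(tp=40, fp=10, tn=930, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this made-up example the accuracy is 0.97 while the F-score is about 0.73, which illustrates why accuracy alone can be misleading when one class is much more prevalent than the other.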

Building and evaluating the ML models occur as the final steps of an established ML pipeline (figure 2). After the data are collected, they are preprocessed (data cleaning, filling in missing data, etc) and split, feature importance is evaluated and features are selected, and finally the ML models are built and evaluated. Feature selection is a process that allows the researcher to identify the best set of features that will help build optimised ML models (reviewed in ref ⁹). Feature selection is typically used with supervised algorithms, while dimensionality reduction is used in unsupervised clustering. Reports often use multiple supervised and unsupervised feature selection methods together. Examples of feature selection methods include recursive feature elimination,¹⁰ least absolute shrinkage and selection operator (LASSO)¹¹ and support vector machines (SVM).¹² These methods help identify covariables that are of greatest clinical and statistical importance.

Figure 2. Machine learning pipeline with consideration of the ethical, governance and regulation issues at every stage before clinical adoption of the model.

ML reports in SLE

A scoping review was performed to summarise the major ML reports of SLE to date. A PubMed search of ‘lupus’ and ‘machine learning’ Medical Subject Heading terms was performed on 24 November 2023 (figure 3). One hundred and ninety-one publications from 1992 to 2023 were identified, of which 133 were original reports. The remaining publications were review articles or unrelated topics (eg, not SLE, non-human, not ML). Over the last 31 years, there has been an exponential increase in the number of ML and SLE-related publications, similar to trends reported in other autoimmune rheumatic diseases.^{1 5 13} As this was not a systematic review, we acknowledge that we may have omitted some studies related to ML and SLE. However, we believe that we have captured most publications allowing for an accurate representation of the field and an in-depth discussion in our paper.

Figure 3. Number of SLE-related studies using machine learning methods. There has been an exponential growth of reports over the past 31 years based on PubMed database of publications when we searched ‘machine learning’ and ‘lupus’. The majority of reports were related to diagnosis (including neuropsychiatric and dermatological manifestations), followed by disease activity (including renal flares, extrarenal flares and treatment response), complications, pathogenesis, and mixed reports.

As ML research becomes increasingly recognised and valued in SLE, it is imperative that it is conducted in a methodologically rigorous manner to yield meaningful and useful results to relevant stakeholders and end users. Since ML methods are relatively new to the field, assessing the quality or technical aspects of these reports may be challenging to most non-ML researchers. A recent systematic review by Munguía-Realpozo et al¹⁴ assessed 45 SLE reports that used ML to build diagnostic and/or predictive algorithms and determined whether they adhered to the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) reporting standards.¹⁵ The review concluded that most reports were deficient in multiple domains of the TRIPOD recommendations, often under-reporting relevant details about their data preprocessing, model-building process, model specification and model performance.

In this scoping review, we will discuss ML approaches used in SLE reports following the outline of an ML pipeline (figure 2). While the aim of the study was not to systematically evaluate the reporting adherences of these reports, in general, we found similar limitations identified by Munguía-Realpozo et al.¹⁴ This highlights that there is a need to improve transparency and reporting of prediction models in future ML SLE studies.

Data collection

Given that SLE is an uncommon disease, it was not unexpected that the sample sizes for most reports (median 158 patients with SLE (IQR 61–681)) were relatively small. Overfitting and inappropriate generalisation from a small training dataset are important limitations of ML.¹⁶ Twenty-five (18.7%) reports evaluated greater than 1000 patients and seven (5.2%) reports assessed greater than 5000 patients. Most of these larger reports used EMRs and administrative databases to identify patients with SLE, recognising that these types of data may be limited by diagnostic misclassification.^17–21 Many reports experienced ‘class imbalance’, where the SLE group sample size was considerably smaller compared with healthy controls, potentially biasing ML in favour of the more prevalent class. To address this, some reports used generative adversarial networks^{22 23} and Synthetic Minority Oversampling TEchnique (SMOTE)^{20 24} to generate synthetic data.
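The core idea behind SMOTE, interpolating between a minority-class sample and one of its nearest minority-class neighbours, can be sketched as follows. This is a simplified illustration and not the reference implementation or the code used by the cited reports: the function name `smote_sketch`, its parameters and the example data are hypothetical, and base samples are drawn at random rather than visited in turn.

```python
import math
import random


def smote_sketch(minority, n_synthetic, k=3, seed=0):
    """Generate synthetic minority-class points by neighbour interpolation."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        # Pick a minority sample to start from.
        base_idx = rng.randrange(len(minority))
        base = minority[base_idx]
        # Find its k nearest neighbours within the minority class.
        others = [p for i, p in enumerate(minority) if i != base_idx]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # Place the new point at a random position on the connecting segment.
        u = rng.random()
        synthetic.append([b + u * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical two-feature minority class (eg, patients with SLE).
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.5], [1.2, 2.2]]
new_points = smote_sketch(minority, n_synthetic=6)
print(len(new_points), "synthetic samples generated")
```

Because every synthetic point lies between two observed minority samples, oversampling of this kind should be applied to the training data only, so that the test set still contains real patients exclusively.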

The data density for most SLE reports does not derive from the patient cohort size but from the large number of variables on each patient considered for the ML models. Types of data used included demographic (n=43 reports) and clinical (n=51) data from cohort registries and several using EMRs (n=13). Data from biopsies included renal (n=6) and lymph node tissue (n=1). Biomarker data included autoantibodies (n=37), immune cell subtypes (eg, CD4+ and CD8+ T cells) (n=26) and other immune markers (eg, complement levels, platelet counts) (n=24), cytokines (n=8), genetics and transcriptomics (n=47), urinary markers (n=9), proteomics (n=5) and lipidomics/metabolomics (n=11). The application of ML to genetics and transcriptomics (eg, RNA sequencing (RNA-seq)) is particularly popular, largely due to the flexibility of ML for managing the vast amount of data obtained from each patient. The feature selection and dimensionality reduction techniques of ML offer a means to handle the large number of potentially relevant covariables. Alternative datasets included images (brain MRI for neuropsychiatric SLE (NPSLE) (n=9), clinical images of cutaneous lupus erythematosus (LE) (n=1) and funduscopic images for lupus retinopathy (n=1)), EKG abnormalities (n=1) and meteorological/environmental indicators (eg, air humidity, air pressure, sulfur dioxide, nitrogen dioxide, particle pollution from fine particulates) (n=2).

Data preprocessing and splitting

As identified by Munguía-Realpozo et al,¹⁴ handling of missing data was a major limitation of SLE reports. Median imputation and removal of data to use complete cases were common methods. Four reports used multiple imputation by chained equations,²⁵ a more advanced imputation methodology, and six reports used SMOTE²⁴ to address class imbalance with respect to missing data. Some ML models such as extreme gradient boosting (XGBoost)²⁶ are able to address missing data due to built-in imputation functions.
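Median imputation, the simplest of the approaches mentioned above, can be sketched as follows. The function name `impute_median`, the use of `None` to mark missing values and the example data are assumptions made for illustration only.

```python
from statistics import median


def impute_median(rows):
    """Replace None in each column with the median of that column's observed values."""
    n_cols = len(rows[0])
    medians = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not None]
        # statistics.median raises StatisticsError if a column has no observed values.
        medians.append(median(observed))
    return [
        [medians[j] if row[j] is None else row[j] for j in range(n_cols)]
        for row in rows
    ]


# Hypothetical laboratory values with missing entries.
raw = [[1.0, None], [3.0, 10.0], [None, 20.0], [5.0, 30.0]]
completed = impute_median(raw)
print(completed)
```

To avoid information leaking from the test set, the medians would normally be computed on the training data and then reused to fill gaps in the validation and test data.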

Accurate data labelling is particularly important for diseases that are heterogeneous with a fluctuating and variable disease course, such as SLE. Identification of SLE cases using EMRs may be inaccurate and inefficient as it relies on coding systems such as the International Classification of Diseases (ICD), which historically has poor diagnostic specificity.²⁷ Similarly, identification of SLE-related manifestations may be challenging given the wide range of features and lack of specific administrative codes for different phenotypes of presentation. For some manifestations, it is difficult to distinguish features that are primary to SLE from those that are secondary to other conditions, for example, NPSLE versus neuropsychiatric symptoms secondary to infections or metabolic disturbances. Regardless of whether a model was developed through traditional means or by ML, any labelling errors introduced at the preprocessing stage will be learnt by the model during training and will cause it to mislabel future cases. To overcome this, one ML SLE study used a technique called ‘noisy labeling’, where the training labels were created using EMR data based on a threshold of multiple ICD-9 codes, followed by model testing against expert clinician-labelled data with good performance metrics.²⁸
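A threshold-based labelling rule of this kind can be sketched as follows. This is an illustration only and not the cited study's implementation: the threshold of three codes, the function name `noisy_labels` and the example encounters are hypothetical, while 710.0 is the ICD-9-CM code for SLE.

```python
from collections import Counter

SLE_ICD9 = "710.0"  # ICD-9-CM code for systemic lupus erythematosus


def noisy_labels(encounters, min_codes=3):
    """Label a patient 1 if they have at least `min_codes` SLE-coded encounters.

    `encounters` is an iterable of (patient_id, icd9_code) pairs.
    The threshold is a hypothetical choice for this sketch.
    """
    encounters = list(encounters)
    sle_counts = Counter(pid for pid, code in encounters if code == SLE_ICD9)
    patients = {pid for pid, _ in encounters}
    return {pid: int(sle_counts[pid] >= min_codes) for pid in patients}


# Hypothetical encounter records: (patient ID, ICD-9 code).
encounters = [
    ("p1", "710.0"), ("p1", "710.0"), ("p1", "710.0"),
    ("p2", "710.0"), ("p2", "401.9"),   # one SLE code plus a non-SLE code
    ("p3", "714.0"),                    # non-SLE code only
]
labels = noisy_labels(encounters)
print(sorted(labels.items()))
```

Labels produced this way are imperfect by design, so the resulting model still needs to be tested against a reference standard such as clinician-reviewed charts.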

Most reports split a single dataset into three sets: training, validation and testing. While this is an acceptable approach to internally validate a model, an external validation dataset with an independent cohort of patients is needed to ensure replicability and generalisability of the model before clinical adoption and to assess the degree of potential model overfitting.²⁹ We discuss external validation separately below.
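A three-way split of a single dataset can be sketched as follows. The 60/20/20 proportions, the function name `three_way_split` and the use of integer patient IDs are assumptions for illustration; the reports in this review used a variety of ratios.

```python
import random


def three_way_split(patient_ids, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle patient IDs and split them into training, validation and test sets."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)   # fixed seed so the split is reproducible
    n_train = round(len(ids) * train_frac)
    n_val = round(len(ids) * val_frac)
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]       # whatever remains
    return train, val, test


# Hypothetical cohort of 100 patients identified by integer IDs.
train, val, test = three_way_split(range(100))
print(len(train), len(val), len(test))
```

Splitting by patient ID rather than by individual record keeps all visits from one patient in the same set, which matters when patients contribute repeated measurements.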

Feature selection and dimension reduction

Feature selection methods were primarily random forest (RF) (n=41), followed by LASSO (n=21), and SVM (n=16). Several reports also used filter methods such as relief-based feature selection (n=7) and mutual information (n=2), which were often performed in reports that used genetic datasets. Dimensionality reduction techniques were applied (n=32), which included principal component analysis (PCA) (n=19), t-distributed stochastic neighbour embedding³⁰ (n=9), and Uniform Manifold Approximation and Projection³¹ (n=7).

Model development

Most reports (n=102) developed one or more prediction algorithms. The remaining reports (n=31) focused only on the identification of SLE clusters or features, for example, biomarkers. For supervised models, the most common technique was RF (n=49), followed by SVM (n=42), logistic regression (LR) (n=42), ANNs³² (n=24), XGBoost (n=20), LASSO (n=17), decision trees (n=16), Naïve Bayes³³ (n=14), and k-nearest neighbour (n=13). A few reports used a gradient-boosted tree,³⁴ classification and regression tree³⁵ and light gradient-boosting machine.³⁶ For unsupervised models, primarily clustering and dimensionality reduction were performed, for example, hierarchical (n=9) and k-means clustering (n=9).

There was an increasing number of SLE reports using deep learning methods over time; in this review, 34 such reports were identified. Most of the reports (n=23) included a simple neural network with one or two hidden layers as a comparison between other techniques. As little hyperparameter optimisation was done, these ANNs often were outperformed by models such as RF, SVM and XGBoost. Even with tasks such as natural language processing which commonly use deep learning models like recurrent neural networks (RNN),³² one study found that RF outperformed deep learning models when proper preprocessing and feature selection were performed.³⁷

RNN and its derivatives (long short-term memory (LSTM)³⁸ and gated recurrent unit)³⁹ are typically used in natural language processing, time series data and large image data. In terms of SLE reports, these models were used to analyse EMR data for hospitalisation risk^{20 40–42} and image data for SLE diagnosis.^{43 44} However, no report in our review used large language models and attention to text data was uncommon, highlighting the need for more complex models in analysing text data from electronic health records.

Five reports used convolutional neural networks (CNN)⁴⁵ on image data with topics ranging from NPSLE diagnosis from MRI images,⁴⁶ diagnosis of SLE retinopathy from funduscopic images,²³ diagnosis of cutaneous lupus from lesion images,⁴⁷ segmentation of staining from lupus nephritis (LN) pathology images⁴⁸ and segmentation of glomeruli on LN biopsy images.⁴³ Three of the reports used a deep learning technique called Grad-CAM⁴⁹ that identifies the region of an image that will contribute the most to the final model. As SLE imaging data can be challenging to obtain with large enough numbers for robust ML reports, an ML technique called transfer learning was used to create powerful discriminative models, even with sparse data. Four reports in this review used this method to work with the smaller datasets.^{23 43 47 48} Liu et al²³ posited that transfer learning using diabetic retinopathy funduscopic images could serve as a strong base model for lupus retinopathy prediction as the base model would be more ‘accustomed’ to pathological fundus images. This principle could be applied to other areas of SLE research such as the diagnosis of cutaneous LE through images of other skin lesions from similar but more common diseases including psoriasis.

Model evaluation

The approach taken by most reports was to develop multiple ML models and then select the best model, usually based on the AUC. Other performance metrics including the F-score were not always used or even reported. This is similar to the findings by Munguía-Realpozo et al, where only 21 (46.7%) reported AUC as their main performance metric, seven (15.6%) reported accuracy as their performance metric and the remaining used a combination of performance statistics.¹⁴ Five (11.1%) reports in their review did not report any performance metrics.
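The practice of comparing candidate models by AUC can be sketched as follows. The AUC is computed here from its probabilistic definition (the chance that a randomly chosen case receives a higher score than a randomly chosen control); the model names and scores are hypothetical.

```python
def auc(pos_scores, neg_scores):
    """AUC as the probability that a random case outscores a random control."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical predicted probabilities on the same test set:
# (scores for true cases, scores for true controls).
candidates = {
    "model_a": ([0.9, 0.8, 0.4], [0.7, 0.3, 0.2, 0.1]),
    "model_b": ([0.6, 0.5, 0.4], [0.7, 0.3, 0.5, 0.1]),
}
aucs = {name: auc(pos, neg) for name, (pos, neg) in candidates.items()}
best = max(aucs, key=aucs.get)
print(best, round(aucs[best], 3))
```

Selecting on AUC alone ignores the operating threshold, which is one reason to report threshold-dependent metrics such as sensitivity, positive predictive value and the F-score alongside it.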

In our review, RF (n=25), SVM (n=16), XGBoost (n=10), LR (n=10) and LASSO (n=7) models were often reported as the best performing models, compared with more complex models like ANNs (n=4), LSTM (n=3) and CNN (n=2). As many of the datasets of the reports included in this review have features on the scale of 100s, we expect that simpler models would perform better compared with gradient boosting and neural networks that require larger datasets, and where performance is enhanced with multiple layers of data. Additionally, models such as RF and LASSO have capabilities for feature importance, which helps with explainability such as identifying important clinical and genetic biomarkers for future research.

External validation

Overfitting of ML models to the training datasets should be evaluated using optimism-adjusted measures. Although these can be approximated using internal validation (eg, data splitting), they are more robustly assessed using external validation from a separate data cohort. This step ensures that the developed ML model is generalisable beyond the collected data alone. Only 15 reports in our review specified that they evaluated their model using an external cohort. External validation is particularly relevant for complex ‘black box’ ML models such as deep learning. In deep learning, the internal processes of the model are usually unknown or ‘hidden’. This makes it difficult to assess whether certain model features could be subject to selection or other biases that may affect the generalisability of the model.^{50 51}
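One widely used way to obtain optimism-adjusted measures from internal validation is bootstrap optimism correction. It is shown here only as an illustration, since the reports in this review did not necessarily use it. The symbols are assumptions introduced for this sketch: $P(M, S)$ is any performance measure (eg, AUC) of model $M$ evaluated on dataset $S$, $M_D$ is the model fitted on the original dataset $D$, and $M_b$ is the model refitted on bootstrap resample $D_b$ for $b = 1, \dots, B$.

```latex
\begin{align}
P_{\text{apparent}} &= P(M_D, D) \\
\text{optimism} &= \frac{1}{B} \sum_{b=1}^{B} \left[ P(M_b, D_b) - P(M_b, D) \right] \\
P_{\text{adjusted}} &= P_{\text{apparent}} - \text{optimism}
\end{align}
```

A large optimism term indicates that the model performs noticeably better on the data it was fitted to than on data it has not seen. The adjustment remains an internal estimate and does not replace evaluation in an independent cohort.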

Key SLE findings by ML reports

In our scoping review, ML models were used to elucidate disease pathogenesis (n=31),^{48 52–81} predict SLE diagnosis and identify cases (n=61),^{23 28 37 43 44 46 47 53 63 70 73 79 82–130} disease activity and treatment response (n=33),^{56 59 63 66 69 74 77 78 101 106 113 131–152} complications (n=22)^{18 21 40 53 83 147 153–168} and healthcare utilisation (n=6)^{17 20 41 42 142 169} (table 2). Refer to online supplemental table 1 for a glossary of key terms.

Table 2

Key SLE findings in machine learning studies

Type of study	References	Type of dataset	Types of machine learning models
Pathogenesis	Endotypes, immune dysregulation, genetic risk^{48 52–81}	Autoantibodies, clinical, cytokines, demographic, EMR, genetics, images (renal histopathology slides), immune cell types, urinary biomarkers	ANN, CART, CNN, decision tree, ElasticNet, GLM, gradient tree boosting, hierarchical clustering, k-means, KNN, LASSO, LR, Naïve Bayes, PCA, RF, RFE, ridge regression, SHAP, SMOTE, SVM, t-SNE, UMAP, XGBoost, other
Diagnosis	Differentiate between healthy controls, other autoimmune diseases (rheumatoid arthritis and systemic sclerosis)^{23 28 37 43 44 47 53 63 70 79 82–121}	Autoantibodies, biopsy, clinical, demographics, EMR, genetics, other immune markers and cell subtypes, other omics, images (MRI, funduscopic images), protein, urinary biomarkers	AdaBoost, ANN, CART, CNN, decision tree, GLM, gradient tree boosting, hierarchical clustering, KNN, k-means, LASSO, LDA, LGB, LR, Naïve Bayes, natural language, noisy labelling, PCA, RF, RFE, ridge regression, RNN, SHAP, SMOTE, SVM, XGBoost, UMAP, other
	Lupus nephritis^{85 107 112 117 118 122}	Autoantibodies, biopsy, clinical, cytokines, demographics, EMR, genetics, immune cell types, lipidomics, metabolomics, urinary biomarkers	ANN, CART, decision tree, hierarchical clustering, KNN, LASSO, LDA, LR, Naïve Bayes, RF, RFE, SVM, other
	Neuropsychiatric SLE including anxiety, depression, cognitive impairment^{44 46 100 102 109–111 121 123–130}	Autoantibodies, clinical, demographic, immune cells, 1H-MRS images, metabolites, MRI images, proteins	AdaBoost, ANN, CNN, decision tree, ELM, GLM, gradient descent, gradient tree boosting, GRU, hierarchical clustering, k-means, KNN, LASSO, LR, LSTM, Naïve Bayes, natural language, PCA, RF, RFE, ridge regression, RNN, SHAP, SVM, XGBoost, other
	Cutaneous^{47 73}	Cytokines, images, immune cells, genetics	CART, CNN, gradient descent, hierarchical clustering, LR, Naïve Bayes, RF, SMOTE, SVM, other
Disease activity	Renal flares^131–136	Autoantibodies, biopsy features, clinical characteristics, cytokines, demographics, genetics, renal ultrasonic radiomics, urinary biomarkers	ANN, CART, LASSO, LR, RF, RFE, SVM, XGBoost
	Extrarenal flares, SLEDAI score^{59 63 66 101 106 113 137–148}	Autoantibodies, biometrics data, clinical, demographics, EMR, genetics, immune cell subtypes by PBMC scRNA-seq, patient-reported outcomes, meteorological data, other omics, proteomics, quality of life	Adaptive boosting, ANN, Bayesian network, CART, consensus clustering, decision tree, ElasticNet, GLM, gradient tree boosting, hierarchical clustering, k-means, KNN, LDA, LGB, LR, multivariable ordinal regression, Naïve Bayes, NLP, PCA, ReliefF model, RF, ridge regression, SHAP, SMOTE, SVM, t-SNE, UMAP, XGBoost, other
	Treatment response^{56 69 74 77 78 113 131 132 134 139 142 149–152}	Autoantibodies, biopsy, clinical, cytokines, demographic, genetics, immune cells, urine	ANN, CART, consensus clustering, decision tree, GLM, hierarchical clustering, k-means, LASSO, LDA, LR, Naïve Bayes, PCA, PLS-DA, ReliefF, SMOTE, SVM, t-SNE, XGBoost
Disease complications	Organ damage⁴⁰	Autoantibodies, clinical, demographics, EMR	ANN, LR, RNN
	Atherosclerosis, cardiovascular events, arrhythmia, heart failure^{18 153–159 168}	Carotid intima thickness, clinical, demographics, EKG, EMR, genetics, lipids/metabolites	ANN, consensus clustering, decision tree, hierarchical clustering, KNN, LASSO, LDA, LR, multivariate adaptive regression spline, PCA, PLS-DA, RF, SMOTE, SVM
	Antiphospholipid syndrome/thrombosis^{21 53}	Autoantibodies, clinical, demographic, genetic	Hierarchical clustering, KNN, LASSO, LR, Naïve Bayes, RF
	Adverse pregnancy outcome^{147 160–162}	Autoantibodies, clinical, demographic, genetic, metabolites	ANN, decision trees, ElasticNet, gradient descent, hierarchical clustering, KNN, LASSO, LDA, LR, PCA, PLS-DA, RFE, RF, super learner, SVM
	Renal transplant¹⁶³	Autoantibodies, clinical, demographic, genetic, images	ANN, ‘bestfire’ feature selection, decision tree, ‘genetic search’ feature selection, LR
	Other: hypothyroidism, herpes, breast cancer, joint erosion^{83 164–167}	Clinical, demographic, immune cells, autoantibodies	ANN, decision tree, hierarchical clustering, KNN, LASSO, LR, RF, RFE, SVM, UMAP, XGBoost, other
Healthcare resource utilisation	Hospitalisation and costs^{17 20 41 42 142 169}	Administrative database, clinical, demographics, EMR	ANN, decision tree, ElasticNet, GRU, hierarchical clustering, k-means, KNN, LASSO, LR, LSTM, Naïve Bayes, natural language processing, RF, RFE, RNN, SHAP, SMOTE, XGBoost, other

ANN, artificial neural network; CART, classification and regression tree; CNN, convolutional neural network; ELM, extreme learning machine; EMR, electronic medical record; GLM, generalised linear model; GRU, gated recurrent units; 1H-MRS, proton magnetic resonance spectroscopy; KNN, k-nearest neighbour; LASSO, least absolute shrinkage and selection operator; LDA, linear discriminant analysis; LGB, light gradient-boosting machine; LR, logistic regression; LSTM, long short-term memory; NLP, natural language processing; PBMC, peripheral blood mononuclear cell; PCA, principal component analysis; PLS-DA, partial least squares discriminant analysis; RF, random forest; RFE, recursive feature elimination; RNN, recurrent neural network; scRNA-seq, single-cell RNA sequencing; SHAP, SHapley Additive exPlanations; SLEDAI, Systemic Lupus Erythematosus Disease Activity Index; SMOTE, Synthetic Minority Oversampling TEchnique; SVM, support vector machine; t-SNE, t-distributed stochastic neighbour embedding; UMAP, Uniform Manifold Approximation and Projection; XGBoost, extreme gradient boosting.

Pathogenesis

Among the reports that examined SLE pathogenesis, many used genetic and RNA-seq datasets. Novel markers identified by these reports include ST8SIA4,⁵⁷ CMTM4,⁵⁷ C2CD4B,⁵⁷ LCK,⁶⁹ cuproptosis-related genes,⁷² TNFSF13B,⁷⁹ OAS1,⁷⁹ ABCB1,⁸¹ CD247,⁸¹ DSC1,⁸¹ KIR2DL3⁸¹ and MX2.⁸¹ Immune-related biomarkers including autoantibodies, immune cell subtypes and cytokines were also analysed. These were often combined with other clinical features to reveal unique SLE endotypes via cluster analysis.^{62 63 70 74 75 80} Important immune pathways were identified including extrafollicular B cell involvement,⁵⁴ DNA methylation,⁶⁵ expansion of major helper T cell subsets and unique proliferating (Ki-67+) immune cell subsets,⁷⁶ and signalling lymphocytic activation molecule family receptors on peripheral blood mononuclear cells.⁶⁴

Diagnostic models

SLE diagnostic models were used to distinguish patients with SLE from healthy controls and from patients with other autoimmune rheumatic diseases (eg, rheumatoid arthritis, Sjögren disease, systemic sclerosis, multiple sclerosis), Kikuchi disease and, for LN reports, other forms of nephropathy^{23 28 37 43 44 47 53 63 70 79 82–121} (AUC 0.70–0.99). A validated diagnostic algorithm called the SLE Risk Probability Index (SLERPI) was developed using LASSO-LR based on 14 SLE clinical and serological features.⁸⁴ A SLERPI score of greater than 7 was highly accurate (94.2%) and sensitive for detecting early disease (93.8%) and severe manifestations including kidney (97.9%) and neuropsychiatric involvement (91.8%). There were also specific diagnostic algorithms for LN,^{85 107 112 117 118 122} NPSLE^{44 46 100 102 109–111 121 123–130} and cutaneous LE.^{47 73} Cases of SLE^{28 98 114} and births from mothers with SLE⁸⁷ could also be identified from EMR using ML.
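Because model performance throughout this section is summarised as AUC and as sensitivity at a score cut-off, a minimal sketch of how those two quantities are computed may help readers interpret the reported ranges. The scores and labels below are hypothetical toy values, not patient data, and the function names are illustrative only; the cut-off of 7 simply mirrors the form of a score threshold and is not the SLERPI algorithm itself.

```python
def auc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # Count case/control pairs ranked correctly; ties count as half.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity(scores, labels, threshold):
    """Fraction of true cases whose score exceeds the threshold."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    return sum(s > threshold for s in pos) / len(pos)


# Hypothetical risk scores (assumption, for illustration only).
# Label 1 = case, label 0 = control.
toy_scores = [9.5, 8.0, 7.5, 6.0, 7.2, 5.0, 3.5, 2.0]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]

toy_auc = auc(toy_scores, toy_labels)
toy_sens = sensitivity(toy_scores, toy_labels, threshold=7)
print(f"AUC = {toy_auc:.4f}, sensitivity at score > 7 = {toy_sens:.2f}")
```

AUC summarises ranking quality across all possible cut-offs, whereas sensitivity depends on the single threshold chosen, which is why the two are usually reported together.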

Reports using genomic and genetic expression datasets identified several important biomarkers for LN, including C1QA, C1QB, MX1, RORC, CD177, DEFA4 and HERC5.¹¹⁸ For non-renal SLE, FOXP3,⁸⁸ MX2,¹⁰⁶ HLA-DQA1,⁹⁰ HLA-DQB1,⁹⁰ HLA-DRB1,⁹⁰ neutrophil extracellular trap-related genes (HMGB1, ITGB2 and CREB5),⁷⁰ ABCB1,¹²⁰ IFI27¹²⁰ and PLSCR1¹²⁰ have been reported. Other types of biomarkers included proteomics (IFIT3, MX1, TOMM40, STAT1, STAT2 and OAS3),¹⁰¹ metabolomics,⁹¹ lipidomics⁹⁴ and microRNA profiles.¹⁰⁸

For the detection of LN, novel serum biomarkers as a form of ‘liquid biopsy’ included circulating cell-free methylated DNA.¹¹⁷ For NPSLE, different T and B cell subsets predicted depression in patients with SLE.^{127 128} Proteomics using cerebrospinal fluid demonstrated that CST6, L-selectin, Trappin-2, KLK5 and TCN2 could distinguish NPSLE from SLE controls (non-NPSLE).¹²⁴ Other reports using single-cell RNA sequencing data compared biomarkers for NPSLE to multiple sclerosis¹⁰³ and vascular dementia.⁸³ To differentiate cutaneous LE from other dermatological disorders such as psoriasis, eczema, atopic dermatitis and systemic sclerosis (RF model, AUC 0.774–0.990), interferon gene signature, tumour necrosis factor, interleukin-23 (IL-23), interferon (IFN), IL-12, and immune cell-related genetic signatures were selected as important biomarkers.⁷³

A variety of images were analysed using ML including brain MRI (functional MRI, cerebral perfusion, multivoxel proton magnetic resonance spectroscopy) for the detection of NPSLE,^{44 46 100 102 109–111 121 129 130} funduscopic images for lupus retinopathy²³ and clinical images of the skin for acute cutaneous LE, subcutaneous LE and discoid LE.⁴⁷

Disease activity and treatment response

For predicting renal flares,^{131–135 152} the best performing models contained both traditional clinical data and novel urine biomarkers, including cytokines, chemokines and/or markers of kidney damage. The best models for predicting renal flares from these studies included XGBoost and ANN (AUCs 0.70–0.94). Quantitative data extracted from renal ultrasound based on features such as texture, shape and wavelength could also detect LN activity.¹³³ Novel biomarkers for LN activity include renal IFI16¹³⁵ and V-set immunoglobulin domain-containing protein 4.¹³⁶

For extrarenal flares,^{59 63 66 101 106 113 137–148} approximately half of the reports used genetic or genetic expression datasets. Novel biomarkers predicting SLE flares or increased disease activity include MX2¹⁰⁶ and M1¹⁴³ gene expression and a nine-protein combination (PHACTR2, GOT2, L-selectin, CMC4, MAP2K1, CMPK2, ECPAS, SRA1 and STAT2).¹⁰¹ The AUC of the best models in these reports ranged from 0.70 to 0.99. One study also demonstrated that Systemic Lupus Erythematosus Disease Activity Index score can be estimated from unstructured clinical notes.¹³⁷

Treatment response was predicted with a high degree of accuracy in some reports, with the outcome of renal flares being the most commonly evaluated.^{69 131 132 134 149 152} Clinical factors identified using feature importance ML models included C3, C4, age, race, sex, anti-dsDNA, baseline estimated glomerular filtration rate and urine protein-to-creatinine ratio, as well as cytokine/protein factors such as CXCL8, pentraxin, adiponectin, MCP1, IL-8, IL-1α, IL-12, IL-6, IFNα2 and IFNγ.^{131 152} The top performing predictive models for treatment response used a simple neural network (AUC 0.9735)¹³⁴ and an RF model (AUC 0.92).¹³¹ Predictors of disease remission (SVM, AUC 0.713)¹³⁹ and response to B cell therapies (RF, AUC 0.88)⁷⁷ were examined as well. Lastly, cluster analyses using k-means and consensus clustering to identify SLE endotypes based on treatment response yielded widely varying results; for example, the number of reported clusters ranged from 3 to 39.^{56 74 78 113 142 150} In our own study of 805 patients with SLE from the Systemic Lupus International Collaborating Clinics (SLICC) cohort, k-means clustering on PCA-transformed longitudinal autoantibody profiles over the first 5 years of disease revealed four distinct endotypes that were predictive of long-term disease activity, organ involvement, treatment requirements, and mortality risk.⁵⁶
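To make the clustering step concrete, the following is a minimal sketch of k-means (Lloyd's algorithm) applied to a handful of two-dimensional points. It is not the pipeline used in any of the cited studies: the toy points are hypothetical stand-ins for profiles that have already been reduced to two principal components, the initial centroids are simply the first k points so that the result is reproducible, and k is fixed by hand rather than selected with a criterion such as the gap statistic.

```python
def kmeans(points, k, iterations=100):
    """Lloyd's algorithm on a list of equal-length numeric tuples."""
    # Deterministic start (assumption): first k points are the initial centroids.
    centroids = [tuple(p) for p in points[:k]]
    assignments = None
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid
        # (squared Euclidean distance).
        new_assignments = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2
                                  for a, b in zip(p, centroids[c])))
            for p in points
        ]
        if new_assignments == assignments:
            break  # converged: no point changed cluster
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
    return assignments, centroids


# Hypothetical 2-D coordinates (for illustration only, not patient data).
toy_points = [(0.0, 0.1), (5.0, 5.2), (0.2, -0.1),
              (5.1, 4.9), (-0.1, 0.0), (4.8, 5.0)]

labels, centroids = kmeans(toy_points, k=2)
print(labels)
print(centroids)
```

Because k-means depends on both the chosen k and the starting centroids, different studies can legitimately report very different numbers of clusters from similar data, which is consistent with the wide range noted above.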

Prognostic models

For the prediction of SLE outcomes, ML has been used to predict disease damage (RNN, AUC 0.77).⁴⁰ Prediction of cardiovascular disease (atherosclerosis, cardiovascular events, arrhythmia and heart failure), a major cause of mortality in SLE, has also been evaluated.^{153–159 168} Novel lipoprotein metabolites and vitamin D deficiency were associated with atherosclerosis.^{153 157 158} Several candidate hub genes (SPI1, MMP9, C1QA, CX3CR1, MNDA) could predict the risk of atherosclerosis in SLE, and expression of the CCR7, RNASE2, RNASE3 and CXCL10 genes could predict heart failure. The AUCs ranged from 0.81 to 0.98 for the various models.¹⁵⁴ A prediction score called SLE-venous thromboembolism (VTE) could predict VTE risk in patients with SLE (LR, AUC 0.808) based on 11 variables: sex, age, body mass index, hyperlipidaemia, hypoalbuminaemia, C reactive protein, anti-β2-glycoprotein I antibodies, lupus anticoagulant, renal involvement, nervous system involvement and hydroxychloroquine use.²¹ A prediction model for 3-year allograft survival in kidney transplant recipients with SLE has also been developed (LR and ANN, AUC 0.73) using recipient age, race, maintenance regimen (including prednisone), predominant renal replacement modality in the pretransplant period, and whether dialysis was required during the first post-transplant week.¹⁶³

Adverse pregnancy outcomes in patients with SLE were examined using different datasets. SLE activity was predicted in pregnant women (ElasticNet, AUC 0.978) using serum metabolites (glucose, alanine, acetoacetic acid and alpha-ketoisovalerate levels).¹⁴⁷ Potential genetic biomarkers identified with ML for predicting adverse pregnancy outcomes during early and mid-pregnancy in patients with SLE are SEZ6, NRAD1 and LPAR4.¹⁶⁰ There were also prediction models that used only routinely available clinical variables (eg, levels of alanine transaminase, alkaline phosphatase, lactate dehydrogenase, gamma-glutamyl transferase, erythrocytes, C3, C4, autoantibodies as well as maternal age, smoking status, hydroxychloroquine use and disease duration) (super learning, AUC 0.78; RF, AUC 0.917).^{161 162}

Other outcomes for patients with SLE included a reduced risk of breast cancer with the presence of prognostic genetic biomarkers (ie, IRF7, IFI35 and EIF2AK2 gene expression) identified with LASSO.¹⁶⁵ Models for the prediction of joint erosions (LR, AUC 0.806),¹⁶⁴ herpes infection (RF, AUC 0.942)¹⁶⁷ and hypothyroidism (RF, AUC 0.772)¹⁶⁶ have also been developed using clinical and serological data. Among the selected features for these models, autoantibodies were found to be important predictors, for example, anti-carbamylated protein and anti-citrullinated protein antibodies for joint erosion¹⁶⁴ and anti-dsDNA and anti-SSB/La for hypothyroidism.¹⁶⁷ ML models showed promise in predicting the risk of hospitalisation and length of stay from EMR data (best performing models LSTM and XGBoost, AUC 0.88)^{20 41 42 142 169} and associated healthcare costs from administrative databases.^{17 142}

Future considerations

AI applications have become ubiquitous in medicine, and their impact on SLE care and research is no exception.¹⁷⁰ The range of AI applications and utilisation in SLE is expected to grow. Thus far, considerable work in SLE has been focused on developing ML models to predict disease, diagnosis and prognosis. Other applications of AI in SLE including drug discovery, clinical trial design and interpretation,¹⁷¹ diagnostic imaging analysis, personalised medicine and medical devices and technologies are just beginning. Increased availability and access to other types of data in the future will provide even more opportunities for SLE research. ML approaches in SLE may even make use of health data collected from mobile phones, wearable devices, social media and environmental datasets, which are becoming more popular in health research. Integration of more advanced ML methods in future reports will also allow for more efficient analysis of increasingly large and complex datasets. As discussed, there is already evidence of this trend with increased utilisation of deep learning and natural language processing approaches in SLE.

While AI facilitates discoveries that may improve patient outcomes and processes in the healthcare system, researchers should also be aware of the ethical, governance and regulatory considerations, including patient consent, confidentiality, transparency and privacy^{172 173} (figure 2). In 2019, the European League Against Rheumatism published recommendations that guide researchers on the collection, analysis, interpretation and implementation of big data through AI/ML.¹⁷⁴ While these are not discussed in detail in this review, we emphasise that these issues can arise at any step of the ML pipeline. For instance, during data collection, diverse data sources increasingly used by ML approaches (eg, EMR, administrative databases, social media, genetic or other multi-omics datasets, clinical trials and microbiome) are prone to potential sampling biases. These can exacerbate existing disparities in marginalised and underserved populations and violate the bioethical principles of justice. SLE is a disease that disproportionately affects racial and ethnic minorities and is therefore more sensitive to these issues (reviewed in ref ¹⁷⁵). The lack of representation by minority populations in clinical research, genetic reports¹⁷⁶ and clinical trials¹⁷⁷ is a real concern. More work is needed to study how we can address these issues, minimise harm and promote ethical ML models in the future.

Ethics statements

Patient consent for publication

Not applicable.

Footnote

Correction notice This article has been corrected since it was published. The provenance and peer review statement in the paper has been corrected.

Contributors All authors were involved in the concept and design, data analysis and interpretation, and editing for intellectual content. KZ, KAB, IYC, MJF and MYC were involved in manuscript drafting.

Funding Support for this study also came from the Lupus Foundation of America.

Competing interests MJF is a consultant to and has received honoraria and/or travel support from Werfen (Barcelona, Spain; San Diego, California). MJF is also Medical Director of Mitogen Diagnostics. MYC has received consulting fees from Celltrion, Mallinckrodt Pharmaceuticals, Werfen, Organon, AstraZeneca, and MitogenDx.

Provenance and peer review Commissioned; externally peer reviewed.


References

¹ Kingsmore KM, Puglisi CE, Grammer AC, et al. An introduction to machine learning and analysis of its use in rheumatic diseases. Nat Rev Rheumatol 2021; 17: 710–30. doi:10.1038/s41584-021-00708-w

² Hügle M, Omoumi P, van Laar JM, et al. Applied machine learning and artificial intelligence in rheumatology. Rheumatol Adv Pract 2020; 4: rkaa005. doi:10.1093/rap/rkaa005

³ Kim K-J, Tagkopoulos I. Application of machine learning in rheumatic disease research. Korean J Intern Med 2019; 34: 708–22. doi:10.3904/kjim.2018.349

⁴ Stoel B. Use of artificial intelligence in imaging in rheumatology–current status and future perspectives. RMD Open 2020; 6: e001063. doi:10.1136/rmdopen-2019-001063

⁵ Jiang M, Li Y, Jiang C, et al. Machine learning in rheumatic diseases. Clin Rev Allergy Immunol 2021; 60: 96–110. doi:10.1007/s12016-020-08805-6

⁶ Thorndike RL. Who belongs in the family? Psychometrika 1953; 18: 267–76. doi:10.1007/BF02289263

⁷ Pelleg D, Moore AW. X-means: extending K-means with efficient estimation of the number of clusters. ICML 2000: 727–34.

⁸ Tibshirani R, Walther G, Hastie T. Estimating the number of clusters in a data set via the gap statistic. J R Stat Soc Series B Stat Methodol 2001; 63: 411–23. doi:10.1111/1467-9868.00293

⁹ Li J, Cheng K, Wang S, et al. Feature selection: A data perspective. ACM Computing Surveys 2017; 50: 1–45. doi:10.1145/3136625

¹⁰ Díaz-Uriarte R, Alvarez de Andrés S. Gene selection and classification of microarray data using random forest. BMC Bioinformatics 2006; 7: 3. doi:10.1186/1471-2105-7-3

¹¹ Tibshirani R. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 1996; 58: 267–88. doi:10.1111/j.2517-6161.1996.tb02080.x Available: https://rss.onlinelibrary.wiley.com/toc/25176161/58/1

¹² Burges CJC. A tutorial on support vector machines for pattern recognition. Data Min Knowl Discov 1998; 2: 121–67. doi:10.1023/A:1009715923555

¹³ Stafford IS, Kellermann M, Mossotto E, et al. A systematic review of the applications of artificial intelligence and machine learning in autoimmune diseases. NPJ Digit Med 2020; 3: 30. doi:10.1038/s41746-020-0229-3

¹⁴ Munguía-Realpozo P, Etchegaray-Morales I, Mendoza-Pinto C, et al. Current state and completeness of reporting clinical prediction models using machine learning in systemic lupus erythematosus: a systematic review. Autoimmun Rev 2023; 22: 103294. doi:10.1016/j.autrev.2023.103294

¹⁵ Collins GS, Reitsma JB, Altman DG, et al. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement. Circulation 2015; 131: 211–9. doi:10.1161/CIRCULATIONAHA.114.014508

¹⁶ Choi MY, Ma C. Making a big impact with small datasets using machine-learning approaches. Lancet Rheumatol 2020; 2: e451–2. doi:10.1016/S2665-9913(20)30217-4

¹⁷ Castro-Villarreal S, Beltran-Ostos A, Valencia CF. Estimation of prevalence and incremental costs of systemic lupus erythematosus in a middle-income country using machine learning on administrative health data. Value Health Reg Issues 2021; 26: 98–104. doi:10.1016/j.vhri.2021.04.005

¹⁸ Cohen IV, Makunts T, Moumedjian T, et al. Cardiac adverse events associated with chloroquine and hydroxychloroquine exposure in 20 years of drug safety surveillance reports. Sci Rep 2020; 10: 19199. doi:10.1038/s41598-020-76258-0

¹⁹ Grovu R, Huo Y, Nguyen A, et al. Machine learning: predicting hospital length of stay in patients admitted for lupus flares. Lupus 2023; 32: 1418–29. doi:10.1177/09612033231206830

²⁰ Reddy BK, Delen D. Predicting hospital readmission for lupus patients: an RNN-LSTM-based deep-learning methodology. Comput Biol Med 2018; 101: 199–209. doi:10.1016/j.compbiomed.2018.08.029

²¹ You H, Zhao J, Zhang M, et al. Development and external validation of a prediction model for venous thromboembolism in systemic lupus erythematosus. RMD Open 2023; 9: e003568. doi:10.1136/rmdopen-2023-003568

²² Goodfellow I, Pouget-Abadie J, Mirza M, et al. Generative adversarial nets. Adv Neural Inf Process Syst 2014: 27.

²³ Liu R, Wang T, Li H, et al. TMM-nets: transferred multi- to mono-modal generation for lupus retinopathy diagnosis. IEEE Trans Med Imaging 2023; 42: 1083–94. doi:10.1109/TMI.2022.3223683

²⁴ Chawla NV, Bowyer KW, Hall LO, et al. SMOTE: synthetic minority over-sampling technique. J Artif Intell Res 2002; 16: 321–57. doi:10.1613/jair.953

²⁵ Azur MJ, Stuart EA, Frangakis C, et al. Multiple imputation by chained equations: what is it and how does it work? Int J Methods Psychiatr Res 2011; 20: 40–9. doi:10.1002/mpr.329

²⁶ Chen T, Guestrin C. XGBoost: a scalable tree boosting system. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; 2016: 785–94. doi:10.1145/2939672.2939785

²⁷ Moores KG, Sathe NA. A systematic review of validated methods for identifying systemic lupus erythematosus (SLE) using administrative or claims data. Vaccine 2013: K62–73. doi:10.1016/j.vaccine.2013.06.104

²⁸ Murray SG, Avati A, Schmajuk G, et al. Automated and flexible identification of complex disease: building a model for systemic lupus erythematosus using noisy labeling. J Am Med Inform Assoc 2019; 26: 61–5. doi:10.1093/jamia/ocy154

²⁹ Austin PC, Steyerberg EW. Events per variable (EPV) and the relative performance of different strategies for estimating the out-of-sample validity of logistic regression models. Stat Methods Med Res 2017; 26: 796–808. doi:10.1177/0962280214558972

³⁰ van der Maaten L, Hinton G. Visualizing data using t-SNE. J Mach Learn Res 2008; 9: 2579–605.

³¹ McInnes L, Healy J, Saul N, et al. UMAP: uniform manifold approximation and projection for dimension reduction. J Open Source Softw 2018; 3: 861. doi:10.21105/joss.00861

³² Rumelhart DE, Hinton GE, Williams RJ. Learning representations by back-propagating errors. Nature 1986; 323: 533–6. doi:10.1038/323533a0

³³ Hart PE, Stork DG, Duda RO. Pattern classification. Hoboken: Wiley, 2000.

³⁴ Friedman JH. Greedy function approximation: a gradient boosting machine. Ann Statist 2001; 29: 1189–232. doi:10.1214/aos/1013203451

³⁵ Breiman L, Friedman JH, Olshen RA, et al. Classification and regression trees. Belmont, CA: Wadsworth International Group, 1984.

³⁶ Ke G, Meng Q, Finley T, et al. LightGBM: a highly efficient gradient boosting decision tree. Adv Neural Inf Process Syst 2017: 30.

³⁷ Ma W, Lau YL, Yang W, et al. Random forests algorithm boosts genetic risk prediction of systemic lupus erythematosus. Front Genet 2022; 13: 902793. doi:10.3389/fgene.2022.902793

³⁸ Hochreiter S, Schmidhuber J. Long short-term memory. Neural Comput 1997; 9: 1735–80. doi:10.1162/neco.1997.9.8.1735

³⁹ Cho K, van Merrienboer B, Gulcehre C, et al. Learning phrase representations using RNN Encoder–Decoder for statistical machine translation. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). Doha, Qatar, 2014. doi:10.3115/v1/D14-1179

⁴⁰ Ceccarelli F, Sciandrone M, Perricone C, et al. Prediction of chronic damage in systemic lupus erythematosus by using machine-learning models. PLoS One 2017; 12: e0174200. doi:10.1371/journal.pone.0174200

⁴¹ Zhao Y, Smith D, Jorge A. Comparing two machine learning approaches in predicting lupus hospitalization using longitudinal data. Sci Rep 2022; 12: 16424. doi:10.1038/s41598-022-20845-w

⁴² Jorge AM, Smith D, Wu Z, et al. Exploration of machine learning methods to predict systemic lupus erythematosus hospitalizations. Lupus 2022; 31: 1296–305. doi:10.1177/09612033221114805

⁴³ Yang C-K, Lee C-Y, Wang H-S, et al. Glomerular disease classification and lesion identification by machine learning. Biomed J 2022; 45: 675–85. doi:10.1016/j.bj.2021.08.011

⁴⁴ Yuan Y, Quan T, Song Y, et al. Noise-immune extreme ensemble learning for early diagnosis of neuropsychiatric systemic lupus erythematosus. IEEE J Biomed Health Inform 2022; 26: 3495–506. doi:10.1109/JBHI.2022.3164937

⁴⁵ Krizhevsky A, Sutskever I, Hinton GE. ImageNet classification with deep convolutional neural networks. Commun ACM 2017; 60: 84–90. doi:10.1145/3065386

⁴⁶ Inglese F, Kim M, Steup-Beekman GM, et al. MRI-based classification of neuropsychiatric systemic lupus erythematosus patients with self-supervised contrastive learning. Front Neurosci 2022; 16: 695888. doi:10.3389/fnins.2022.695888

⁴⁷ Wu H, Yin H, Chen H, et al. A deep learning-based smartphone platform for cutaneous lupus erythematosus classification assistance: simplifying the diagnosis of complicated diseases. J Am Acad Dermatol 2021; 85: 792–3. doi:10.1016/j.jaad.2021.02.043

⁴⁸ Kinloch AJ, Asano Y, Mohsin A, et al. Machine learning to quantify in situ humoral selection in human lupus tubulointerstitial inflammation. Front Immunol 2020; 11: 593177. doi:10.3389/fimmu.2020.593177

⁴⁹ Selvaraju RR, Cogswell M, Das A, et al. Grad-CAM: visual explanations from deep networks via gradient-based localization. 2017 IEEE International Conference on Computer Vision (ICCV) Venice 2017: 618–26. doi:10.1109/ICCV.2017.74

⁵⁰ Savage N. Breaking into the black box of artificial intelligence. Nature March 29, 2022. doi:10.1038/d41586-022-00858-1

⁵¹ Yu AC, Eng J. One algorithm may not fit all: how selection bias affects machine learning performance. Radiographics 2020; 40: 1932–7. doi:10.1148/rg.2020200040

⁵² Arabnejad M, Montgomery CG, Gaffney PM, et al. Nearest-neighbor projected distance regression for epistasis detection in GWAS with population structure correction. Front Genet 2020; 11: 784. doi:10.3389/fgene.2020.00784

⁵³ Armañanzas R, Calvo B, Inza I, et al. Microarray analysis of autoimmune diseases by machine learning procedures. IEEE Trans Inf Technol Biomed 2009; 13: 341–50. doi:10.1109/TITB.2008.2011984

⁵⁴ Baxter RM, Wang CS, Garcia-Perez JE, et al. Expansion of extrafollicular B and T cell subsets in childhood-onset systemic lupus erythematosus. Front Immunol 2023; 14: 1208282. doi:10.3389/fimmu.2023.1208282

⁵⁵ Catalina MD, Bachali P, Yeo AE, et al. Patient ancestry significantly contributes to molecular heterogeneity of systemic lupus erythematosus. JCI Insight 2020; 5: e140380. doi:10.1172/jci.insight.140380

⁵⁶ Choi MY, Chen I, Clarke AE, et al. Machine learning identifies clusters of longitudinal autoantibody profiles predictive of systemic lupus erythematosus disease outcomes. Ann Rheum Dis 2023; 82: 927–36. doi:10.1136/ard-2022-223808

⁵⁷ Davis NA, Lareau CA, White BC, et al. Encore: genetic association interaction network centrality pipeline and application to SLE exome data. Genet Epidemiol 2013; 37: 614–21. doi:10.1002/gepi.21739

⁵⁸ Devaprasad A, Radstake TRDJ, Pandit A. Integration of immunome with disease-gene network reveals common cellular mechanisms between IMIDs and drug repurposing strategies. Front Immunol 2021; 12: 669400. doi:10.3389/fimmu.2021.669400

⁵⁹ Falk I, Zhao M, Nait Saada J, et al. Learning the kernel for rare variant genetic association test. Front Genet 2023; 14: 1245238. doi:10.3389/fgene.2023.1245238

⁶⁰ Figgett WA, Monaghan K, Ng M, et al. Machine learning applied to whole-blood RNA-sequencing data uncovers distinct subsets of patients with systemic lupus erythematosus. Clin Transl Immunology 2019; 8: e01093. doi:10.1002/cti2.1093

⁶¹ Foulquier N, Le Dantec C, Bettacchioli E, et al. Machine learning for the identification of a common signature for anti-SSA/Ro 60 antibody expression across autoimmune diseases. Arthritis Rheumatol 2022; 74: 1706–19. doi:10.1002/art.42243

⁶² Guthridge JM, Lu R, Tran LT-H, et al. Adults with systemic lupus exhibit distinct molecular phenotypes in a cross-sectional study. EClinicalMedicine 2020; 20: 100291. doi:10.1016/j.eclinm.2020.100291

⁶³ Hubbard EL, Bachali P, Kingsmore KM, et al. Analysis of transcriptomic features reveals molecular endotypes of SLE with clinical implications. Genome Med 2023; 15: 84. doi:10.1186/s13073-023-01237-9

⁶⁴ Humbel M, Bellanger F, Horisberger A, et al. SLAMF receptor expression identifies an immune signature that characterizes systemic lupus erythematosus. Front Immunol 2022; 13: 843059. doi:10.3389/fimmu.2022.843059

⁶⁵ Imgenberg-Kreuz J, Almlöf JC, Leonard D, et al. Shared and unique patterns of DNA methylation in systemic lupus erythematosus and primary Sjögren’s syndrome. Front Immunol 2019; 10: 1686. doi:10.3389/fimmu.2019.01686

⁶⁶ Kegerreis B, Catalina MD, Bachali P, et al. Machine learning approaches to predict lupus disease activity from gene expression data. Sci Rep 2019; 9: 9617. doi:10.1038/s41598-019-45989-0

⁶⁷ Kimura T, Ikeuchi H, Yoshino M, et al. Profiling of kidney involvement in systemic lupus erythematosus by deep learning using the national database of designated incurable diseases of Japan. Clin Exp Nephrol 2023; 27: 519–27. doi:10.1007/s10157-023-02337-x

⁶⁸ Le TT, Blackwood NO, Taroni JN, et al. Integrated machine learning pipeline for aberrant biomarker enrichment (I-mAB): characterizing clusters of differentiation within a compendium of systemic lupus erythematosus patients. AMIA Annu Symp Proc 2018; 2018: 1358–67.

⁶⁹ Lee DJ, Tsai PH, Chen CC, et al. Incorporating knowledge of disease-defining hub genes and regulatory network into a machine learning-based model for predicting treatment response in lupus nephritis after the first renal flare. J Transl Med 2023; 21: 76. doi:10.1186/s12967-023-03931-z

⁷⁰ Li H, Zhang X, Shang J, et al. Identification of NETs-related biomarkers and molecular clusters in systemic lupus erythematosus. Front Immunol 2023; 14: 1150828. doi:10.3389/fimmu.2023.1150828

⁷¹ Li H, Zhou J, Zhou L, et al. Identification of the shared gene signatures and molecular pathways in systemic lupus erythematosus and diffuse large B-cell lymphoma. J Gene Med 2023; 25: e3558. doi:10.1002/jgm.3558

⁷² Li W, Guan X, Wang Y, et al. Cuproptosis-related gene identification and immune infiltration analysis in systemic lupus erythematosus. Front Immunol 2023; 14: 1157196. doi:10.3389/fimmu.2023.1157196

⁷³ Martínez BA, Shrotri S, Kingsmore KM, et al. Machine learning reveals distinct gene signature profiles in lesional and nonlesional regions of inflammatory skin diseases. Sci Adv 2022; 8: eabn4776. doi:10.1126/sciadv.abn4776

⁷⁴ Qiao J, Zhang S-X, Chang M-J, et al. Deep stratification by transcriptome molecular characters for precision treatment of patients with systemic lupus erythematosus. Rheumatology (Oxford) 2023; 62: 2574–84. doi:10.1093/rheumatology/keac625

⁷⁵ Robinson GA, Peng J, Dönnes P, et al. Disease-associated and patient-specific immune cell signatures in juvenile-onset systemic lupus erythematosus: patient stratification using a machine-learning approach. Lancet Rheumatol 2020; 2: e485–96. doi:10.1016/S2665-9913(20)30168-5

⁷⁶ Sasaki T, Bracero S, Keegan J, et al. Longitudinal immune cell profiling in patients with early systemic lupus erythematosus. Arthritis Rheumatol 2022; 74: 1808–21. doi:10.1002/art.42248

⁷⁷ Shipa M, Santos LR, Nguyen DX, et al. Identification of biomarkers to stratify response to B-cell-targeted therapies in systemic lupus erythematosus: an exploratory analysis of a randomised controlled trial. Lancet Rheumatol 2023; 5: e24–35. doi:10.1016/S2665-9913(22)00332-0

⁷⁸ Toro-Domínguez D, Lopez-Domínguez R, García Moreno A, et al. Differential treatments based on drug-induced gene expression signatures and longitudinal systemic lupus erythematosus stratification. Sci Rep 2019; 9: 15502. doi:10.1038/s41598-019-51616-9

⁷⁹ Wang Y, Huang Z, Xiao Y, et al. The shared biomarkers and pathways of systemic lupus erythematosus and metabolic syndrome analyzed by bioinformatics combining machine learning algorithm and single-cell sequencing analysis. Front Immunol 2022; 13: 1015882. doi:10.3389/fimmu.2022.1015882

⁸⁰ Yones SA, Annett A, Stoll P, et al. Interpretable machine learning identifies paediatric systemic lupus erythematosus subtypes based on gene expression data. Sci Rep 2022; 12: 7433. doi:10.1038/s41598-022-10853-1

⁸¹ Zhao X, Duan L, Cui D, et al. Exploration of biomarkers for systemic lupus erythematosus by machine-learning analysis. BMC Immunol 2023; 24: 44. doi:10.1186/s12865-023-00581-0

⁸² Du J, Huang H, Pang L, et al. A machine learning model for identifying systemic lupus erythematosus through laboratory information system and electronic medical record. Clin Exp Rheumatol November 15, 2023. doi:10.55563/clinexprheumatol/jvdrpc

⁸³ Chen J, Zhao X, Huang C, et al. Novel insights into molecular signatures and pathogenic cell populations shared by systemic lupus erythematosus and vascular dementia. Funct Integr Genomics 2023; 23: 337. doi:10.1007/s10142-023-01270-2

⁸⁴ Adamichou C, Genitsaridi I, Nikolopoulos D, et al. Lupus or not? SLE risk probability index (SLERPI): a simple, clinician-friendly machine learning-based model to assist the diagnosis of systemic lupus erythematosus. Ann Rheum Dis 2021; 80: 758–66. doi:10.1136/annrheumdis-2020-219069

⁸⁵ Agar JW, Webb GI. Application of machine learning to a renal biopsy database. Nephrol Dial Transplant 1992; 7: 472–8.

⁸⁶ AlShareedah A, Zidoum H, Al-Sawafi S, et al. Machine learning approach for predicting systemic lupus erythematosus in an Oman-based cohort. Sultan Qaboos Univ Med J 2023; 23: 328–35. doi:10.18295/squmj.12.2022.069

⁸⁷ Barnado A, Eudy AM, Blaske A, et al. Developing and validating methods to assemble systemic lupus erythematosus births in the electronic health record. Arthritis Care Res (Hoboken) 2022; 74: 849–57. doi:10.1002/acr.24522

⁸⁸ Birjan Z, Khashei Varnamkhasti K, Parhoudeh S, et al. Crucial role of Foxp3 gene expression and mutation in systemic lupus erythematosus inferred from computational and experimental approaches. Diagnostics 2023; 13: 3442. doi:10.3390/diagnostics13223442

⁸⁹ Ceccarelli F, Lapucci M, Olivieri G, et al. Can machine learning models support physicians in systemic lupus erythematosus diagnosis? Results from a monocentric cohort. Joint Bone Spine 2022; 89: 105292. doi:10.1016/j.jbspin.2021.105292

⁹⁰ Chung C-W, Hsiao T-H, Huang C-J, et al. Machine learning approaches for the genomic prediction of rheumatoid arthritis and systemic lupus erythematosus. BioData Min 2021; 14: 52. doi:10.1186/s13040-021-00284-5

⁹¹ Du Q, Wang X, Chen J, et al. Machine learning encodes urine and serum metabolic patterns for autoimmune disease discrimination, classification and metabolic dysregulation analysis. Analyst 2023; 148: 4318–30. doi:10.1039/d3an01051a

⁹² Guy RT, Santago P, Langefeld CD. Bootstrap aggregating of alternating decision trees to detect sets of SNPs that associate with disease. Genet Epidemiol 2012; 36: 99–106. doi:10.1002/gepi.21608

⁹³ Han Y, Jin Z, Ma L, et al. Development of clinical decision models for the prediction of systemic lupus erythematosus and Sjögren’s syndrome overlap. J Clin Med 2023; 12: 535. doi:10.3390/jcm12020535

⁹⁴ He J, Ma C, Tang D, et al. Absolute quantification and characterization of oxylipins in lupus nephritis and systemic lupus erythematosus. Front Immunol 2022; 13: 964901. doi:10.3389/fimmu.2022.964901

⁹⁵ Huang Z, Shi Y, Cai B, et al. MALDI-TOF MS combined with magnetic beads for detecting serum protein biomarkers and establishment of boosting decision tree model for diagnosis of systemic lupus erythematosus. Rheumatology (Oxford) 2009; 48: 626–31. doi:10.1093/rheumatology/kep058

⁹⁶ Huang Z, Shi Y, Cai B, et al. Promising diagnostic model for systemic lupus erythematosus using Proteomic fingerprint technology. Sichuan Da Xue Xue Bao Yi Xue Ban 2009; 40: 499–503.

⁹⁷ Jiang Z, Shao M, Dai X, et al. Identification of diagnostic biomarkers in systemic lupus erythematosus based on bioinformatics analysis and machine learning. Front Genet 2022; 13: 865559. doi:10.3389/fgene.2022.865559

⁹⁸ Jorge A, Castro VM, Barnado A, et al. Identifying lupus patients in electronic health records: development and validation of machine learning algorithms and application of rule-based algorithms. Semin Arthritis Rheum 2019; 49: 84–90. doi:10.1016/j.semarthrit.2019.01.002

⁹⁹ Leventhal EL, Daamen AR, Grammer AC, et al. An interpretable machine learning pipeline based on transcriptomics predicts phenotypes of lupus patients. iScience 2023; 26: 108042. doi:10.1016/j.isci.2023.108042

¹⁰⁰ Li Y, Ge Z, Zhang Z, et al. Broad learning enhanced (1)H-MRS for early diagnosis of neuropsychiatric systemic lupus erythematosus. Comput Math Methods Med 2020; 2020: 1–13. doi:10.1155/2020/8874521

¹⁰¹ Li Y, Ma C, Liao S, et al. Combined proteomics and single cell RNA-sequencing analysis to identify biomarkers of disease diagnosis and disease exacerbation for systemic lupus erythematosus. Front Immunol 2022; 13: 969509. doi:10.3389/fimmu.2022.969509

¹⁰² Luo X, Piao S, Li H, et al. Multi-lesion radiomics model for discrimination of relapsing-remitting multiple sclerosis and neuropsychiatric systemic lupus erythematosus. Eur Radiol 2022; 32: 5700–10. doi:10.1007/s00330-022-08653-2

¹⁰³ Ma Y, Chen J, Wang T, et al. Accurate machine learning model to diagnose chronic autoimmune diseases utilizing information from B cells and monocytes. Front Immunol 2022; 13: 870531. doi:10.3389/fimmu.2022.870531

¹⁰⁴ Martin-Gutierrez L, Peng J, Thompson NL, et al. Stratification of patients with Sjogren’s syndrome and patients with systemic lupus erythematosus according to two shared immune cell signatures, with potential therapeutic implications. Arthritis Rheumatol 2021; 73: 1626–37. doi:10.1002/art.41708

¹⁰⁵ Martorell-Marugán J, Chierici M, Jurman G, et al. Differential diagnosis of systemic lupus erythematosus and Sjogren’s syndrome using machine learning and multi-omics data. Comput Biol Med 2023; 152: 106373. doi:10.1016/j.compbiomed.2022.106373

¹⁰⁶ Meng XW, Cheng ZL, Lu ZY, et al. MX2: identification and systematic mechanistic analysis of a novel immune-related biomarker for systemic lupus erythematosus. Front Immunol 2022; 13: 978851. doi:10.3389/fimmu.2022.978851

¹⁰⁷ Mondal S, Singh MP, Kumar A, et al. Rapid molecular evaluation of human kidney tissue sections by in situ mass spectrometry and machine learning to classify the nephrotic syndrome. J Proteome Res 2023; 22: 967–76. doi:10.1021/acs.jproteome.2c00768

¹⁰⁸ Ormseth MJ, Solus JF, Sheng Q, et al. Development and validation of a microRNA panel to differentiate between patients with rheumatoid arthritis or systemic lupus erythematosus and controls. J Rheumatol 2020; 47: 188–96. doi:10.3899/jrheum.181029

¹⁰⁹ Scully M, Anderson B, Lane T, et al. An automated method for segmenting white matter lesions through multi-level morphometric feature classification with application to lupus. Front Hum Neurosci 2010; 4: 27. doi:10.3389/fnhum.2010.00027

¹¹⁰ Simos NJ, Dimitriadis SI, Kavroulakis E, et al. Quantitative identification of functional connectivity disturbances in neuropsychiatric lupus based on resting-state fMRI: A robust machine learning approach. Brain Sci 2020; 10: 777. doi:10.3390/brainsci10110777

¹¹¹ Tan G, Huang B, Cui Z, et al. A noise-immune reinforcement learning method for early diagnosis of neuropsychiatric systemic lupus erythematosus. Math Biosci Eng 2022; 19: 2219–39. doi:10.3934/mbe.2022104

¹¹² Tang Y, Zhang W, Zhu M, et al. Lupus nephritis pathology prediction with clinical indices. Sci Rep 2018; 8: 10231. doi:10.1038/s41598-018-28611-7

¹¹³ Toro-Domínguez D, Martorell-Marugán J, Martinez-Bueno M, et al. Scoring personalized molecular portraits identify systemic lupus erythematosus subtypes and predict individualized drug responses, symptomatology and disease progression. Brief Bioinform 2022; 23: bbac332. doi:10.1093/bib/bbac332

¹¹⁴ Turner CA, Jacobs AD, Marques CK, et al. Word2Vec inversion and traditional text classifiers for phenotyping lupus. BMC Med Inform Decis Mak 2017; 17: 126. doi:10.1186/s12911-017-0518-1

¹¹⁵ Usategui I, Barbado J, Torres AM, et al. Machine learning, a new tool for the detection of immunodeficiency patterns in systemic lupus erythematosus. J Investig Med 2023; 71: 742–52. doi:10.1177/10815589231171404

¹¹⁶ Wang D-C, Xu W-D, Wang S-N, et al. Lupus nephritis or not? A simple and clinically friendly machine learning pipeline to help diagnosis of lupus nephritis. Inflamm Res 2023; 72: 1315–24. doi:10.1007/s00011-023-01755-7

¹¹⁷ Wang F, Miao H, Pei Z, et al. Serological, fragmentomic, and epigenetic characteristics of cell-free DNA in patients with lupus nephritis. Front Immunol 2022; 13: 1001690. doi:10.3389/fimmu.2022.1001690

¹¹⁸ Wang L, Yang Z, Yu H, et al. Predicting diagnostic gene expression profiles associated with immune infiltration in patients with lupus nephritis. Front Immunol 2022; 13: 839197. doi:10.3389/fimmu.2022.839197

¹¹⁹ Yu S-C, Chang K-C, Wang H, et al. Distinguishing lupus lymphadenitis from Kikuchi disease based on clinicopathological features and C4d immunohistochemistry. Rheumatology (Oxford) 2021; 60: 1543–52. doi:10.1093/rheumatology/keaa524

¹²⁰ Zhong Y, Zhang W, Hong X, et al. Screening biomarkers for systemic lupus erythematosus based on machine learning and exploring their expression correlations with the ratios of various immune cells. Front Immunol 2022; 13: 873787. doi:10.3389/fimmu.2022.873787

¹²¹ Zhuo Z, Su L, Duan Y, et al. Different patterns of cerebral perfusion in SLE patients with and without neuropsychiatric manifestations. Hum Brain Mapp 2020; 41: 755–66. doi:10.1002/hbm.24837

¹²² Wang M, Liang Y, Hu Z, et al. Lupus nephritis diagnosis using enhanced moth flame algorithm with support vector machines. Comput Biol Med 2022; 145: 105435. doi:10.1016/j.compbiomed.2022.105435

¹²³ Heming M, Müller-Miny L, Rolfes L, et al. Supporting the differential diagnosis of connective tissue diseases with neurological involvement by blood and cerebrospinal fluid flow cytometry. J Neuroinflammation 2023; 20: 46. doi:10.1186/s12974-023-02733-w

¹²⁴ Ni J, Chen C, Wang S, et al. Novel CSF biomarkers for diagnosis and integrated analysis of neuropsychiatric systemic lupus erythematosus: based on antibody profiling. Arthritis Res Ther 2023; 25: 165. doi:10.1186/s13075-023-03146-z

¹²⁵ Raghunath S, Glikmann-Johnston Y, Vincent FB, et al. Patterns and prevalence of cognitive dysfunction in systemic lupus erythematosus. J Int Neuropsychol Soc 2023; 29: 421–30. doi:10.1017/S1355617722000418

¹²⁶ Barraclough M, Erdman L, Diaz-Martinez JP, et al. Systemic lupus erythematosus phenotypes formed from machine learning with a specific focus on cognitive impairment. Rheumatology (Oxford) 2023; 62: 3610–8. doi:10.1093/rheumatology/keac653

¹²⁷ Dong C, Yang N, Zhao R, et al. SVM-based model combining patients' reported outcomes and lymphocyte phenotypes of depression in systemic lupus erythematosus. Biomolecules 2023; 13: 723. doi:10.3390/biom13050723

¹²⁸ Gu X-X, Jin Y, Fu T, et al. Relevant characteristics analysis using natural language processing and machine learning based on phenotypes and T-cell subsets in systemic lupus erythematosus patients with anxiety. Front Psychiatry 2021; 12: 793505. doi:10.3389/fpsyt.2021.793505

¹²⁹ Tay SH, Stephenson MC, Allameen NA, et al. Combining multimodal magnetic resonance brain imaging and machine learning to unravel neurocognitive function in non-neuropsychiatric systemic lupus erythematosus. Rheumatology (Oxford) 2024; 63: 414–22. doi:10.1093/rheumatology/kead221

¹³⁰ Rumetshofer T, Inglese F, de Bresser J, et al. Tract-based white matter hyperintensity patterns in patients with systemic lupus erythematosus using an unsupervised machine learning approach. Sci Rep 2022; 12: 21376. doi:10.1038/s41598-022-25990-w

¹³¹ Ayoub I, Wolf BJ, Geng L, et al. Prediction models of treatment response in lupus nephritis. Kidney Int 2022; 101: 379–89. doi:10.1016/j.kint.2021.11.014

¹³² Chen Y, Huang S, Chen T, et al. Machine learning for prediction and risk stratification of lupus nephritis renal flare. Am J Nephrol 2021; 52: 152–60. doi:10.1159/000513566

¹³³ Qin X, Xia L, Zhu C, et al. Noninvasive evaluation of lupus nephritis activity using a radiomics machine learning model based on ultrasound. J Inflamm Res 2023; 16: 433–41. doi:10.2147/JIR.S398399

¹³⁴ Stojanowski J, Konieczny A, Rydzyńska K, et al. Artificial neural network - an effective tool for predicting the lupus nephritis outcome. BMC Nephrol 2022; 23: 381. doi:10.1186/s12882-022-02978-2

¹³⁵ Wang X, Fu S, Yu J, et al. Renal interferon-inducible protein 16 expression is associated with disease activity and prognosis in lupus nephritis. Arthritis Res Ther 2023; 25: 112. doi:10.1186/s13075-023-03094-8

¹³⁶ Tang C, Zhang S, Teymur A, et al. V-set immunoglobulin domain-containing protein 4 as a novel serum biomarker of lupus nephritis and renal pathology activity. Arthritis Rheumatol 2023; 75: 1573–85. doi:10.1002/art.42545

¹³⁷ Alves P, Bandaria J, Leavy MB, et al. Validation of a machine learning approach to estimate systemic lupus erythematosus disease activity index score categories and application in a real-world dataset. RMD Open 2021; 7: e001586. doi:10.1136/rmdopen-2021-001586

¹³⁸ Andreoletti G, Lanata CM, Trupin L, et al. Transcriptomic analysis of immune cells in a multi-ethnic cohort of systemic lupus erythematosus patients identifies ethnicity- and disease-specific expression signatures. Commun Biol 2021; 4: 488. doi:10.1038/s42003-021-02000-9

¹³⁹ Ceccarelli F, Olivieri G, Sortino A, et al. Comprehensive disease control in systemic lupus erythematosus. Semin Arthritis Rheum 2021; 51: 404–8. doi:10.1016/j.semarthrit.2021.02.005

¹⁴⁰ Han J, Zhou Z, Zhang R, et al. Fucosylation of anti-dsDNA IgG1 correlates with disease activity of treatment-naive systemic lupus erythematosus patients. EBioMedicine 2022; 77: 103883. doi:10.1016/j.ebiom.2022.103883

¹⁴¹ Jupe ER, Lushington GH, Purushothaman M, et al. Tracking of systemic lupus erythematosus (SLE) longitudinally using biosensor and patient-reported data: A report on the fully decentralized mobile study to measure and predict lupus disease activity using digital signals-the OASIS study. BioTech 2023; 12: 62. doi:10.3390/biotech12040062

¹⁴² Kan H, Nagar S, Patel J, et al. Longitudinal treatment patterns and associated outcomes in patients with newly diagnosed systemic lupus erythematosus. Clin Ther 2016; 38: 610–24. doi:10.1016/j.clinthera.2016.01.016

¹⁴³ Labonte AC, Kegerreis B, Geraci NS, et al. Identification of alterations in macrophage activation associated with disease activity in systemic lupus erythematosus. PLoS One 2018; 13: e0208132. doi:10.1371/journal.pone.0208132

¹⁴⁴ Maffi M, Tani C, Cascarano G, et al. “Which extra-renal flare is "difficult to treat" in systemic lupus erythematosus? A one-year longitudinal study comparing traditional and machine learning approaches”. Rheumatology (Oxford) 2024; 63: 376–84. doi:10.1093/rheumatology/kead166

¹⁴⁵ Rector I, Owen KA, Bachali P, et al. Differential regulation of the interferon response in systemic lupus erythematosus distinguishes patients of Asian ancestry. RMD Open 2023; 9: e003475. doi:10.1136/rmdopen-2023-003475

¹⁴⁶ Wang D-C, Xu W-D, Qin Z, et al. Systemic lupus erythematosus with high disease activity identification based on machine learning. Inflamm Res 2023; 72: 1909–18. doi:10.1007/s00011-023-01793-1

¹⁴⁷ Wang Y, Shu W, Lin S, et al. Hollow cobalt oxide/carbon hybrids aid metabolic encoding for active systemic lupus erythematosus during pregnancy. Small 2022; 18: e2106412. doi:10.1002/smll.202106412

¹⁴⁸ Hoi A, Nim HT, Koelmeyer R, et al. Algorithm for calculating high disease activity in SLE. Rheumatology (Oxford) 2021; 60: 4291–7. doi:10.1093/rheumatology/keab003

¹⁴⁹ Helget LN, Dillon DJ, Wolf B, et al. Development of a lupus nephritis suboptimal response prediction tool using renal histopathological and clinical laboratory variables at the time of diagnosis. Lupus Sci Med 2021; 8: e000489. doi:10.1136/lupus-2021-000489

¹⁵⁰ Maeda S, Hashimoto H, Maeda T, et al. High-dimensional analysis of T-cell profiling variations following Belimumab treatment in systemic lupus erythematosus. Lupus Sci Med 2023; 10: e000976. doi:10.1136/lupus-2023-000976

¹⁵¹ Wang DD, Li YF, Zhang C, et al. Predicting the effect of sirolimus on disease activity in patients with systemic lupus erythematosus using machine learning. J Clin Pharm Ther 2022; 47: 1845–50. doi:10.1111/jcpt.13778

¹⁵² Wolf BJ, Spainhour JC, Arthur JM, et al. Development of biomarker models to predict outcomes in lupus nephritis. Arthritis Rheumatol 2016; 68: 1955–63. doi:10.1002/art.39623

¹⁵³ Coelewij L, Waddington KE, Robinson GA, et al. Serum metabolomic signatures can predict subclinical atherosclerosis in patients with systemic lupus erythematosus. Arterioscler Thromb Vasc Biol 2021; 41: 1446–58. doi:10.1161/ATVBAHA.120.315321

¹⁵⁴ Liu C, Zhou Y, Zhou Y, et al. Identification of crucial genes for predicting the risk of atherosclerosis with system lupus erythematosus based on comprehensive bioinformatics analysis and machine learning. Comput Biol Med 2023; 152: 106388. doi:10.1016/j.compbiomed.2022.106388

¹⁵⁵ Luo Z, Lu G, Yang Q, et al. Identification of shared immune cells and immune-related co-disease genes in chronic heart failure and systemic lupus erythematosus based on transcriptome sequencing

¹⁵⁶ Matthiesen R, Lauber C, Sampaio JL, et al. Shotgun mass spectrometry-based lipid profiling identifies and distinguishes between chronic inflammatory diseases. EBioMedicine 2021; 70: 103504. doi:10.1016/j.ebiom.2021.103504

¹⁵⁷ Peng J, Dönnes P, Ardoin SP, et al. Atherosclerosis progression in the APPLE trial can be predicted in young people with juvenile-onset systemic lupus erythematosus using a novel lipid metabolomic signature. Arthritis Rheumatol 2023. doi:10.1002/art.42722

¹⁵⁸ Ravenell RL, Kamen DL, Fleury TJ, et al. Premature atherosclerosis is associated with hypovitaminosis D and angiotensin-converting enzyme inhibitor non-use in lupus patients. Am J Med Sci 2012; 344: 268–73. doi:10.1097/MAJ.0b013e31823fa7d9

¹⁵⁹ Hu Z, Wu L, Lin Z, et al. Prevalence and associated factors of electrocardiogram abnormalities in patients with systemic lupus erythematosus: A machine learning study. Arthritis Care Res (Hoboken) 2022; 74: 1640–8. doi:10.1002/acr.24612

¹⁶⁰ Deng Y, Zhou Y, Shi J, et al. Potential genetic biomarkers predict adverse pregnancy outcome during early and mid-pregnancy in women with systemic lupus erythematosus. Front Endocrinol 2022; 13: 957010. doi:10.3389/fendo.2022.957010

¹⁶¹ Fazzari MJ, Guerra MM, Salmon J, et al. Adverse pregnancy outcomes in women with systemic lupus erythematosus: can we improve predictions with machine learning? Lupus Sci Med 2022; 9: e000769. doi:10.1136/lupus-2022-000769

¹⁶² Hao X, Zheng D, Khan M, et al. Machine learning models for predicting adverse pregnancy outcomes in pregnant women with systemic lupus erythematosus. Diagnostics (Basel) 2023; 13: 612. doi:10.3390/diagnostics13040612

¹⁶³ Tang H, Poynton MR, Hurdle JF, et al. Predicting three-year kidney graft survival in recipients with systemic lupus erythematosus. ASAIO J 2011; 57: 300–9. doi:10.1097/MAT.0b013e318222db30

¹⁶⁴ Ceccarelli F, Sciandrone M, Perricone C, et al. Biomarkers of erosive arthritis in systemic lupus erythematosus: application of machine learning models. PLoS One 2018; 13: e0207926. doi:10.1371/journal.pone.0207926

¹⁶⁵ Liang X, Peng Z, Lin Z, et al. Identification of prognostic genes for breast cancer related to systemic lupus erythematosus by integrated analysis and machine learning. Immunobiology 2023; 228. doi:10.1016/j.imbio.2023.152730

¹⁶⁶ Huang T, Liu S, Huang J, et al. Prediction and associated factors of hypothyroidism in systemic lupus erythematosus: a cross-sectional study based on multiple machine learning algorithms. Curr Med Res Opin 2022; 38: 229–35. doi:10.1080/03007995.2021.2015156

¹⁶⁷ Wang DC, Tang YY, He CS, et al. Exploring machine learning methods for predicting systemic lupus erythematosus with herpes. Int J Rheum Dis 2023; 26: 2047–54. doi:10.1111/1756-185X.14869

¹⁶⁸ Robinson GA, Peng J, Pineda-Torra I, et al. Metabolomics defines complex patterns of dyslipidaemia in juvenile-SLE patients associated with inflammation and potential cardiovascular disease risk. Metabolites 2021; 12: 3. doi:10.3390/metabo12010003

¹⁶⁹ Grovu R, Huo Y, Nguyen A, et al. Machine learning: predicting hospital length of stay in patients admitted for lupus flares. Lupus 2023; 32: 1418–29. doi:10.1177/09612033231206830

¹⁷⁰ Beam AL, Drazen JM, Kohane IS, et al. Artificial intelligence in medicine. N Engl J Med 2023; 388: 1220. doi:10.1056/NEJMe2206291

¹⁷¹ Kingsmore KM, Lipsky PE. Recent advances in the use of machine learning and artificial intelligence to improve diagnosis, predict flares, and enrich clinical trials in lupus. Curr Opin Rheumatol 2022; 34: 374–81. doi:10.1097/BOR.0000000000000902

¹⁷² Chen IY, Pierson E, Rose S, et al. Ethical machine learning in healthcare. Annu Rev Biomed Data Sci 2021; 4: 123–44. doi:10.1146/annurev-biodatasci-092820-114757

¹⁷³ Kaye J. The tension between data sharing and the protection of privacy in genomics research. Annu Rev Genomics Hum Genet 2012; 13: 415–31. doi:10.1146/annurev-genom-082410-101454

¹⁷⁴ Gossec L, Kedra J, Servy H, et al. EULAR points to consider for the use of big data in rheumatic and musculoskeletal diseases. Ann Rheum Dis 2020; 79: 69–76. doi:10.1136/annrheumdis-2019-215694

¹⁷⁵ Buie J, McMillan E, Kirby J, et al. Disparities in lupus and the role of social determinants of health: Current state of knowledge and directions for future research. ACR Open Rheumatol 2023; 5: 454–64. doi:10.1002/acr2.11590

¹⁷⁶ Scofield RH, Sharma R, Aberle T, et al. Impact of race and ethnicity on family participation in systemic lupus erythematosus genetic studies. Front Lupus 2023; 1: 1100534. doi:10.3389/flupu.2023.1100534

¹⁷⁷ Sheikh SZ, Wanty NI, Stephens J, et al. The state of lupus clinical trials: minority participation needed. J Clin Med 2019; 8: 1245. doi:10.3390/jcm8081245

© 2024 Author(s) (or their employer(s)) 2024. Re-use permitted under CC BY-NC. No commercial re-use. See rights and permissions. Published by BMJ. http://creativecommons.org/licenses/by-nc/4.0/ This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/ . Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Artificial intelligence and machine learning applications are emerging as transformative technologies in medicine. With greater access to a diverse range of big datasets, researchers are turning to these powerful techniques for data analysis. Machine learning can reveal patterns and interactions between variables in large and complex datasets more accurately and efficiently than traditional statistical methods. Machine learning approaches open new possibilities for studying SLE, a multifactorial, highly heterogeneous and complex disease. Here, we discuss how machine learning methods are rapidly being integrated into the field of SLE research. Recent reports have focused on building prediction models and/or identifying novel biomarkers using both supervised and unsupervised techniques for understanding disease pathogenesis, early diagnosis and prognosis of disease. In this review, we will provide an overview of machine learning techniques to discuss current gaps, challenges and opportunities for SLE studies. External validation of most prediction models is still needed before clinical adoption. Utilisation of deep learning models, access to alternative sources of health data and increased awareness of the ethics, governance and regulations surrounding the use of artificial intelligence in medicine will help propel this exciting field forward.

Details

Title

Systemic lupus in the era of machine learning medicine

Author

Zhan, Kevin¹; Buhler, Katherine A¹; Chen, Irene Y²; Fritzler, Marvin J¹; Choi, May Y³

¹ University of Calgary Cumming School of Medicine, Calgary, Alberta, Canada
² Computational Precision Health, University of California Berkeley and University of California San Francisco, Berkeley, California, USA; Electrical Engineering and Computer Science, University of California Berkeley, Berkeley, California, USA
³ University of Calgary Cumming School of Medicine, Calgary, Alberta, Canada; McCaig Institute for Bone and Joint Health, Calgary, Alberta, Canada

First page

e001140

Section

Review

Publication year

2024

Publication date

2024

Publisher

BMJ Publishing Group LTD

e-ISSN

20538790

Source type

Scholarly Journal

Language of publication

English

DOI

https://doi.org/10.1136/lupus-2023-001140

ProQuest document ID

2937127047
