Abstract

Background

Checkpoint inhibitor pneumonitis (CIP) is among the most lethal immune-related adverse events in patients with cancer receiving immunotherapy. This study aims to characterize the lung microbiome in patients with CIP and evaluate its diagnostic potential.

Methods

In a prospective clinical trial (NCT06192303), bronchoalveolar lavage fluid (BALF) samples were obtained from 38 patients presenting clinical symptoms and radiographic evidence of pneumonitis following immunotherapy. The cohort included 14 cases of pure-type CIP (PT-CIP), 14 cases of mixed-type CIP, and 10 cases of pulmonary infection (PI). Metagenomic next-generation sequencing (mNGS) of BALF was employed to delineate the lung microbiota profiles. Using linear discriminant analysis effect size, we identified characteristic microbiota among the three groups and further explored the associations of signature microbiota with host immune-inflammatory markers. Functional enrichment analysis revealed potential metabolic reprogramming and differences in biological functions between patients with CIP and PI. Finally, using four machine-learning models, we assessed the clinical value of BALF microbiota profiles in diagnosing CIP.

Results

The composition of lung microbiota differed significantly between patients with CIP and PI. Microbial taxa, such as Porphyromonas, Candida, Peptostreptococcus, Treponema, and Talaromyces, exhibited distinct abundance patterns across the three groups. Correlation analysis revealed a significant positive relationship between Candida abundance and host immune-inflammatory markers, such as neutrophil-lymphocyte ratio, platelet-lymphocyte ratio, monocyte-lymphocyte ratio, and systemic immune inflammation index. In contrast, Porphyromonas demonstrated a significant negative correlation. Compared with the patients with PT-CIP, the lung microbiota of patients with PI exhibited a more diverse biological and metabolic profile. Additionally, machine learning models based on BALF microbiota profiles could accurately diagnose CIP, with the decision tree model showing the best diagnostic performance (area under the curve: 0.88).

Conclusions

Our study provides the first characterization of lung microbiota profiles across distinct CIP subtypes and establishes a decision tree-based diagnostic model for CIP. These findings emphasize the value of BALF mNGS in improving the diagnosis of CIP.

Correspondence to Laiyu Liu, Xiaofang Su, and Yongzhong Zhan

WHAT IS ALREADY KNOWN ON THIS TOPIC

  • Checkpoint inhibitor pneumonitis (CIP) is a severe immune-related adverse event with high morbidity and mortality in patients with cancer receiving immune checkpoint inhibitors. Current diagnostic methods rely on clinical and imaging evaluations, which lack specificity due to overlapping features with infections or other pneumonitis. Lung microbiota dysbiosis has been implicated in immune-mediated pulmonary diseases, but its role in CIP remains unexplored.

WHAT THIS STUDY ADDS

  • This is the first study to characterize lung microbiota profiles in CIP subtypes (pure-type and mixed-type) and pulmonary infection (PI) using metagenomic next-generation sequencing (mNGS). Key findings include distinct microbial signatures (e.g., Porphyromonas, Candida) in CIP, correlations between specific taxa (e.g., Candida abundance with systemic immune-inflammatory markers), and metabolic pathway differences between CIP and PI. A machine learning model (decision tree) based on bronchoalveolar lavage fluid (BALF) microbiota achieved high diagnostic accuracy (area under the curve: 0.88), highlighting its clinical potential.

HOW THIS STUDY MIGHT AFFECT RESEARCH, PRACTICE OR POLICY

  • The identification of CIP-specific microbiota and their immune-metabolic interactions opens new avenues for mechanistic research into CIP pathogenesis. Clinically, BALF mNGS combined with machine learning could enhance diagnostic precision, reducing reliance on invasive procedures. These findings may guide policies advocating for integrating microbiota profiling into diagnostic workflows for immune-related adverse events, ultimately improving early intervention and personalized management of patients with CIP.

Introduction

Checkpoint inhibitor pneumonitis (CIP), a life-threatening immune-related adverse event, has emerged as a critical concern in cancer immunotherapy due to its significant morbidity and mortality in patients receiving immune checkpoint inhibitors (ICIs).1 2 Excessive immune activation following ICI treatment is considered a key driver of CIP. Notably, the incidence of CIP can reach up to 13% in patients with non-small cell lung cancer treated with ICIs, underscoring the urgent need for vigilant monitoring and tailored management strategies.3 4 Moreover, the clinical presentations of CIP are highly heterogeneous, ranging from mild respiratory symptoms to severe conditions requiring hospitalization and prompt intervention. Therefore, early identification of CIP is paramount, as timely treatment can significantly reduce adverse outcomes.

However, early recognition of CIP remains challenging due to the limitations of current diagnostic practices. Specifically, traditional reliance on clinical evaluations and imaging modalities, such as high-resolution computed tomography, frequently leads to diagnostic ambiguities due to overlapping symptoms and radiologic characteristics with other pulmonary disorders.5 6 Recent studies have highlighted the potential of machine learning methods in improving diagnostic specificity. A multicenter study demonstrated that radiomic machine learning models significantly outperformed radiologists in distinguishing CIP from other pneumonitis, achieving area under the curve (AUC) values of 0.92.7 This suggests that integrating machine learning into clinical practice could facilitate more accurate and timely diagnosis.

Beyond host mechanisms, emerging evidence has highlighted the role of lung microbiota in modulating immune responses and influencing disease outcomes in various pulmonary conditions.8 9 The lung microbiome represents a complex ecosystem that can profoundly impact local and systemic immune responses. Previous research has identified distinct microbial communities in different pulmonary diseases, suggesting that specific microbial profiles may correlate with disease states and patient outcomes.1 10 Despite these advances, the microbial ecology of CIP remains unknown. Investigating the microbiota of CIP could provide novel insights into its pathogenesis, potentially paving the way for the development of microbiota-based biomarkers.

In this study, we collected bronchoalveolar lavage fluid (BALF) samples from 38 patients diagnosed with CIP subtypes or pulmonary infection (PI), and used metagenomic next-generation sequencing (mNGS) for profiling. This approach enables comprehensive profiling of the lung microbiota and provides functional insights into the lung environment of CIP. By identifying potential microbial biomarkers, we aim to distinguish CIP from other pulmonary conditions and characterize the microbial signatures associated with different CIP subtypes. Subsequently, we sought to explore the functional implications of these microbial profiles on immune responses. Furthermore, we investigated the diagnostic value of BALF microbiota profiles in patients with CIP using multiple machine-learning models. Through this integrative approach, we hope to contribute to the evolving understanding of CIP pathogenesis and improve clinical management strategies for patients with CIP.

Methods

Participant enrollment and study design

Between January 2018 and December 2023, a total of 103 patients with cancer highly suspected of CIP at Nanfang Hospital, Southern Medical University were recruited into a prospective clinical trial (NCT06192303). None of the participants had received corticosteroids or antibiotic treatment at the time of enrollment. Among the 103 patients with cancer initially suspected of CIP, 61 underwent bronchoalveolar lavage (BAL) and mNGS. To ensure data quality, all BALF samples were subjected to a stringent multi-step quality control (QC) process covering sample collection, DNA extraction, library preparation, and sequencing. After QC, 23 samples were excluded due to insufficient DNA yield, contamination, or inadequate sequencing depth, leaving 38 high-quality datasets for final microbiome analysis (online supplemental figure S1). The final 38 patients included 14 cases of pure-type CIP (PT-CIP), 14 cases of mixed-type CIP (MT-CIP), and 10 cases of PI.

Comprehensive analysis of the lung microbiome revealed the heterogeneity in lung microbial composition among different patient groups and identified characteristic microbial communities. Functional enrichment analysis further elucidated the biological functions and metabolic reprogramming of these characteristic microbial communities. The Mantel test revealed the significant correlation between microbiome and host immune and inflammatory markers. Finally, we employed multiple machine learning algorithms to construct CIP diagnostic models based on the BALF microbiome and thoroughly evaluated the performance of the models.

This study complies with the Declaration of Helsinki. Informed consent was obtained from all participants.

Diagnostic criteria of CIP subtypes and PI

Patients were classified into PT-CIP, MT-CIP, or PI using a stepwise diagnostic workflow that incorporated BALF mNGS results, treatment response, and multidisciplinary review. Patients with negative mNGS results (n=14) received corticosteroids and all responded favorably, confirming PT-CIP. Patients with positive mNGS results (n=24) were treated with antimicrobial therapy. Of these, 14 patients required additional corticosteroids due to inadequate response to antibiotics, and were classified as MT-CIP. The remaining 10 improved with antibiotics alone and were diagnosed with PI. On day 7, a multidisciplinary team (two pulmonologists and one radiologist) reviewed each case, integrating radiographic findings, treatment response, microbiological results, and inflammatory markers to confirm the final diagnosis. Detailed diagnostic criteria are described in the online supplemental methods.
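
The stepwise workflow above can be sketched as a small decision function. This is only an illustrative simplification: the `Case` fields and the "unresolved" label are hypothetical names introduced here, and the day-7 multidisciplinary review, which integrated imaging and inflammatory markers, is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Case:
    """Hypothetical summary of one patient's work-up (field names are assumptions)."""
    mngs_positive: bool               # BALF mNGS detected a pathogen
    improved_on_antimicrobials: bool  # adequate response to antimicrobials alone
    improved_on_steroids: bool        # favorable response to corticosteroids


def classify(case: Case) -> str:
    """Return PT-CIP, MT-CIP, or PI following the stepwise logic in the text."""
    if not case.mngs_positive:
        # mNGS-negative patients received corticosteroids; response confirmed PT-CIP.
        # "unresolved" is an assumed placeholder for cases the text does not describe.
        return "PT-CIP" if case.improved_on_steroids else "unresolved"
    if case.improved_on_antimicrobials:
        # mNGS-positive and improved with antimicrobial therapy alone.
        return "PI"
    # mNGS-positive but needed additional corticosteroids.
    return "MT-CIP"


if __name__ == "__main__":
    print(classify(Case(False, False, True)))  # PT-CIP
    print(classify(Case(True, True, False)))   # PI
    print(classify(Case(True, False, True)))   # MT-CIP
```

In the study the final label was confirmed by clinicians, so a function like this would at most pre-sort cases for review.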

Severity grading of CIP

CIP severity was classified according to clinical symptoms and radiological extent, adapted from the Common Terminology Criteria for Adverse Events V.5.0 and the European Society for Medical Oncology (ESMO) Clinical Practice Guidelines on the management of toxicities from immunotherapy:

  • Grade 1: asymptomatic; radiographic findings confined to one lung lobe or ≤25% of lung parenchyma.
  • Grade 2: new or worsening symptoms (e.g., cough, dyspnea, chest pain, fever, or mild hypoxia); radiologic involvement of >1 lobe or 25%–50% of lung parenchyma.
  • Grade 3: severe symptoms limiting daily activities, requiring hospitalization or oxygen supplementation; CT showing ≥50% parenchymal involvement.
  • Grade 4: life-threatening respiratory compromise (e.g., respiratory failure requiring urgent interventions such as intubation or mechanical ventilation); extensive bilateral lung involvement on imaging.

BALF sample collection and processing

In accordance with the established bronchoscopy protocol, BAL was performed at the target bronchus selected according to the patient’s CT findings. Sterile saline at 37°C was instilled into the bronchial tree twice, 20 to 30 mL each time, with an expected recovery rate of 30%–50%. A total of 20 mL of BALF was collected in a sterile container and sent to the Vision Medicals laboratory within 30 min for mNGS.

IDseq ultra comprehensive targeted pathogen capture metagenomic next-generation sequencing

Sample processing, nucleic acid extraction, DNA library preparation, high-throughput sequencing, bioinformatics analysis, and interpretation of mNGS data were conducted following the laboratory’s standard operating procedures. To detect pathogens as comprehensively as possible and to sequence DNA and RNA simultaneously, total DNA and RNA were extracted from all samples using the QIAamp UCP Pathogen DNA Kit (Qiagen) and the QIAamp Viral RNA Kit (Qiagen), respectively. Human DNA was eliminated using Benzonase (Qiagen) and Tween 20 (Sigma), and rRNA was removed using the Ribo-Zero rRNA Removal Kit (Illumina). Complementary DNA was synthesized through reverse transcription. DNA libraries were constructed using the Nextera XT DNA Library Preparation Kit (Illumina), and library quality was assessed using the Qubit dsDNA HS Assay Kit and the High Sensitivity DNA Kit (Agilent) on the Agilent 2100 Bioanalyzer. Sequencing was performed on the Illumina NextSeq CN500. High-quality data were obtained by using Trimmomatic to eliminate low-quality reads, adapter contamination, duplicate reads, and reads shorter than 50 bp. Low-complexity reads were removed using Kcomplexity with default parameters. Reads were mapped to the human reference genome (hg38) using the Burrows-Wheeler Aligner software to identify and exclude human sequence data. The final database comprised approximately 13,000 genomes. Microbial reads were aligned to this database using SNAP V.1.0 beta.18.

Profiling analysis of BALF microbiota

The microbiome metagenomic sequencing data were comprehensively analyzed using MicrobiomeAnalyst V.2.0 (https://www.microbiomeanalyst.ca).11 Following an initial data integrity check, the raw data was processed without applying low-count/low-variance filtering or data normalization. The analysis workflow included alpha-diversity assessment and beta-diversity analysis, where group differences were tested by permutational multivariate analysis of variance (PERMANOVA). Furthermore, the core microbiome was identified, and linear discriminant analysis effect size (LEfSe) was employed to detect differentially abundant taxa. Detailed parameters for all analyses are provided in the online supplemental material.
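
For readers unfamiliar with the diversity measures named above, the following sketch shows how three alpha-diversity indices and the Bray-Curtis dissimilarity can be computed from raw counts. It is not the MicrobiomeAnalyst implementation: the natural logarithm for Shannon and the 1 − Σp² form of Simpson are assumptions, and the toy count vectors are invented for illustration.

```python
import math


def _proportions(counts):
    """Convert a vector of taxon counts to relative abundances."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample has no reads")
    return [c / total for c in counts]


def observed(counts):
    """Observed richness: number of taxa with at least one read."""
    return sum(1 for c in counts if c > 0)


def shannon(counts):
    """Shannon index H = -sum(p * ln p), using the natural log (assumed)."""
    return -sum(p * math.log(p) for p in _proportions(counts) if p > 0)


def simpson(counts):
    """Simpson diversity in the 1 - sum(p^2) form (assumed)."""
    return 1.0 - sum(p * p for p in _proportions(counts))


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two count vectors over the same taxa."""
    if len(a) != len(b):
        raise ValueError("samples must cover the same taxa in the same order")
    denominator = sum(a) + sum(b)
    if denominator == 0:
        raise ValueError("both samples are empty")
    return sum(abs(x - y) for x, y in zip(a, b)) / denominator


if __name__ == "__main__":
    sample_a = [120, 30, 0, 50]   # toy counts, not study data
    sample_b = [10, 80, 40, 70]
    print(observed(sample_a), round(shannon(sample_a), 3), round(simpson(sample_a), 3))
    print(round(bray_curtis(sample_a, sample_b), 3))
```

Alpha diversity summarizes a single sample, whereas Bray-Curtis compares two samples; the pairwise Bray-Curtis matrix is the usual input to ordination and PERMANOVA.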

Functional analysis of BALF microbiota

We employed the metagenomic phylogenetic analysis tool, MetaPhlAn V.3.0.6, to classify metagenomic sequencing reads and generate relative abundances in each sample.12 13 Metagenomic functional analysis was conducted using the HMP Unified Metabolic Analysis Network (HUMAnN) V.3.0 tool.14 In essence, a sample-specific pan-genome database was constructed from the microbial species identified in each sample’s classification analysis, and reads were mapped against it. Unmapped reads were then aligned against the comprehensive protein database UniRef90 using translated search.15 These alignments were handled in a species-specific manner, weighted by quality and sequence length to estimate gene family abundance. Finally, gene families annotated to metabolic reactions were further analyzed to reconstruct and quantify metabolic pathways in each sample based on MetaCyc.16

To investigate microbial contributions to CIP pathogenesis, we employed a refined analytical approach to mitigate biases inherent in traditional enrichment analyses due to disparities in microbial load. We first applied EdgeR to identify taxa exhibiting significant differences between patients with PT-CIP and PI (p<0.05), with particular attention to taxa enriched in CIP. These CIP-enriched taxa were then subjected to Taxon Set Enrichment Analysis to predict associated metabolite sets. Finally, the predicted metabolites underwent comprehensive pathway enrichment analysis using the Kyoto Encyclopedia of Genes and Genomes (KEGG), the Small Molecule Pathway Database (SMPDB), and the Relational Database of Metabolomic Pathways (RaMP-DB).

Pathway enrichment score analysis based on RNA sequencing data from a CIP mouse model

RNA-sequencing data were retrieved from the Gene Expression Omnibus under accession number GSE184000. This dataset originates from a previously published study of a CIP mouse model, in which aged mice were treated with a programmed cell death protein-1 (PD-1) monoclonal antibody to induce the phenotype.17 18 Enrichment scores for predefined metabolic gene sets were calculated from gene expression values, expressed as log-counts per million, using the Pathway Level Analysis of Gene Expression algorithm in the GSVA package in R. The predefined gene sets were obtained from the Gene Ontology (GO) and SMPDB databases. Welch’s t-test was applied to data conforming to normal distribution, while the Wilcoxon rank-sum test was used for data that did not meet the normality assumption.

Construction and validation of CIP diagnostic model based on BALF microbiota and machine learning

The patient cohort was randomly segmented, with 70% designated for the training set to facilitate model development and the remaining 30% assigned for testing. A variety of traditional machine learning algorithms, including decision trees, logistic regression, support vector machines, and K-nearest neighbors, were employed to construct the models. Model complexity was further constrained by conservative hyperparameters (linear kernel for support vector machine, maximum depth=3 for decision tree, k=3 for K-nearest neighbor), encouraging generalizable rather than memorized patterns.

Following training, the models were assessed using the testing set. Receiver operating characteristic (ROC) curves and the AUC were used to evaluate and compare the classification performance of the different models. An AUC value exceeding 0.80 was considered indicative of a model’s adequate capability to diagnose CIP. Additional metrics, such as accuracy, F1-score, positive predictive value (PPV), negative predictive value (NPV), sensitivity, specificity, and the Brier score, were also used to gauge the reliability of the models. The Brier score, which ranges from 0 to 1, measures the divergence between predicted and actual risks, with values closer to 0 signifying superior calibration. Furthermore, the calibration curve was employed to evaluate the agreement between predicted probabilities and actual outcomes. To assess the model’s net benefit, decision curve analysis (DCA) was conducted across various threshold values. The optimal diagnostic model was chosen based on the performance of these evaluation metrics on the testing set.
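
As a reminder of what these metrics measure, the sketch below computes them from first principles for a binary CIP versus non-CIP task. It is a plain-Python illustration rather than the study's Scikit-learn code; the function names, the 0.5 probability threshold, and the toy labels are assumptions.

```python
def _ratio(numerator, denominator):
    """Safe division that returns NaN when a metric is undefined."""
    return numerator / denominator if denominator else float("nan")


def auc(y_true, y_prob):
    """AUC as the probability that a positive case outranks a negative one (ties count 0.5)."""
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return _ratio(wins, len(pos) * len(neg))


def brier(y_true, y_prob):
    """Brier score: mean squared difference between predicted probability and outcome."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)


def threshold_metrics(y_true, y_prob, threshold=0.5):
    """Confusion-matrix metrics at a fixed probability threshold (0.5 is an assumption)."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": _ratio(tp + tn, len(y_true)),
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
        "f1": _ratio(2 * tp, 2 * tp + fp + fn),
    }


if __name__ == "__main__":
    labels = [1, 1, 1, 0, 0]            # toy labels: 1 = CIP, 0 = PI
    probs = [0.9, 0.8, 0.4, 0.6, 0.1]   # toy predicted probabilities
    print(round(auc(labels, probs), 3), round(brier(labels, probs), 3))
    print(threshold_metrics(labels, probs))
```

Note that the AUC and the Brier score use the predicted probabilities directly, whereas sensitivity, specificity, PPV, NPV, and F1 depend on the chosen threshold.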

Feature selection and model interpretation

In this study, we used the recursive feature elimination (RFE) technique to select features, intending to improve both the precision and consistency of the model. RFE is a widely employed feature selection approach to identify the most informative subset of features.19

To clarify the mechanism underlying the machine learning model, we applied SHapley Additive exPlanations (SHAP) values.20 SHAP is grounded in Shapley values, which stem from cooperative game theory, and assigns a contribution value to each feature based on its influence on the model output when considered in conjunction with other features. By leveraging SHAP values, we were able to explain individual model outputs by attributing the outcomes to specific features. The aggregate of all SHAP values provided a thorough analysis of the model’s behavior, thereby addressing the “black box” dilemma.
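
The idea behind Shapley values can be illustrated with a brute-force calculation on a toy model. This is not the SHAP library, which relies on faster model-specific or sampling-based estimators; the `toy_model` function, the all-zero baseline, and the convention of replacing "absent" features by baseline values are assumptions made for illustration.

```python
from itertools import permutations


def shapley_values(model, x, baseline):
    """Exact Shapley values by averaging marginal contributions over all feature orderings.

    Features not yet "added" take their baseline value. The cost grows
    factorially with the number of features, so this suits toy examples only.
    """
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous_output = model(current)
        for i in order:
            current[i] = x[i]                     # add feature i to the coalition
            new_output = model(current)
            phi[i] += new_output - previous_output
            previous_output = new_output
    return [value / len(orderings) for value in phi]


def toy_model(features):
    """Hypothetical model: one additive term plus one interaction term."""
    return 2 * features[0] + features[1] * features[2]


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0]
    baseline = [0.0, 0.0, 0.0]
    phi = shapley_values(toy_model, x, baseline)
    print(phi)  # [2.0, 3.0, 3.0]
    # Efficiency property: contributions sum to f(x) - f(baseline).
    print(sum(phi) == toy_model(x) - toy_model(baseline))  # True
```

The additive feature receives exactly its own effect, while the interaction effect of 6 is split equally between the two interacting features, which is the behavior that makes Shapley-based attributions attractive for interpreting tree models.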

Statistical analysis

Statistical analysis was conducted using SPSS (V.26) and GraphPad Prism (V.9). For non-normally distributed continuous data, the Mann-Whitney U test and the Kruskal-Wallis test were employed. Categorical data were analyzed using the χ2 test or Fisher’s exact test. All tests were two-tailed, with p<0.05 considered statistically significant. Preprocessing, building, assessing, interpreting, and visualizing the machine learning models were carried out in Python V.3.10. The feature selection, model construction, and decision tree visualization were implemented using the Scikit-learn V.1.3.0 library. The model interpretation was performed using the SHAP V.0.44.1 library.

Results

Participant characteristics

A total of 38 BALF samples were collected as part of a prospective clinical trial (NCT06192303), from 14 patients with PT-CIP, 14 with MT-CIP, and 10 with PI. Details of the study design are provided in figure 1. The PI group included two cases of bacterial infection, one case of fungal infection, and seven cases of mixed bacterial-fungal co-infection (online supplemental tables 1–3, figure S2). The average ages of the PT-CIP, MT-CIP, and PI groups were 54.5, 62.9, and 56.7 years, respectively. Apart from smoking history, no significant differences were observed among the three groups in terms of tumor stage, history of chronic obstructive pulmonary disease, history of diabetes, history of thoracic radiotherapy, cancer therapy (monotherapy or combined therapy), and the grading of CIP (all p>0.05). Regarding host blood immune and inflammation indicators, C-reactive protein (CRP), procalcitonin (ProCT), white blood cell (WBC), and absolute eosinophil count (AEC) showed no significant differences among the three groups (all p>0.05). However, absolute lymphocyte count (ALC), absolute neutrophil count (ANC), neutrophil-lymphocyte ratio (NLR), platelet-lymphocyte ratio (PLR), monocyte-lymphocyte ratio (MLR), and Systemic Immune Inflammation Index (SII) exhibited significant differences (all p<0.05). Notably, the MT-CIP group had the lowest average ALC level (0.675×10⁹/L) and the highest average levels of NLR, MLR, PLR, and SII, suggesting a heightened systemic inflammatory response and potential immune dysregulation in this subgroup (online supplemental table 1).
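
The composite indices reported above are simple ratios of routine blood counts. The sketch below uses their conventional definitions (NLR = neutrophils/lymphocytes, PLR = platelets/lymphocytes, MLR = monocytes/lymphocytes, SII = platelets × neutrophils/lymphocytes); it is assumed, not stated in the text, that the study computed them this way. Apart from the ALC of 0.675, which is the MT-CIP group mean quoted above, the input values are hypothetical.

```python
def inflammatory_indices(neutrophils, lymphocytes, monocytes, platelets):
    """Compute NLR, PLR, MLR, and SII from absolute counts (all in 10^9 cells/L)."""
    if lymphocytes <= 0:
        raise ValueError("lymphocyte count must be positive")
    return {
        "NLR": neutrophils / lymphocytes,
        "PLR": platelets / lymphocytes,
        "MLR": monocytes / lymphocytes,
        "SII": platelets * neutrophils / lymphocytes,
    }


if __name__ == "__main__":
    # ALC 0.675 is the MT-CIP mean from the text; the other counts are invented.
    indices = inflammatory_indices(
        neutrophils=4.0, lymphocytes=0.675, monocytes=0.5, platelets=250.0
    )
    for name, value in indices.items():
        print(name, round(value, 2))
```

Because lymphocytes sit in the denominator of all four indices, a low ALC alone is enough to raise every one of them, which is consistent with the MT-CIP group showing both the lowest ALC and the highest index values.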

Figure 1. Study workflow. All samples were sequenced, revealing lung microbial diversity and characteristic communities. Functional and relevance analysis showed biological functions and correlations with immune markers. Multiple machine-learning models were used to construct the CIP diagnostic model based on the BALF microbiome. BALF, bronchoalveolar lavage fluid; CIP, checkpoint inhibitor pneumonitis; DT, decision tree; GO, Gene Ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes; KNN, K-nearest neighbors; LR, logistic regression; MT-CIP, mixed-type CIP; PI, pulmonary infection; PT-CIP, pure-type CIP; SVM, support vector machine.

Composition of lung microbiome among patients with PT-CIP, MT-CIP, and PI

Profiling analysis revealed significant heterogeneity in the top 10 most abundant genera among the three groups, with each patient exhibiting substantial diversity in individual microbial compositions (figure 2A). At the genus level, the PT-CIP group was predominantly characterized by Streptococcus, Porphyromonas, Rothia, and Prevotella, while the MT-CIP group was dominated by Candida, Prevotella, Rothia, and Streptococcus. In the PI group, Streptococcus, Rothia, Prevotella, and Gemella were the most abundant genera. We further stratified patients with PI into mixed infection (PI_mixed, n=7) and non-mixed infection subgroups (PI_non_mixed, n=3; combining bacterial-only and fungal-only). Analyses of microbiota abundance, α-diversity, and β-diversity showed no significant differences between the two subgroups (all p>0.05), and community structures did not exhibit clear separation (online supplemental figure S3).

Figure 2. Composition of the lung microbiome in patients with PT-CIP, MT-CIP, and PI. (A) The genus level composition diagrams showed the composition characteristics of the lung microbiome of each patient. The top 10 genus (B) and species level (C) composition diagrams of three groups. Dendrogram analysis of genus level (D) and species level (E) of 38 samples. Distance was measured by Bray-Curtis index. The clustering algorithm was performed by Ward. CIP, checkpoint inhibitor pneumonitis; MT-CIP, mixed-type CIP; PI, pulmonary infection; PT-CIP, pure-type CIP.

Notably, prominent species in the MT-CIP group included Pneumocystis jirovecii, Candida tropicalis, and Streptococcus intermedius, while the PT-CIP and PI groups were dominated by bacterial species, suggesting a heightened risk of fungal infection in patients with MT-CIP (figure 2B,C). Furthermore, dendrogram analysis was performed using the Bray-Curtis index and Ward’s linkage clustering algorithm at both the genus and species levels. This analysis revealed varying degrees of heterogeneity among individuals within the same group, highlighting the complexity of the lung microbiome across different patients with CIP (figure 2D,E).

Lung microbial diversity analysis

We compared the lung microbiota diversity and community structure among the PT-CIP, MT-CIP, and PI groups. The α-diversity was assessed using the Abundance-based Coverage Estimator (ACE) index, Simpson index, Chao1 index, Shannon index, and Observed index (figure 3A–E), all of which revealed no significant differences among the three groups (p>0.05, analysis of variance). Furthermore, we assessed the β-diversity using multiple dimensionality reduction methods, including principal coordinates analysis, non-metric multidimensional scaling, and principal component analysis. Consistently, these analyses demonstrated significant differences in β-diversity among the three groups (p<0.05, PERMANOVA), whereas no significant differences were observed between the PT-CIP and MT-CIP groups (figure 3F, online supplemental figure S4A–F). These results suggest distinct microbial community structures in patients with CIP and PI, while the microbiota composition between patients with PT-CIP and MT-CIP remains largely similar.

Figure 3. Alpha and beta diversity of lung microbiome among three groups. The alpha diversity was calculated by ACE index (A), Simpson index (B), Chao1 index (C), Shannon index (D), and Observed index (E), which showed no significance among the three groups (p>0.05, ANOVA). (F) The beta diversity of microbiota among the three groups was assessed by PCoA (p<0.05, PERMANOVA). ACE, abundance-based coverage estimator; ANOVA, analysis of variance; CIP, checkpoint inhibitor pneumonitis; MT-CIP, mixed-type CIP; PCoA, principal coordinates analysis; PERMANOVA, permutational multivariate analysis of variance; PI, pulmonary infection; PT-CIP, pure-type CIP.

Identification of core microbiome and signature microbiome

The core microbiome, which refers to the specific set of microbial taxa consistently detected in a significant fraction of the population above a certain abundance threshold, plays a crucial role in understanding the overall health and ecological balance of microbial communities. We set a relative abundance of 0.01% as the detection threshold to identify the core microbiome in each group. Streptococcus, Prevotella, Rothia, and Porphyromonas constituted the core microbiome in the PT-CIP group (figure 4A). Prevotella, Streptococcus, and Rothia were identified as the core microbiome in the MT-CIP group (figure 4B). In the PI group, the core microbiome mainly included Talaromyces, Streptococcus, Prevotella, and Porphyromonas (figure 4C). These findings suggest that the core microbiota of the three groups exhibit overall similarity, with subtle differences.
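
A core-microbiome call of this kind combines a detection threshold with a prevalence threshold. The sketch below uses the 0.01% relative-abundance detection threshold stated above; the 50% prevalence cut-off, the function name, and the toy counts are assumptions, since the prevalence setting is not given in this section.

```python
def core_taxa(samples, detection=1e-4, prevalence=0.5):
    """Return taxa whose relative abundance exceeds `detection` in at least
    a `prevalence` fraction of samples.

    samples: list of dicts mapping taxon name -> read count.
    detection: 1e-4 corresponds to 0.01% relative abundance (from the text).
    prevalence: 0.5 is an assumed cut-off, not taken from the study.
    """
    if not samples:
        return []
    detected_in = {}
    for counts in samples:
        total = sum(counts.values())
        if total == 0:
            continue  # an empty sample detects nothing
        for taxon, count in counts.items():
            if count / total > detection:
                detected_in[taxon] = detected_in.get(taxon, 0) + 1
    n_samples = len(samples)
    return sorted(t for t, k in detected_in.items() if k / n_samples >= prevalence)


if __name__ == "__main__":
    toy_group = [  # invented counts, not study data
        {"Streptococcus": 900, "Rothia": 100},
        {"Streptococcus": 500, "Prevotella": 500},
        {"Streptococcus": 100000, "Rare": 1},
    ]
    print(core_taxa(toy_group))                  # ['Streptococcus']
    print(core_taxa(toy_group, prevalence=0.3))  # ['Prevotella', 'Rothia', 'Streptococcus']
```

With groups of 10 to 14 patients, the resulting core set is sensitive to the prevalence cut-off, so small differences between groups should be read with that in mind.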

Figure 4. Identification of lung signature microbiota and correlation with host immune and inflammatory markers. The core microbiome composition of PT-CIP (A), MT-CIP (B), and PI (C) patients. The core microbiome refers to the set of taxa that are detected in a high fraction of the population above a given abundance threshold. The count data was transformed to compositional abundances. (D) LEfSe was employed to identify signature microbiota among the three groups. (E) Associations between lung signature microbiota and host immune and inflammatory markers. AEC, absolute eosinophil count; ALC, absolute lymphocyte count; ANC, absolute neutrophil count; CIP, checkpoint inhibitor pneumonitis; LDA score, Linear Discriminant Analysis score; LEfSe, linear discriminant analysis effect size; MLR, monocyte-lymphocyte ratio; MT-CIP, mixed-type CIP; NLR, neutrophil-lymphocyte ratio; PI, pulmonary infection; PLR, platelet-lymphocyte ratio; ProCT, procalcitonin; PT-CIP, pure-type CIP; SII, Systemic Immune Inflammation Index; WBC, white blood cell.

Beyond identifying the core microbiome, we applied the LEfSe algorithm to identify differential signature microbiomes. A total of 11 significant microbial features were identified with the given criteria (p value cut-off: 0.05, log Linear Discriminant Analysis score: 1). To further visualize differences in microbial composition, we plotted the relative abundances of all 11 signature genera identified by LEfSe across the three groups. These distributions reinforced the diagnostic relevance of the LEfSe-derived microbial signatures and highlighted the distinct microbial landscapes associated with immune-mediated versus infectious pulmonary events (figure 4D, online supplemental figure S5A–K).

Correlation analysis between signature microbiota and host immune-inflammatory markers

Given the critical role of microbiota in modulating immune-inflammatory responses, we applied the Mantel test to assess the relationship between signature microbiota and host immune-inflammatory markers (figure 4E, online supplemental figure S6, online supplemental table 4). The abundance of Porphyromonas and Solobacterium was significantly negatively correlated with CRP and ProCT levels (p<0.05, Mantel test). In contrast, none of the signature microbiota showed significant correlations with WBC or AEC (p>0.05, Mantel test). Interestingly, the abundance of Candida was negatively correlated with ALC but positively correlated with ANC (p<0.05, Mantel test), suggesting that Candida infection may primarily activate host immune responses through neutrophils rather than lymphocytes. Furthermore, Candida, Solobacterium, and Porphyromonas showed varying degrees of correlation with the NLR, MLR, PLR, and SII (p<0.05, Mantel test), indicating their potential involvement in systemic immune-inflammatory regulation. Taken together, these findings suggest that different microbial taxa exhibit distinct immune interaction patterns and may contribute to the immune-inflammatory mechanisms underlying CIP.
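For readers unfamiliar with the Mantel test, the following is a minimal sketch of a permutation-based version. It makes several assumptions that the text does not specify: Pearson correlation between the two distance matrices, absolute differences as the distance measure, a two-sided p value, and 999 permutations. All function names and the toy values are hypothetical and are not study data.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def upper_triangle(d, order):
    """Flatten the upper triangle of matrix d with rows/columns taken in `order`."""
    n = len(order)
    return [d[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def mantel(d1, d2, permutations=999, seed=0):
    """Return (r, p): observed matrix correlation and two-sided permutation p value."""
    rng = random.Random(seed)
    identity = list(range(len(d1)))
    x = upper_triangle(d1, identity)
    r_obs = pearson(x, upper_triangle(d2, identity))
    extreme = 0
    for _ in range(permutations):
        perm = identity[:]
        rng.shuffle(perm)  # permute rows and columns of d2 together
        if abs(pearson(x, upper_triangle(d2, perm))) >= abs(r_obs):
            extreme += 1
    return r_obs, (extreme + 1) / (permutations + 1)


def abs_diff_matrix(values):
    """Pairwise absolute-difference distance matrix (an assumed distance choice)."""
    return [[abs(a - b) for b in values] for a in values]


# Toy per-patient values (hypothetical): taxon abundance and an inflammatory marker.
abundance = [0.02, 0.10, 0.25, 0.40, 0.55, 0.70]
marker = [1.1, 2.0, 3.2, 4.1, 5.3, 6.0]
r, p = mantel(abs_diff_matrix(abundance), abs_diff_matrix(marker))
print(round(r, 3), p)
```

Because the test permutes whole samples rather than individual matrix entries, it respects the dependence among pairwise distances that share a patient.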

Functional analysis of characteristic microbiome in patients with CIP and PI

To refine our understanding of microbiota differences, we further compared the differential microbiota between each pair of groups. The fewest differential taxa were observed between the PT-CIP and MT-CIP groups, suggesting a high degree of microbiota similarity between these two CIP subgroups and making them difficult to distinguish based solely on microbiome composition (figure 5A,B).

Figure 5. Functional analysis of differential microbiome between the PT-CIP and the PI group. (A) Venn diagram of differential microbiome among the three groups. (B) The volcano plot revealed differential microbiome among PT-CIP, MT-CIP, and PI groups. The top five most significantly different microbial taxa were highlighted with circles (top five up and down). GO function analysis (C), KEGG orthology analysis (D), and MetaCyc pathway analysis (E) of differential microbiome between the PT-CIP and the PI group. CIP, checkpoint inhibitor pneumonitis; GO, Gene Ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes; LDA, Linear Discriminant Analysis; MT-CIP, mixed-type CIP; PI, pulmonary infection; PT-CIP, pure-type CIP; TCA, tricarboxylic acid.

To gain insight into the biological functional differences and metabolic reprogramming in the lung microbiota between patients with CIP and PI, we conducted comprehensive metagenomic-based functional annotation on 14 patients with PT-CIP and 10 patients with PI. Compared with patients with PT-CIP, patients with PI exhibited a more diverse array of biological functions and metabolic alterations in their lung microbiota. The GO analysis suggested significant enrichment of GO terms in patients with PI, such as transmembrane transport, plasma membrane, sequence-specific DNA binding, hydrolase activity, DNA-binding transcription factor activity, transmembrane transporter activity, and cell outer membrane (figure 5C). Similarly, the KEGG Orthology analysis further revealed metabolic reprogramming between patients with CIP and PI, with orthologs such as 4-diphosphocytidyl-2-C-methyl-D-erythritol kinase, asparagine-oxo-acid transaminase, outer membrane protein assembly factor BamE, and monofunctional glycosyltransferase showing significant activation in patients with PI (figure 5D). Lastly, MetaCyc pathway enrichment analysis highlighted significant differences in pathways such as adenosine ribonucleotides de novo biosynthesis, tricarboxylic acid (TCA) cycle I (prokaryotic), superpathway of heme b biosynthesis from glutamate, and tetrapyrrole biosynthesis II (from glycine) between patients with CIP and PI (figure 5E).

To elucidate the immunometabolic reprogramming potentially driven by microbiota in CIP, we performed functional analysis on metabolite sets associated with CIP-enriched microbiota. This analysis revealed activation of glutathione metabolism pathways (online supplemental figure S7A) and significant enrichment of the Warburg effect (online supplemental figure S7B). Additionally, we observed upregulation of protein metabolism pathways with marked increases in amino acid transport across plasma membranes via solute carrier (SLC) family transporters (online supplemental figure S7A,C). To corroborate our findings, we performed an additional validation analysis using publicly available RNA-sequencing data from a reported CIP mouse model built on aged mice treated with anti-PD-1 monoclonal antibody. Our re-analysis of this independent dataset confirmed the upregulation of the Warburg effect, glutathione metabolism, and protein translation pathways in the CIP mouse model (online supplemental figure S8).

Construction and validation of machine learning-based diagnostic model for CIP

Given the growing interest in microbiota-based diagnostics, we explored the potential of BALF microbiota profiling for CIP classification by constructing a diagnostic model using machine learning algorithms. Initially, we combined patients with PT-CIP and MT-CIP into the CIP group (n=28), with patients with PI serving as the control group (n=10). The cohort was randomly split into a training set (70%) and a testing set (30%). We adopted the RFE technique to identify key predictive features, systematically removing irrelevant variables to pinpoint the most informative feature subset.
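The RFE step described above can be illustrated with a minimal sketch. The text does not state which estimator was wrapped by RFE, so this example assumes a small logistic regression fitted by gradient descent and ranks features by absolute weight; it also assumes the features are standardized. The function names, feature names (`taxon_a` etc.), and toy data are hypothetical.

```python
import math


def fit_logistic(X, y, lr=0.5, epochs=200):
    """Fit logistic regression by batch gradient descent; return the feature weights."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, label in zip(X, y):
            z = b + sum(wi * xi for wi, xi in zip(w, row))
            err = 1.0 / (1.0 + math.exp(-z)) - label
            for j in range(d):
                grad_w[j] += err * row[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w


def rfe(X, y, names, n_keep):
    """Recursive feature elimination: refit, drop the smallest-|weight| feature, repeat."""
    active = list(range(len(names)))
    while len(active) > n_keep:
        sub = [[row[j] for j in active] for row in X]
        weights = fit_logistic(sub, y)
        weakest = min(range(len(active)), key=lambda k: abs(weights[k]))
        del active[weakest]
    return [names[j] for j in active]


# Toy standardized abundances (hypothetical): only taxon_a separates the classes.
X = [
    [1.0, 0.5, -0.2], [0.8, -0.5, 0.2], [1.2, 0.3, 0.1], [0.9, -0.3, -0.1],
    [-1.0, 0.5, 0.2], [-0.8, -0.5, -0.2], [-1.2, 0.3, -0.1], [-0.9, -0.3, 0.1],
]
y = [1, 1, 1, 1, 0, 0, 0, 0]
names = ["taxon_a", "taxon_b", "taxon_c"]
print(rfe(X, y, names, n_keep=1))
```

The key property of RFE, as opposed to a one-shot univariate filter, is that the model is refitted after each removal, so a feature's rank can change as correlated features are dropped.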

Subsequently, we employed several commonly used machine learning algorithms, including decision trees, logistic regression, support vector machines, and K-nearest neighbors, to build the diagnostic model. After training, the models were assessed on the testing set to evaluate their effectiveness and reliability. ROC analysis showed that the decision tree model exhibited the best discriminative performance in the testing set, achieving an AUC of 0.88 (figure 6A,B). Afterward, calibration curves were used to evaluate the agreement between predicted and observed outcomes. Similarly, the decision tree model exhibited the lowest Brier score, indicating minimal disparity between predicted risk and actual risk, thus demonstrating superior calibration performance (figure 6C,D). In addition to AUC, a range of additional metrics, including accuracy, F1-score, PPV, NPV, sensitivity, and specificity, were also employed to comprehensively evaluate the reliability and robustness of the models. The results indicated that the decision tree model outperformed other models in the testing set (figure 6E,F). Furthermore, DCA was performed to assess the models’ net benefit across various threshold values, providing insights into their clinical applicability. Notably, the decision tree model demonstrated the highest net benefit for threshold probabilities below 0.8 (figure 6G,H). A fivefold cross-validation further supported the stability and generalizability of our model, with the decision tree achieving consistent performance across all folds (mean AUC=0.909±0.129), favorable calibration (Brier score=0.079±0.102), and the highest net clinical benefit in DCA (online supplemental figure S9).
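The evaluation metrics named above have simple closed forms, sketched here in plain Python. This is an illustration only: the function names and the toy labels and probabilities are hypothetical, 1 denotes the positive class, and the snippet does not reproduce the study's software or results.

```python
def auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true, probs):
    """Mean squared difference between predicted probability and outcome (lower is better)."""
    return sum((p - t) ** 2 for p, t in zip(probs, y_true)) / len(y_true)


def net_benefit(y_true, probs, threshold):
    """Decision-curve net benefit: TP/n - FP/n * pt / (1 - pt)."""
    n = len(y_true)
    tp = sum(1 for p, t in zip(probs, y_true) if p >= threshold and t == 1)
    fp = sum(1 for p, t in zip(probs, y_true) if p >= threshold and t == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def confusion_metrics(y_true, probs, threshold=0.5):
    """Threshold-based metrics; assumes every denominator is non-zero."""
    pred = [1 if p >= threshold else 0 for p in probs]
    tp = sum(1 for p, t in zip(pred, y_true) if p == 1 and t == 1)
    tn = sum(1 for p, t in zip(pred, y_true) if p == 0 and t == 0)
    fp = sum(1 for p, t in zip(pred, y_true) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, y_true) if p == 0 and t == 1)
    sensitivity = tp / (tp + fn)
    ppv = tp / (tp + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": tn / (tn + fp),
        "PPV": ppv,
        "NPV": tn / (tn + fn),
        "accuracy": (tp + tn) / len(y_true),
        "F1": 2 * ppv * sensitivity / (ppv + sensitivity),
    }


# Toy labels and predicted probabilities (hypothetical, not study data).
y_true = [1, 1, 1, 1, 0, 0]
probs = [0.9, 0.8, 0.7, 0.4, 0.6, 0.2]
print(auc(y_true, probs), brier_score(y_true, probs), net_benefit(y_true, probs, 0.5))
print(confusion_metrics(y_true, probs))
```

Net benefit weights false positives by the odds of the threshold probability, which is why the DCA curve is read across a range of thresholds rather than at a single cut-off.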

Figure 6. Construction and validation of a CIP diagnostic model based on machine learning. ROC analysis of the KNN, SVM, LR, and DT model in the training set (A) and the testing set (B). Calibration curve of the KNN, SVM, LR, and DT model in the training set (C) and the testing set (D). Evaluation indicators of the performance of the KNN, SVM, LR, and DT model in the training set (E) and the testing set (F). DCA of the KNN, SVM, LR, and DT model in the training set (G) and the testing set (H). AUC, area under the curve; CIP, checkpoint inhibitor pneumonitis; DT, decision tree; DCA, decision curve analysis; KNN, K-nearest neighbors; LR, logistic regression; NPV, negative predictive value; PPV, positive predictive value; ROC, receiver operating characteristic curve; SVM, support vector machine.

Given its superior performance across multiple evaluation metrics, the decision tree model was ultimately selected as the optimal diagnostic model for CIP. Funiculosus and Forsythia were identified as the key predictive features in the decision tree model. To demonstrate its internal workings, we visualized the decision tree model, highlighting its potential for application in clinical practice (figure 7A). To further explain the decision tree model, we applied the SHAP method, which quantifies each feature’s contribution to the model output. This approach provides two levels of interpretability: a global feature-level explanation and a local case-specific explanation. As shown in figure 7B, we evaluated the contribution of each feature to the model by displaying the mean SHAP values. Furthermore, we presented a visual representation of the direction and strength of each feature’s influence on the model’s output (figure 7C). Elevated levels of Funiculosus and Forsythia were associated with a higher likelihood of PI diagnosis. Moreover, we were able to clarify individual diagnostic results by attributing them to specific features (figure 7D–G). In conclusion, we developed and validated a CIP diagnostic model based on the BALF microbiota profile using the decision tree algorithm. Our findings highlight the significant value of lung microbiota in CIP diagnosis and demonstrate the potential of this model for clinical application.
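To make the idea of per-feature attribution concrete, the sketch below computes exact Shapley values by brute-force enumeration for a two-feature model. It does not use the SHAP library and is not the fitted model from this study: the toy tree, its thresholds, its output probabilities, and the background samples are all hypothetical. Missing features are handled by averaging over a background set, which is one common convention and is assumed here.

```python
from itertools import permutations


def value(model, x, background, subset):
    """Mean model output when features in `subset` are fixed to x and the rest
    are taken from each background row."""
    total = 0.0
    for row in background:
        mixed = [x[j] if j in subset else row[j] for j in range(len(x))]
        total += model(mixed)
    return total / len(background)


def shapley_values(model, x, background):
    """Exact Shapley values: average marginal contribution over all feature orderings."""
    n = len(x)
    phi = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        seen = set()
        for j in order:
            before = value(model, x, background, seen)
            seen.add(j)
            phi[j] += value(model, x, background, seen) - before
    return [p / len(orders) for p in phi]


def toy_tree(features):
    """Hypothetical two-split tree returning a probability of PI."""
    a, b = features
    if a > 0.5:
        return 0.9
    return 0.7 if b > 0.3 else 0.1


# Hypothetical background samples and one patient to explain.
background = [[0.2, 0.1], [0.8, 0.1], [0.2, 0.6], [0.8, 0.6]]
patient = [0.8, 0.6]
phi = shapley_values(toy_tree, patient, background)
base = value(toy_tree, patient, background, set())
print(phi, base, toy_tree(patient))
```

The attributions sum to the difference between the patient's prediction and the background average, which is the additivity property that waterfall plots rely on.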

Figure 7. Visualization and interpretation of the CIP diagnostic model based on the decision tree. (A) The specific process of the CIP diagnostic model based on the decision tree. The Gini index is a method used to measure the purity of decision tree split nodes, with smaller values indicating higher purity. (B) The SHAP summary bar plot displays the contribution of each feature to the decision tree model based on the mean SHAP values. (C) The SHAP summary dot plot illustrates each feature’s impact on the model. The probability of a diagnosis of CIP correlates with the SHAP values of the features. Each dot corresponds to an individual patient’s SHAP value for a specific feature, with red representing higher feature values and blue indicating lower ones. The dots are arranged vertically to represent the density distribution. (D–E) SHAP waterfall plots show the contribution of two microbe features in two representative patients diagnosed with PI. (F–G) SHAP waterfall plots show the contribution of two microbe features in two representative patients diagnosed with CIP. CIP, checkpoint inhibitor pneumonitis; PI, pulmonary infection; SHAP, SHapley Additive exPlanations.
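The Gini index referred to in the figure 7 caption is conventionally computed as one minus the sum of squared class proportions in a node. The sketch below illustrates this definition and the weighted impurity of a candidate split; the function names, abundance values, and threshold are hypothetical and do not correspond to the fitted tree.

```python
def gini_impurity(labels):
    """Gini impurity of a node: 1 - sum of squared class proportions (0 means pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def split_gini(values, labels, threshold):
    """Size-weighted Gini impurity of the two child nodes after splitting at `threshold`."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    n = len(labels)
    return len(left) / n * gini_impurity(left) + len(right) / n * gini_impurity(right)


# Toy node with three CIP cases and one PI case (hypothetical).
labels = ["CIP", "CIP", "CIP", "PI"]
abundance = [0.1, 0.2, 0.3, 0.9]
print(gini_impurity(labels))
print(split_gini(abundance, labels, threshold=0.5))
```

A decision tree chooses, at each node, the feature and threshold that give the lowest weighted impurity in the children, which is why a split that isolates the single PI case scores zero here.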

Discussion

CIP is a life-threatening immune-related toxicity with clinical and radiologic overlap with infections, complicating diagnosis.21 Its incidence is increasing with immunotherapy use, urgently necessitating improved diagnostics. This study profiles lung microbiota across CIP subtypes and builds a microbiota-based machine-learning diagnostic model, illuminating microbial-immune interactions driving CIP progression.

Our findings align with emerging evidence that lung microbiome dysbiosis plays a regulatory role in immune-mediated pulmonary disorders.22 The predominance of Porphyromonas in PT-CIP is particularly noteworthy, as this genus has been implicated in chronic inflammation through its capacity to produce proteolytic enzymes and modulate neutrophil responses.23 The negative correlation between Porphyromonas abundance and CRP or procalcitonin levels (p<0.05) suggests potential immunosuppressive properties, possibly through induction of regulatory T cells.24 Conversely, the positive association of Candida with NLR and SII supports its pro-inflammatory role, consistent with recent findings in autoimmune pulmonary conditions.25 However, several immune-inflammatory markers lack pathogen specificity and may reflect general immune activation rather than infection. Thus, these correlations should be interpreted cautiously as exploratory, hypothesis-generating findings.

Recent evidence demonstrates that microbial metabolites can directly modulate T-cell function. Consistent with this, our study delineates a comprehensive metabolic landscape of CIP, potentially shaped by microbiota. Central to this landscape is the marked enrichment of aerobic glycolysis—the Warburg effect—essential for rapid ATP generation and the effector functions of inflammatory cells. Concurrently, protein metabolism pathways were strongly upregulated, reflecting a cellular state primed for activation and inflammatory mediator synthesis. Enhanced glutathione metabolism indicated a robust antioxidant defense, enabling immune cells to survive oxidative stress from reactive oxygen species. Furthermore, validation in a robust preclinical model supports that microbiome-driven immunometabolic reprogramming represents a fundamental feature of CIP pathogenesis. Collectively, our results highlight the microbiota’s pivotal role in modulating CIP-associated immunometabolic reprogramming, thereby contributing to the pathogenesis of CIP. A deeper understanding of this interplay may inform future therapeutic strategies focused on manipulating the lung microbiome to enhance clinical outcomes for patients with CIP.26

A machine learning diagnostic model integrating microbiota profiling shows strong potential for CIP diagnosis. The decision tree achieved excellent performance, leveraging taxa like Funiculosus and Forsythia to outperform imaging and biomarkers. Yet, broader validation in larger cohorts is essential, as the current small sample limits clinical generalizability. Notably, Barzon et al emphasize the importance of multiomics integration for biomarker discovery.27 Expanding our model to incorporate host transcriptomic or metabolomic data may enhance diagnostic precision and uncover underlying mechanisms. Moreover, this finding complements recent work demonstrating that gut microbiome signatures predict immunotherapy response and immune-related colitis, extending the diagnostic potential of microbiome analysis to pulmonary complications.28 29

To translate our findings into a tangible clinical tool, we propose a structured diagnostic and management framework (online supplemental figure S10). The framework illustrates how BALF mNGS can serve as an initial critical test in patients with suspected CIP. A negative result strongly supports a diagnosis of PT-CIP, allowing for confident initiation of corticosteroid therapy. The major diagnostic challenge arises when pathogens are detected, where it is difficult to distinguish MT-CIP from PI. At this juncture, our decision tree model provides additional discriminatory power to resolve this ambiguity and guide clinicians toward the correct diagnosis. Importantly, the framework directly links diagnostic categories with tailored treatment strategies: corticosteroids for PT-CIP, antibiotics for PI, and combination therapy for MT-CIP. Moreover, microbiome profiling offers the potential to inform pathogen-specific antibiotic use in PI and MT-CIP, thereby reducing reliance on broad-spectrum antibiotics and minimizing treatment-related complications. By improving diagnostic precision, the framework may ultimately shorten hospitalization, optimize therapeutic decisions, and reduce the socioeconomic burden of CIP, underscoring the translational value of microbiome-based approaches in clinical practice.

Several limitations should be acknowledged. First, the single-center design and modest sample size restrict the generalizability of our findings. To address this, we are preparing a multicenter trial that will validate both the microbial signature panel and the diagnostic model in a larger, more diverse cohort. Second, the lack of longitudinal data precludes assessment of microbiota dynamics throughout disease progression. Third, although we excluded recent antibiotic users, prior treatment histories may have influenced baseline microbiota composition. Future multicenter studies incorporating serial sampling and germ-free animal models could clarify causal relationships between specific microbiota and CIP pathogenesis. Nevertheless, although our in vivo transcriptomic validation provides strong evidence for the immunometabolic reprogramming signatures, direct functional experiments—such as in vitro metabolic flux analyses or enzymatic assays—would be valuable for mechanistic dissection of these pathways.

In conclusion, our study reveals distinct lung microbiome differences between CIP and PI, highlighting microbiome profiling’s clinical potential. A machine learning model using BALF microbiota shows promise for improving CIP diagnosis and management. Future work should validate findings in larger cohorts and integrate clinical and multiomics data to refine models, enhance decision-making, and improve patient outcomes.

Footnote

ZZ, JLin, JLi and XH contributed equally.

Contributors ZZ, J-RL, and JLi: conceptualization and study design. ZZ and J-RL: formal analysis and writing (original draft). ZZ and J-RL: visualization. JLi, XH, JH, LY, WX, JLu, WH, SH, DY, HZ, XG, ML, YM, and FY: investigation and methodology. ZZ, J-RL, JLi, and Z-KC: data curation and analysis. ZZ and J-RL: software and statistical analysis. LL, YZ, XS, and Z-KC: writing (review and editing). ZZ, YZ, and LL have directly accessed and verified the underlying data reported in the manuscript. All authors have reviewed and approved the final manuscript. LL is the guarantor.

Funding The study is jointly funded by the Natural Science Foundation of Guangdong Province (No. 2024A1515012908 and No. 2023A1515010285), Clinical Research Program of Nanfang Hospital, Southern Medical University (2022CR011 and 2022CR013), Medical Scientific Research Foundation of Guangdong Province (B2021449), President Foundation of Nanfang Hospital, Southern Medical University (2023B046), and Science and Technology Program of Guangzhou (No. 2025A04J4104 and No. 2025A04J4090). The funders had no role in study design, data collection, analysis, interpretation, manuscript preparation, or decision to submit for publication. No additional writing assistance, article processing charges, or other non-financial support were received.

Competing interests None declared.

Provenance and peer review Not commissioned; externally peer reviewed.

Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.

References

1 Chuzi S, Tavora F, Cruz M, et al. Clinical features, diagnostic challenges, and management strategies in checkpoint inhibitor-related pneumonitis. Cancer Manag Res 2017;9:207–13. doi:10.2147/CMAR.S136818

2 Iwai T, Sugimoto M, Patel H, et al. Anti-VEGF Antibody Protects against Alveolar Exudate Leakage Caused by Vascular Hyperpermeability, Resulting in Mitigation of Pneumonitis Induced by Immunotherapy. Mol Cancer Ther 2021;20:2519–26. doi:10.1158/1535-7163.MCT-21-0031

3 Abdel-Rahman O, Fouad M. Risk of pneumonitis in cancer patients treated with immune checkpoint inhibitors: a meta-analysis. Ther Adv Respir Dis 2016;10:183–93. doi:10.1177/1753465816636557

4 Küçükarda A, Gökmen İ, Özcan E, et al. Recurrent Delayed Immune-Related Pneumonitis After Immune-Checkpoint Inhibitor Therapy for Advanced Osteosarcoma. Immunotherapy (Los Angel) 2022;14:395–9. doi:10.2217/imt-2021-0275

5 Zhang A, Yang F, Gao L, et al. Research Progress on Radiotherapy Combined with Immunotherapy for Associated Pneumonitis During Treatment of Non-Small Cell Lung Cancer. Cancer Manag Res 2022;14:2469–83. doi:10.2147/CMAR.S374648

6 Tabchi S, Messier C, Blais N. Immune-mediated respiratory adverse events of checkpoint inhibitors. Curr Opin Oncol 2016;28:269–77. doi:10.1097/CCO.0000000000000291

7 Hindocha S, Hunter B, Linton-Reid K, et al. Validated machine learning tools to distinguish immune checkpoint inhibitor, radiotherapy, COVID-19 and other infective pneumonitis. Radiother Oncol 2024;195:110266. doi:10.1016/j.radonc.2024.110266

8 Hong BY, Maulén NP, Adami AJ, et al. Microbiome Changes during Tuberculosis and Antituberculous Therapy. Clin Microbiol Rev 2016;29:915–26. doi:10.1128/CMR.00096-15

9 Wang L, Hao K, Yang T, et al. Role of the Lung Microbiome in the Pathogenesis of Chronic Obstructive Pulmonary Disease. Chin Med J 2017;130:2107–11. doi:10.4103/0366-6999.211452

10 Sethi S. Chronic obstructive pulmonary disease and infection. Disruption of the microbiome? Ann Am Thorac Soc 2014;11 Suppl 1:S43–7. doi:10.1513/AnnalsATS.201307-212MG

11 Lu Y, Zhou G, Ewald J, et al. MicrobiomeAnalyst 2.0: comprehensive statistical, functional and integrative analysis of microbiome data. Nucleic Acids Res 2023;51:W310–8. doi:10.1093/nar/gkad407

12 Blanco-Míguez A, Beghini F, Cumbo F, et al. Extending and improving metagenomic taxonomic profiling with uncharacterized species using MetaPhlAn 4. Nat Biotechnol 2023;41:1633–44. doi:10.1038/s41587-023-01688-w

13 Truong DT, Tett A, Pasolli E, et al. Microbial strain-level population structure and genetic diversity from metagenomes. Genome Res 2017;27:626–38. doi:10.1101/gr.216242.116

14 Beghini F, McIver LJ, Blanco-Míguez A, et al. Integrating taxonomic, functional, and strain-level profiling of diverse microbial communities with bioBakery 3. Elife 2021;10:e65088. doi:10.7554/eLife.65088

15 Suzek BE, Wang Y, Huang H, et al. UniRef clusters: a comprehensive and scalable alternative for improving sequence similarity searches. Bioinformatics 2015;31:926–32. doi:10.1093/bioinformatics/btu739

16 Caspi R, Billington R, Keseler IM, et al. The MetaCyc database of metabolic pathways and enzymes - a 2019 update. Nucleic Acids Res 2020;48:D445–53. doi:10.1093/nar/gkz862

17 Tsukamoto H, Komohara Y, Tomita Y, et al. Aging-associated and CD4 T-cell-dependent ectopic CXCL13 activation predisposes to anti-PD-1 therapy-induced adverse events. Proc Natl Acad Sci U S A 2022;119:e2205378119. doi:10.1073/pnas.2205378119

18 Yokoi M, Murakami K, Yaguchi T, et al. ICOS+CD4+ T cells define a high susceptibility to anti-PD-1 therapy-induced lung pathogenesis. JCI Insight 2025;10:e186483. doi:10.1172/jci.insight.186483

19 Escanilla NS, Hellerstein L, Kleiman R, et al. Recursive Feature Elimination by Sensitivity Testing. Proc Int Conf Mach Learn Appl 2018;2018:40–7. doi:10.1109/ICMLA.2018.00014

20 Lundberg SM, Lee S-I. A unified approach to interpreting model predictions. Proceedings of the 31st International Conference on Neural Information Processing Systems; Red Hook, NY, USA: Curran Associates Inc, 2017:4768–77.

21 Suresh K, Naidoo J, Zhong Q, et al. The alveolar immune cell landscape is dysregulated in checkpoint inhibitor pneumonitis. J Clin Invest 2019;129:4305–15. doi:10.1172/JCI128654

22 Segal LN, Clemente JC, Wu BG, et al. Randomised, double-blind, placebo-controlled trial with azithromycin selects for anti-inflammatory microbial metabolites in the emphysematous lung. Thorax 2017;72:13–22. doi:10.1136/thoraxjnl-2016-208599

23 Gaeckle NT, Pragman AA, Pendleton KM, et al. The Oral-Lung Axis: The Impact of Oral Health on Lung Health. Respir Care 2020;65:1211–20. doi:10.4187/respcare.07332

24 Mager LF, Burkhard R, Pett N, et al. Microbiome-derived inosine modulates response to checkpoint inhibitor immunotherapy. Science 2020;369:1481–9. doi:10.1126/science.abc3421

25 Shao T-Y, Ang WXG, Jiang TT, et al. Commensal Candida albicans Positively Calibrates Systemic Th17 Immunological Responses. Cell Host Microbe 2019;25:404–17. doi:10.1016/j.chom.2019.02.004

26 Yu W, Wang K, He Y, et al. The potential role of lung microbiota and lauroylcarnitine in T-cell activation associated with checkpoint inhibitor pneumonitis. EBioMedicine 2024;106:105267. doi:10.1016/j.ebiom.2024.105267

27 Barzon L, Lavezzo E, Militello V, et al. Applications of next-generation sequencing technologies to diagnostic virology. Int J Mol Sci 2011;12:7861–84. doi:10.3390/ijms12117861

28 Routy B, Le Chatelier E, Derosa L, et al. Gut microbiome influences efficacy of PD-1-based immunotherapy against epithelial tumors. Science 2018;359:91–7. doi:10.1126/science.aan3706

29 Chaput N, Lepage P, Coutzac C, et al. Baseline gut microbiota predicts clinical response and colitis in metastatic melanoma patients treated with ipilimumab. Ann Oncol 2017;28:1368–79. doi:10.1093/annonc/mdx108

© Author(s) (or their employer(s)) 2025. Re-use permitted under CC BY-NC. No commercial re-use. See rights and permissions. Published by BMJ Group. This work is licensed under the Creative Commons Attribution – Non-Commercial License http://creativecommons.org/licenses/by-nc/4.0/ (the "License"). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.