Introduction
The ongoing COVID-19 pandemic challenges health-care systems. As of December 2021, the John Hopkins dashboard (Dong et al., 2020) reports 276 million cases and 5.4 million COVID-19-related deaths worldwide (Johns Hopkins Coronavirus Resource Center, 2021). Although the vast majority of COVID-19 patients display mild disease, approximately 10–15% of cases progress to a severe condition and approximately 5% suffer from critical illness (Perez-Saez, 2021; Huang et al., 2020). Similar to severe acute respiratory syndrome (SARS) (Hui et al., 2005; Ng et al., 2004; Ngai et al., 2010; Lam et al., 2009), a significant portion of COVID-19 patients report lingering or recurring clinical impairment and cardiopulmonary recovery may take several months to years (Sonnweber et al., 2021; Sahanic et al., 2021; Caruso et al., 2021; Huang et al., 2021b; Huang et al., 2021a; Faverio et al., 2021; Hellemons et al., 2021; Zhou et al., 2021; Venkatesan, 2021). This observation has led to the introduction of the term ‘long COVID,’ defined by the persistence of COVID-19 symptoms for more than 4 weeks, and the ‘post-acute sequelae of COVID-19’ (PASC) referring to symptom persistence for more than 12 weeks (Sahanic et al., 2021; Shah et al., 2021; Sudre et al., 2021b). Evidence-based strategies for prediction, monitoring, and treatment of PASC are urgently needed (Raghu and Wilson, 2020).
We herein prospectively analyzed the prevalence of nonresolving structural and functional lung abnormalities and persistent COVID-19-related symptoms 6 months after diagnosis. Using univariate risk modeling as well as multiparameter clustering and machine learning (ML), we investigated sets of risk factors and tested the operability of ML classifiers at predicting protracted lung and symptom recovery. The classification and prediction procedures were implemented in an open-source risk assessment tool (https://im2-ibk.shinyapps.io/CovILD/).
Methods
Study design
The CovILD (‘Development of interstitial lung disease in COVID-19’) multicenter, longitudinal observational study (Sonnweber et al., 2021) was initiated in April 2020. Adult residents of Tyrol, Austria, with symptomatic, PCR-confirmed SARS-CoV-2 infection (WHO, 2021) were enrolled by the Department of Internal Medicine II at the Medical University of Innsbruck (primary follow-up center), St. Vinzenz Hospital in Zams, and the acute rehabilitation facility in Münster (Table 1). The participants were diagnosed with COVID-19 between 3 March and 29 June 2020. In course of the study, including the 2020 SARS-CoV-2 outbreak and follow-up visits, the regional health system was able to guarantee an unrestricted, optimal standard of diagnostics and care for all participants. Corticosteroids were not standard of care during the recruitment period of the study, thus were not administered as a therapy of acute COVID-19. Some participants with nonresolving pneumonia received systemic steroids beginning from week 4 post diagnosis at the discretion of the physician (Table 2). The analysis endpoints were the presence of any, mild (severity score ≤ 5), and moderate-to-severe (severity score > 5) lung computed tomography (CT) abnormalities, impaired lung function (LF), and persistent COVID-19 symptoms at the 180-day follow-up visit (Table 3).
Table 1.
Characteristics of the study population.
Characteristics (% cohort) | |
---|---|
Total participants – no. | 145 |
Mean age, years | 57.3 (SD = 14.3) |
Female sex | 42.4% (n = 63) |
Obesity (body mass index >30 kg/m2) | 19.3% (n = 28) |
Ex-smoker | 39.3% (n = 57) |
Active smoker | 2.8% (n = 4) |
Acute COVID-19 severity (% cohort) | |
Mild: outpatient | 24.8% (n = 36) |
Moderate: inpatient without oxygen therapy | 25.5% (n = 37) |
Severe: inpatient with oxygen therapy | 27.6% (n = 40) |
Critical: intensive care unit | 22.1% (n = 32) |
Comorbidities (% cohort) | |
None | 22.8% (n = 33) |
Cardiovascular disease | 40% (n = 58) |
Pulmonary disease | 18.6% (n = 27) |
Metabolic disease | 43.4% (n = 63) |
Chronic kidney disease | 6.9% (n = 10) |
Gastrointestinal tract diseases | 13.8% (n = 20) |
Malignancy | 11.7% (n = 17) |
Table 2.
Hospitalization and medication during acute COVID-19.
Parameter | Outpatient (n = 36) | Hospitalized (n = 37) | Hospitalized oxygen therapy (n = 40) | Hospitalized intensive care unit (n = 32) |
---|---|---|---|---|
Mean hospitalization time, days | 0 (SD = 0) | 6.9 (SD = 3.6) | 11.8 (SD = 6.3) | 34.8 (SD = 15.7) |
Hospitalized >7 days | 0% (n = 0) | 43.2% (n = 16) | 80% (n = 32) | 100% (n = 32) |
Anti-infectives | 11.1% (n = 4) | 45.9% (n = 17) | 72.5% (n = 29) | 87.5% (n = 28) |
Antiplatelet drugs | 2.8% (n = 1) | 10.8% (n = 4) | 22.5% (n = 9) | 25% (n = 8) |
Anticoagulatives | 2.8% (n = 1) | 2.7% (n = 1) | 5% (n = 2) | 15.6% (n = 5) |
Corticosteroids*† | 2.8% (n = 1) | 5.4% (n = 2) | 22.5% (n = 9) | 40.6% (n = 13) |
Immunosuppression‡† | 0% (n = 0) | 2.7% (n = 1) | 5% (n = 2) | 9.4% (n = 3) |
*
From the week 4 post diagnosis on, at the discretion of the physician.
†
Subsumed under ‘immunosuppression, acute COVID-19’ for data analysis.
‡
Immunosuppressive medication prior to COVID-19.
Table 3.
Radiological, functional, and clinical study outcomes.
Outcome | 60-day follow-up | 100-day follow-up | 180-day follow-up |
---|---|---|---|
Any lung CT abnormalities (complete: n = 103) | 74.8% (n = 77) | 60.2% (n = 62) | 48.5% (n = 50) |
Mild lung CT abnormalities (severity score ≤ 5) (complete: n = 103) | 26.2% (n = 27) | 36.9% (n = 38) | 29.1% (n = 30) |
Moderate-to-severe CT abnormalities (severity score > 5) (complete: n = 103) | 48.5% (n = 50) | 23.3% (n = 24) | 19.4% (n = 20) |
Functional lung impairment (complete: n = 116) | 39.7% (n = 46) | 37.1% (n = 43) | 33.6% (n = 39) |
Persistent symptoms (complete: n = 145) | 79.3% (n = 115) | 67.6% (n = 98) | 49% (n = 71) |
CT = computed tomography.
In total, 190 COVID-19 patients were screened for participation. Thereof, n = 18 subjects refused to give informed consent, n = 27 declared difficulties to appear at the study follow-ups. Data of n = 145 participants were eligible for analysis (Figure 1). All participants gave written informed consent. The study was approved by the Institutional Review Board at the Medical University of Innsbruck (approval number: 1103/2020) and registered at ClinicalTrials.gov (NCT04416100).
Procedures
We retrospectively assessed patient characteristics during acute COVID-19 and performed follow-up investigations at 60 days (63 ± 23 days [mean ± SD]; visit 1), 100 days (103 ± 21 days; visit 2), and 180 days (190 ± 15 days; visit 3) after diagnosis of COVID-19. Each visit included symptom and physical performance assessment with a standardized questionnaire, LF testing, standard laboratory testing, and a CT scan of the chest. The variables available for analysis with their stratification schemes are listed in Appendix 1—table 1.
Serological markers were determined in certified laboratories (Central Institute of Clinical and Chemical Laboratory Diagnostics, Rheumatology and Infectious Diseases Laboratory, both at the University Hospital of Innsbruck). C-reactive protein (CRP), interleukin-6 (IL-6), N-terminal pro natriuretic peptide (NT-proBNP), and serum ferritin were measured using a Roche Cobas 8000 analyzer. D-dimer was determined with a Siemens BCS-XP instrument using the Siemens D-Dimer Innovance reagent. Anti-S1/S2 protein SARS-CoV-2 immunoglobulin gamma (IgG) were quantified with LIAISON chemoluminescence assay (DiaSorin, Italy), expressed as binding antibody units (BAU, conversion factor = 5.7) and stratified by quartiles (Ferrari et al., 2021).
Low-dose (100 kVp tube potential) craniocaudal CT scans of the chest were acquired without iodine contrast and without ECG gating on a 128-slice multidetector CT (128 × 0.6 mm collimation, 1.1 spiral pitch factor, SOMATOM Definition Flash, Siemens Healthineers, Erlangen, Germany). In case of clinically suspected pulmonary embolism, CT scans were performed with a contrast agent. Axial reconstructions were done with 1 mm slices. CT scans were evaluated for ground-glass opacities, consolidations, bronchial dilation, and reticulations as defined by the Fleischner Society. Lung findings were graded with a semi-quantitative CT severity score (0–25 points) (Sonnweber et al., 2021).
Impaired LF was defined as (1) forced vital capacity (FVC) < 80% or (2) forced expiratory volume in 1 s (FEV1) < 80%, or (3) FEV1:FVC < 70% or (4) total lung capacity (TLC) < 80% or (5) diffusing capacity of carbon monoxide (DLCO) < 80% predicted.
Statistical analysis
Statistical analyses were performed with R version 4.0.5 (Figure 1). Data transformation and visualization were accomplished by
Variable overlap, kinetics, and risk modeling
Overlap between the 180-day follow-up outcome features was assessed by analysis of quasi-proportional Venn plots (package
Cluster analysis
Clustering of non-CT and non-LF binary clinical features (Appendix 1—table 1) was accomplished with PAM algorithm (partitioning around medoids, package
Machine learning
ML classifiers C5.0 (package
Pulmonary recovery assessment app
Participant clustering and ML classifiers trained in the CovILD cohort were implemented in an open-source online pulmonary assessment R shiny app (https://im2-ibk.shinyapps.io/CovILD/; code: https://github.com/PiotrTymoszuk/COVILD-recovery-assessment-app). Prediction of the cluster assignment based on the user-provided patient data is done by the kNN label propagation algorithm (Sahanic et al., 2021; Leng et al., 2013).
Results
Patient characteristics
The CovILD study participants (n = 145) were predominantly male (57.8%), age ranging between 19 and 87 years. 77.2% of participants displayed preexisting comorbidity, predominantly cardiovascular and metabolic disease. The cohort included mild (outpatient care, 24.8%), moderate (hospitalization without oxygen supply, 25.5%), severe (hospitalization with oxygen supply, 27.6%), and critical (intensive care unit [ICU] treatment, 22.1%) cases of acute COVID-19 (Table 1). The majority of hospitalized participants received anti-infectives during acute COVID-19, anticoagulative, and/or antiplatelet treatment introduced primarily in the ventilated patients. Systemic steroid administration was initiated at the discretion of the physician beginning from week 4 after diagnosis (Table 2).
Clinical recovery after COVID-19
Most patients, irrespective of the acute COVID-19 severity, showed a significant resolution of disease symptoms over time (Figure 1, Figure 2A). Persistent complaints at the 6-month follow-up were reported by 49% of the study subjects (Table 3), with self-reported impaired physical performance (34.7%), sleep disorders (27.1%), and exertional dyspnea (22.8%) as leading manifestations. The frequency of all investigated symptoms declined significantly, even though the pace of their resolution was remarkably slower in the late (100- and 180-day follow-ups) than in the early recovery phase (acute COVID-19 till 60-day follow-up) (Figure 2B).
Figure 1.
Study inclusion flow diagram and analysis scheme.
Figure 2.
Kinetic of recovery from COVID-19 symptoms.
Recovery from any COVID-19 symptoms was investigated by mixed-effect logistic modeling (random effect: individual; fixed effect: time). Significance was determined by the likelihood ratio test corrected for multiple testing with the Benjamini–Hochberg method, and p-values and the numbers of complete observations are indicated in the plots. (A) Frequencies of individuals with any symptoms in the study cohort stratified by acute COVID-19 severity. (B) Frequencies of participants with particular symptoms. imp.: impaired.
Impaired LF was observed in 33.6% of the participants at the 6-month follow-up (Table 3). Except for the critical COVID-19 survivors (60 days: 66.7%; 180 days post-COVID-19: 50%), no significant reduction in the frequency of LF impairment over time was observed (Figure 3). At the 6-month follow-up, structural lung abnormalities were found in 48.5% of patients and moderate-to-severe radiological lung alterations (CT severity score > 5) were present in 19.4% of participants (Table 3). The majority of the participants with impaired LF displayed radiological lung findings. However, a substantial fraction of CT abnormalities, especially mild ones, were accompanied neither by persistent symptoms nor by LF deficits (Figure 3—figure supplement 1, Figure 3—figure supplement 2, Figure 3—figure supplement 3A).
Figure 3.
Kinetic of pulmonary recovery.
Recovery from any lung computed tomography (CT) abnormalities, moderate-to-severe lung CT abnormalities (severity score > 5), and recovery from functional lung impairment were investigated in the participants stratified by acute COVID-19 severity by mixed-effect logistic modeling (random effect: individual; fixed effect: time). Significance was determined by the likelihood ratio test corrected for multiple testing with the Benjamini–Hochberg method. Frequencies of the given abnormality at the indicated time points are presented, and p-values and the numbers of complete observations are indicated in the plots.
Figure 3—figure supplement 1.
Co-occurrence of lung computed tomography (CT) abnormalities, functional lung impairment, and any persistent symptoms.
Numbers and percentages of the study participants with any persistent symptoms, functional lung impairment, or lung CT abnormalities at the consecutive follow-up visits presented in quasi-proportional Venn diagrams. The numbers of participants with CT abnormalities, lung function (LF) impairment, and persistent symptoms are indicated in the diagrams, and the numbers of complete observations are shown under the plots.
Figure 3—figure supplement 2.
Co-occurrence of moderate-to-severe lung computed tomography (CT) abnormalities, functional lung impairment, and any persistent symptoms.
Numbers and percentages of the study participants with any persistent symptoms, functional lung impairment, or moderate-to-severe lung CT abnormalities (severity score > 5) at the consecutive follow-up visits presented in quasi-proportional Venn diagrams. The numbers of participants with CT abnormalities, lung function (LF) impairment, and persistent symptoms are indicated in the diagrams, and the numbers of complete observations are shown under the plots.
Figure 3—figure supplement 3.
Frequency of mild and moderate-to-severe lung computed tomography (CT) abnormalities.
Prognostic value of functional lung impairment and persistent symptoms for prediction of radiological lung abnormalities. (A) Relevance of functional lung impairment and persistent COVID-19 symptoms at predicting any lung CT abnormalities and moderate-to-severe lung CT abnormalities (severity score > 5) at the consecutive follow-up visits. The concordance of the outcome variables was determined by Cohen’s κ coefficient. Statistical significance (κ / = 0) was assessed by two-tailed
The frequency, scoring, and recovery of CT lung findings were related to the severity of acute infection. Pulmonary lesions scored > 5 CT severity points at the 180-day follow-up were most frequent in the individuals with severe and critical acute COVID-19 (Figure 3—figure supplement 3). Notably, the hospitalized group with oxygen therapy demonstrated the fastest recovery kinetics. As for the symptom resolution, LF and CT lung recovery decelerated in the late phase of COVID-19 convalescence (Figure 3).
Risk factors of protracted recovery
To identify risk factors of delayed recovery at the 6-month follow-up, we screened a set of 52 binary clinical parameters (Appendix 1—table 1) recorded during acute COVID-19 and at the 60-day visit by univariate modeling (Appendix 1—table 2). By this means, no significant correlates for long-term symptom persistence were identified. Risk factors and readouts of severe and critical COVID-19 including multimorbidity, malignancy, male sex, prolonged hospitalization, ICU stay, and immunosuppressive therapy were significantly associated with persistent CT (Figure 4) and LF abnormalities (Figure 5). Persistently elevated inflammatory markers, IL-6 (>7 ng/L) and CRP (>0.5 mg/L), were strong unfavorable risk factors for incomplete radiological and functional pulmonary recovery. Additionally, the biochemical readout of microvascular inflammation, D-dimer (>500 pg/mL) was significantly linked to LF deficits. Low serum anti-S1/S2 IgG titers at the 60-day follow-up and ambulatory acute COVID-19 correlated with an improved pulmonary recovery (Figures 4 and 5).
Figure 4.
Risk factors of persistent radiological lung abnormalities.
Association of 52 binary explanatory variables (Appendix 1—table 1) with the presence of any lung computed tomography (CT) abnormalities (A) or moderate-to-severe lung CT abnormalities (severity score > 5) (B) at the 180-day follow-up visit was investigated with a series of univariate logistic models (Appendix 1—table 2). Odds ratio (OR) significance was determined by Wald Z test and corrected for multiple testing with the Benjamini–Hochberg method. ORs with 95% confidence intervals for significant favorable and unfavorable factors are presented in forest plots. Model baseline (ref) and numbers of complete observations are presented in the plot axis text. Q1, Q2, Q3, Q4: first, second, third, and fourth quartile of anti-S1/S2 IgG titer; ICU: intensive care unit.
Figure 5.
Risk factors of persistent functional lung impairment.
Association of 52 binary explanatory variables (Appendix 1—table 1) with the presence of functional lung impairment at the 180-day follow-up visit was investigated with a series of univariate logistic models (Appendix 1—table 2). Odds ratio (OR) significance was determined by Wald Z test and corrected for multiple testing with the Benjamini–Hochberg method. ORs with 95% confidence intervals for the significant favorable and unfavorable factors are presented in a forest plot. Model baseline (ref) and n numbers of complete observations are presented in the plot axis text. Q1, Q2, Q3, Q4: first, second, third, and fourth quartile of anti-S1/S2 IgG titer; CKD: chronic kidney disease.
Clusters of clinical features linked to persistent symptoms and lung abnormalities
Employing the unsupervised PAM algorithm (Amato et al., 2019), three clusters of co-occurring non-CT and non-LF clinical features of acute COVID-19 and early convalescence (Appendix 1—table 1) were identified (Figure 6—figure supplement 1, Appendix 1—table 3): (1) cluster 1 with male sex, hypertension, and cardiovascular and metabolic comorbidity; (2) cluster 2, including characteristics of acute COVID-19 severity and inflammatory markers; and (3) cluster 3 consisting of acute and persistent COVID-19 symptoms (Figure 6—figure supplement 2, Appendix 1—table 3).
The 6-month follow-up outcome variables were incorporated in the cluster structure using kNN prediction (Leng et al., 2013). Long-term symptom persistence was associated with acute and long-lasting COVID-19 symptoms in cluster 3, whereas pulmonary outcome parameters were grouped with cluster 2 features (Figure 6A, Figure 6—figure supplement 2, Appendix 1—table 3). Preexisting comorbidities such as malignancy, kidney, lung and gastrointestinal disease, obesity, and diabetes were found the closest cluster neighbors of mild CT abnormalities (severity score ≤ 5). Moderate-to-severe structural alterations (severity score > 5) and LF deficits were, in turn, tightly linked to markers of protracted systemic inflammation (IL-6, CRP, anemia of inflammation) (Sonnweber et al., 2020; Figure 6B).
Figure 6.
Association of incomplete symptom, lung function, and radiological lung recovery with demographic and clinical parameters of acute COVID-19 and early recovery.
Clustering of 52 non-computed tomography (non-CT) and non-lung function binary explanatory variables recorded for acute COVID-19 or at the early 60-day follow-up visit (Appendix 1—table 1) was investigated by partitioning around medoids (PAM) algorithm with simple matching distance (SMD) dissimilarity measure (Figure 6—figure supplement 1, Appendix 1—table 3). The cluster assignment for the outcome variables at the 180-day follow-up visit (persistent symptoms, functional lung impairment, mild lung CT abnormalities [severity score ≤ 5] and moderate-to-severe lung CT abnormalities [severity score > 5]) was predicted by k-nearest neighbor (kNN) label propagation procedure. Numbers of complete observations and numbers of features in the clusters are indicated in (A). (A) Cluster assignment of the outcome variables (diamonds) presented in the plot of principal component (PC) scores. The first two major PCs are displayed. The explanatory variables are visualized as points. Percentages of the data set variance associated with the PC are presented in the plot axes. (B) Five nearest neighbors (lowest SMD) of the outcome variables presented in radial plots. Font size, point radius, and color code for SMD values. Q1, Q2, Q3, Q4: first, second, third, and fourth quartile of anti-S1/S2 IgG titer; GITD: gastrointestinal disease; CKD: chronic kidney disease; ICU: intensive care unit; COPD: chronic obstructive pulmonary disease.
Figure 6—figure supplement 1.
Study feature clustering algorithm.
Clustering of 52 non-computed tomography (non-CT) and non-lung function binary explanatory variables recorded for acute COVID-19 or at the early 60-day follow-up visit (Appendix 1—table 1). (A, B) Comparison of ‘explained’ variances (between-cluster to total sum-of-squares ratio) (A) and cluster stability (mean classification error in 20-fold cross-validation) (B) in clustering of the data set with several algorithms with k = 3 centers/branches (algorithms: K-means; PAM: partitioning around medoids; HCl Ward.D2: hierarchical clustering with Ward.D2 method; distances: SMD: simple matching distance; Jaccard, Dice, and Cosine). (C, D) The optimal number of the feature clusters in clustering with the optimally performing PAM algorithm with SMD dissimilarity measure was determined by the bend of the total within-cluster sum-of-squares curve (C) and confirmed by good stability (low mean classification error) in 20-fold cross-validation (D).
Figure 6—figure supplement 2.
Semi-supervised clustering of mild and moderate-to-severe lung computed tomography (CT) abnormalities, functional lung impairment, and persistent symptoms at the 180-day follow-up with parameters of acute COVID-19 and early convalescence.
Clusters of 52 non-CT and non-lung function binary explanatory variables recorded for acute COVID-19 or at the 60-day follow-up visit (Appendix 1—table 1) were defined by the optimally performing partitioning around medoids (PAM) algorithm and simple matching distance (SMD) dissimilarity measure (Figure 6A, Figure 6—figure supplement 1, Appendix 1—table 3). The cluster assignment for the outcome variables at the 180-day follow-up visit (persistent symptoms, functional lung impairment, mild lung CT abnormalities [severity score ≤ 5], and moderate-to-severe lung CT abnormalities [severity score > 5]) was predicted by k-nearest neighbor (kNN) label propagation procedure. SMD between the features and their cluster assignments are shown in a heat map. The numbers of features in the clusters and the total number of observations are indicated under the plot. CVD: cardiovascular disease; Q1, Q2, Q3, Q4: first, second, third, and fourth quartile of anti-S1/S2 IgG titer; GI: gastrointestinal; PD: pulmonary disease; GITD: gastrointestinal disease; ICU: intensive care unit; COPD: chronic obstructive pulmonary disease; CKD: chronic kidney disease.
Risk stratification for perturbed pulmonary recovery by unsupervised clustering
Next, we tested whether subsets of patients at risk of an incomplete 6-month recovery may be defined by a similar clustering procedure employing exclusively non-CT and non-LF clinical variables (Appendix 1—table 1). Applying a combined SOM – hierarchical clustering approach, three clusters of the study participants were identified (Figure 7, Figure 7—figure supplement 1; Vesanto and Alhoniemi, 2000; Kohonen, 1995). Prolonged hospitalization, anti-infective therapy, overweight or obesity, pain during acute COVID-19, and low anti-S1/S2 titers at the 60-day follow-up were found the most influential clustering features (Figure 7—figure supplement 2; Breiman, 2001). The patient subsets identified by the SOM approach differed significantly in frequency of radiological lung abnormalities and substantially, yet not significantly, in the frequency of LF impairment at the 180-day follow-up. In particular, most of the individuals assigned to the largest, low-risk (LR) subset were CT and LF abnormality-free. The frequency and severity of radiological pulmonary findings were elevated in the smallest intermediate-risk subset (IR) and peaked in the high-risk (HR) group (Figure 8A). Despite a comparable frequency of long-term symptoms between the LR, IR, and HR subsets (Figure 8A), the HR collective showed the lowest prevalence of dyspnea, cough, night sweating, pain, gastrointestinal manifestations, and complete absence of hyposmia at the 180-day follow-up (Figure 8B). Although the LR subset primarily comprised mild COVID-19 cases and the HR subset ICU survivors, the cluster assignment (IR vs. LR, HR vs. LR) remained an independent correlate of persistent CT and LF abnormalities after adjustment for the acute COVID-19 severity (Figure 8—figure supplement 1).
Figure 7.
Clustering of the study participants by non-lung function and non-computed tomography (non-CT) clinical features.
Study participants (n = 133 with the complete variable set) were clustered with respect to 52 non-CT and non-lung function binary explanatory variables recorded for acute COVID-19 or at the 60-day follow-up visit (Appendix 1—table 1) using a combined self-organizing map (SOM: simple matching distance) and hierarchical clustering (Ward.D2 method, Euclidean distance) procedure (Figure 7—figure supplement 1). The numbers of participants assigned to low-risk (LR), intermediate-risk (IR), and high-risk (HR) clusters are indicated in (A). (A) Cluster assignment of the study participants in the plot of principal component (PC) scores. The first two major PCs are displayed. Percentages of the data set variance associated with the PC are presented in the plot axes. (B) Presence of the most influential clustering features (Figure 7—figure supplement 2) in the participant clusters presented as a heat map. Cluster #1, #2, and #3 refer to the feature clusters defined in Figure 6. Q1, Q2, Q3, Q4: first, second, third, and fourth quartile of anti-S1/S2 IgG titer; GITD: gastrointestinal disease; CKD: chronic kidney disease; CVD: cardiovascular disease; GI: gastrointestinal; PD: pulmonary disease.
Figure 7—figure supplement 1.
Study participant clustering algorithm.
Clustering of the study participants (n = 133 with the complete variable set) with respect to 52 non-computed tomography (non-CT) and non-lung function binary explanatory variables recorded for acute COVID-19 or at the 60-day follow-up visit (Appendix 1—table 1). The procedure involved clustering of the observations with self-organizing maps (SOM, 4 × 4 hexagonal grid, distances: SMD: simple matching distance, Jaccard, Dice, or Cosine) followed by clustering of the SOM nodes (algorithms: HCl ward.D2: hierarchical clustering with Ward.D2 method; K-means; PAM: partitioning around medoids; distance: Euclidean). Different combinations of observation dissimilarity measures and SOM node clustering algorithms were tested in the search for the optimal clustering algorithm. (A, B) Comparison of ‘explained’ variances (between-cluster to total sum-of-squares ratio) (A) and cluster stability (mean classification error in 20-fold cross-validation) (B) in clustering of the data set with different observation distance measures and SOM node clustering algorithms. (C) Training of the SOM algorithm, mean distance to the winning un as a function of lgorithm iterations is presented. Note the mean distance plateau indicative of the algorithm convergence (D–F) The optimal number of the SOM node clusters in clustering with the optimally performing SOM HCl algorithm with SMD observation dissimilarity measure. The optimal cluster number was determined by the bend of the total within-cluster sum-of-squares curve (D) and confirmed by visual inspection of the HCl dendrogram (E) and good stability (low mean classification error) in 20-fold cross-validation (F).
Figure 7—figure supplement 2.
Impact of specific variables on the quality of participant clustering.
The clusters of participants clusters were defined with the optimally performing self-organizing map (SOM)/HCl algorithm with simple matching distance (SMD) observation dissimilarity measure as presented in Figure 7 and Figure 7—figure supplement 1. The impact of a particular clustering variable was determined by comparing the ‘explained’ clustering variance (between-cluster to total sum-of-squares ratio) between the initial cluster structure and the structure with random resampling of the variable. Differences in the clustering variances for the most influential clustering variables (Δ clustering variance > 0) are presented in the plot. Q1, Q3: first, third quartile of anti-S1/S2 IgG titer; CKD: chronic kidney disease; GI: gastrointestinal; CVD: cardiovascular disease; PD: pulmonary disease; GITD: gastrointestinal disease.
Figure 8.
Frequency of persistent radiological lung abnormalities, functional lung impairment, and symptoms in the participant clusters.
The clusters of study participants were defined by non-lung function and non-computed tomography (non-CT) features as presented in Figure 7. Frequencies of outcome variables at the 180-day follow-up visit (mild [severity score ≤ 5], moderate-to-severe lung CT abnormalities [severity score > 5], functional lung impairment, and persistent symptoms) were compared between the low-risk (LR), intermediate-risk (IR), and high-risk (HR) participant clusters by χ2 test corrected for multiple testing with the Benjamini–Hochberg method. p-Values and numbers of participants assigned to the clusters are indicated in the plots. (A) Frequencies of the outcome features in the participant clusters. (B) Frequencies of specific symptoms in the participant clusters.
Figure 8—figure supplement 1.
Risk of radiological lung abnormalities at the 180-day follow-up in the participant clusters.
The clusters of participants were defined by non-lung function and non-computed tomography (non-CT) clinical features of acute COVID-19 and early convalescence (60-day follow-up visit, Appendix 1—table 1) with the optimally performing HCl algorithm with simple matching distance (SMD) observation dissimilarity measure as presented in Figure 7 and Figure 7—figure supplement 1. (A) Distribution of mild, moderate, severe, and critical acute COVID-19 cases in the participant clusters. Significance of the distribution differences was assessed with χ2 test. The numbers of participants assigned to the clusters are indicated under the plot. (B) Association of the participant cluster assignment (LR: low-risk; IR: intermediate-risk; HR: high-risk cluster) with the risk of any lung CT abnormalities and moderate-to-severe lung CT abnormalities (severity score > 5) at the 180-day follow-up visit was investigated by logistic modeling with and without inclusion of the acute COVID-19 severity effect (severity-adjusted). Odds ratio (OR) significance was determined by Wald Z test and corrected for multiple testing with the Benjamini–Hochberg method. ORs with 95% confidence intervals are presented in forest plots. Numbers of complete observations, outcome events, participants in the clusters, and the acute COVID-19 severity subsets are indicated under the plot.
Prediction of persistent symptoms and pulmonary abnormalities by machine learning
Finally, we investigated if the 6-month follow-up outcome may be predicted by ML classifiers trained with a set of non-CT and non-LF variables recorded during acute COVID-19 and at the 60-day follow-up (Appendix 1—table 1). To this end, five technically unrelated ML classifiers were tested (Appendix 1—table 4; Kuhn, 2008): C5.0 (Quinlan, 1993), random forests (RF) (Breiman, 2001), support vector machines with radial kernel (SVM-R) (Weston and Watkins, 1998), shallow neural network (Nnet) (Ripley, 2014), and elastic net generalized linear regression (glmNet) (Friedman et al., 2010). In addition, the single classifiers with varying outcome-specific accuracy (Figure 9—figure supplement 1) were bundled into ensembles by the elastic net procedure (Figure 9—figure supplement 2, Appendix 1—table 4; Kuhn, 2008; Deane-Mayer and Knowles, 2019). Finally, the classifier and ensemble performance was investigated in the training cohort and 20-fold CV by ROC (Appendix 1—table 5).
All tested ML algorithms and ensembles demonstrated good accuracy (area under the curve [AUC] > 0.78) and sensitivity (>0.84) at predicting any lung CT abnormalities at the 6-month follow-up in the study cohort serving as a training data set. Their efficiency in CV was moderate (AUC: 0.69–0.81; sensitivity: 0.69–0.78) (Figure 9, Figure 9—figure supplement 3, Appendix 1—table 5). In turn, moderate-to-severe structural lung findings were recognized with markedly lower sensitivity both in the training data set (>0.43) and the CV (0.39–0.48). Even though impaired LF and persistent symptoms were common at the 6-month follow-up in the training data set (Figures 2 and 3), nearly half of the cases were not identified by any of the tested ML algorithms and their ensembles in the CV setting (Figure 9, Figure 9—figure supplement 3, Appendix 1—table 5). The sensitivity of the ensembles and single classifiers at predicting CT and LF abnormalities was substantially better in severe and critical COVID-19 survivors than in ambulatory and moderate cases (Figure 10, Appendix 1—table 6).
Figure 9.
Prediction of persistent radiological lung abnormalities, functional lung impairment, and symptoms by machine learning algorithms.
Single machine learning classifiers (C5.0; RF: random forests; SVM-R: support vector machines with radial kernel; NNet: neural network; glmNet: elastic net) and their ensemble (Ens) were trained in the cohort data set with 52 non-computed tomography (non-CT) and non-lung function binary explanatory variables recorded for acute COVID-19 or at the 60-day follow-up visit (Appendix 1—table 1) for predicting outcome variables at the 180-day follow-up visit (any lung CT abnormalities, moderate-to-severe lung CT abnormalities [severity score > 5], functional lung impairment, and persistent symptoms) (Appendix 1—table 4). The prediction accuracy was verified by repeated 20-fold cross-validation (five repeats). Receiver-operating characteristics (ROCs) of the algorithms in the cross-validation are presented: area under the curve (AUC), sensitivity (Sens), and specificity (Spec) (Appendix 1—table 5). The numbers of complete observations and outcome events are indicated under the plots.
Figure 9—figure supplement 1.
Correlation of the machine learning algorithm prediction accuracy.
Machine learning classifiers (C5.0; RF: random forests; SVM-R: support vector machines with radial kernel; NNet: neural network; glmNet: elastic net) were trained in the cohort data set with 52 non-computed tomography (non-CT) and non-lung function binary explanatory variables recorded for acute COVID-19 or at the early 60-day follow-up visit (Appendix 1—table 1) for predicting outcome variables at the 180-day follow-up visit (any lung CT abnormalities, moderate-to-severe lung CT abnormalities [severity score > 5], functional lung impairment, and persistent symptoms) (Figure 9, Appendix 1—table 4). The prediction accuracy was verified by repeated 20-fold cross-validation (five repeats). Pearson’s correlation coefficients of the classifier prediction accuracy in the cross-validation are presented as heat maps. Numbers of complete observations and outcome events are indicated under the plots.
Figure 9—figure supplement 2.
Machine learning model ensembles.
Single machine learning classifiers (C5.0; RF: random forests; SVM-R: support vector machines with radial kernel; NNet: neural network; glmNet: elastic net) were trained as shown in Figure 9. The model ensembles based on the single classifiers were constructed with the glmNet procedure (Appendix 1—table 4). glmNet regression coefficients (β) are presented in the plots. Point and text color correspond to the β value. Numbers of complete observations and outcome events are indicated under the plots.
Figure 9—figure supplement 3.
Prediction of persistent radiological lung abnormalities, functional lung impairment, and symptoms by machine learning algorithms in the training data sets.
Single machine learning classifiers (C5.0; RF: random forests; SVM-R: support vector machines with radial kernel; NNet: neural network; glmNet: elastic net) and their ensembles were trained as shown in Figure 9. Performance of the classifiers in the training data sets was investigated by receiver-operating characteristic (ROC) of the algorithms (AUC: area under the curve; Sens: sensitivity; Spec: specificity, Appendix 1—table 5). Numbers of complete observations and outcome events are indicated under the plots.
Figure 9—figure supplement 4.
Variable importance statistics for prediction of lung computed tomography (CT) abnormalities at the 180-day follow-up by machine learning classifiers.
C5.0, random forests (RF), and elastic net (glmNet) classifiers were trained as presented in Figure 9 for prediction of any lung CT abnormalities at the 180-day follow-up visit. Variable importance measures (C5.0: % attribute/variable usage in the tree model (A); RF: difference in Gini index (B); glmNet: absolute value of the regression coefficient β (C)) for the 10 most influential explanatory variables are presented. CKD: chronic kidney disease; Q1, Q4: first, fourth quartile of anti-S1/S2 IgG titer; PD: pulmonary disease; CKD: chronic kidney disease.
Figure 9—figure supplement 5.
Variable importance statistics for prediction of moderate-to-severe lung computed tomography (CT) abnormalities at the 180-day follow-up by machine learning classifiers.
C5.0, random forests (RF), and elastic net (glmNet) classifiers were trained as presented in Figure 9 for prediction of moderate-to-severe lung CT abnormalities (severity score > 5) at the 180-day follow-up visit. Variable importance measures (C5.0: % attribute/variable usage in the tree model (A); RF: difference in Gini index (B); glmNet: absolute value of the regression coefficient β (C)) for the 10 most influential explanatory variables are presented. PD: pulmonary disease; GITD: gastrointestinal disease; Q1, Q2, Q4: first, second, fourth quartile of anti-S1/S2 IgG titer.
Figure 9—figure supplement 6.
Variable importance statistics for prediction of functional lung impairment at the 180-day follow-up by machine learning classifiers.
C5.0, random forests (RF), and elastic net (glmNet) classifiers were trained as presented in Figure 9 for prediction of functional lung impairment at the 180-day follow-up visit. Variable importance measures (C5.0: % attribute/variable usage in the tree model (A); RF: difference in Gini index (B); glmNet: absolute value of the regression coefficient β (C)) for the 10 most influential explanatory variables are presented. CKD: chronic kidney disease; Q1, Q2: first. second quartile of anti-S1/S2 IgG titer.
Figure 9—figure supplement 7.
Variable importance statistics for prediction of persistent symptoms at the 180-day follow-up by machine learning classifiers.
C5.0, random forests (RF), and elastic net (glmNet) classifiers were trained as presented in Figure 9 for prediction of persistent symptoms at the 180-day follow-up visit. Variable importance measures (C5.0: % attribute/variable usage in the tree model (A); RF: difference in Gini index (B); glmNet: absolute value of the regression coefficient β (C)) for the 10 most influential explanatory variables are presented. CVD: cardiovascular disease; GITD: gastrointestinal disease; COPD: chronic obstructive lung disease.
Figure 10.
Performance of the machine learning ensemble classifier in mild-to-moderate and severe-to-critical COVID-19 convalescents.
The machine learning classifier ensemble (Ens) was developed as presented in Figure 9. Its performance at predicting outcome variables at the 180-day follow-up visit (any computed tomography [CT] lung abnormalities, moderate-to-severe lung CT abnormalities [severity score > 5], functional lung impairment, and persistent symptoms) in the entire cohort, mild-to-moderate (outpatient or hospitalized without oxygen), and severe-to-critical COVID-19 convalescents (oxygen therapy or ICU) in repeated 20-fold cross-validation (five repeats) was assessed by receiver-operating characteristic (ROC) (Appendix 1—table 6). ROC curves and statistics (AUC: area under the curve; Se: sensitivity; Sp: specificity) in the cross-validation are shown. Numbers of complete observations and outcome events are indicated in the plots.
The most important explanatory variables for pulmonary abnormalities by three unrelated classifiers (C5.0, RF, and glmNet) included preexisting malignancy, multimorbidity, markers of systemic inflammation (IL-6 and CRP), and anti-S1/S2 antibody levels at the 60-day follow-up (Figure 9—figure supplement 4, Figure 9—figure supplement 5, Figure 9—figure supplement 6). The highly influential parameters at prediction of symptoms at the 180-day follow-up encompassed symptom presence at the 60-day follow-up, as well as obesity and dyspnea during acute COVID-19 (Figure 9—figure supplement 7).
Discussion
Herein, we prospectively evaluated trajectories of COVID-19 recovery in an observational cohort enrolled in the Austrian CovILD study (Sonnweber et al., 2021). Despite the resolution of symptoms and pulmonary abnormalities at the 6-month follow-up in a large fraction of the study participants, the recovery pace was substantially slower in the late convalescence when compared with the first three months after diagnosis (Sonnweber et al., 2021; Huang et al., 2021a). Persistent symptoms and CT findings were detected in more than 40% and reduced LF in approximately one-third of the cohort, which is in line with recovery kinetics and signs of lung lesion chronicity reported by others (Caruso et al., 2021; Huang et al., 2021b; Huang et al., 2021a; Faverio et al., 2021; Hellemons et al., 2021; Zhou et al., 2021). By comparison, similar protracted pulmonary recovery was reported for SARS (Hui et al., 2005; Ng et al., 2004; Ngai et al., 2010; Lam et al., 2009) and non-COVID-19 acute respiratory distress syndrome (Wilcox et al., 2013; Masclans et al., 2011). Of note, treatment approaches for hospitalized patients in our cohorts and similar cohorts recruited at the pandemic onset in early 2020 (Caruso et al., 2021; Huang et al., 2021b; Huang et al., 2021a; Faverio et al., 2021; Hellemons et al., 2021) differ significantly from the current standard of care for acute COVID-19, which includes early systemic steroid use and antiviral and various immunomodulatory medications. How improved standardized therapy and anti-SARS-CoV-2 vaccination affect the clinical and pulmonary recovery needs to be investigated.
In roughly half of our study participants with abnormal lung CT findings, and especially in those with low-grade structural abnormalities, no overt LF impairment at follow-up was discerned. Still, even subclinical lung alterations may bear the potential for clinically relevant progression of interstitial lung disease (Suliman et al., 2015; Hatabu et al., 2020) requiring systematic CT and LF monitoring. Conversely, symptom persistence was weakly associated with incomplete functional or structural pulmonary recovery.
Since PASC are found in as many as 10% of COVID-19 patients (Sahanic et al., 2021; Venkatesan, 2021; Sudre et al., 2021b), robust, resource-saving tools assessing the individual risk of pulmonary complications are urgently needed (Shah et al., 2021; Raghu and Wilson, 2020). Covariates and characteristics of severe acute COVID-19 such as male sex, age, and preexisting comorbidities, hospitalization, ventilation, and ICU stay were proposed as the risk factors of persistent pulmonary impairment (Sonnweber et al., 2021; Caruso et al., 2021; Huang et al., 2021a; Faverio et al., 2021; Raghu and Wilson, 2020). However, their applicability in predicting complications of pulmonary recovery from mild or moderate COVID-19 is limited. Our results of univariate modeling, clustering, and ML prediction point towards a distinct long-term pulmonary risk phenotype that manifests during acute COVID-19 and early recovery and whose central components are protracted systemic (IL-6, CRP, anemia of inflammation) and microvascular inflammation (D-dimer), and strong humoral response (anti-S1/S2 IgG) demographic risk factors and comorbidities (Sonnweber et al., 2020). Hence, consecutive monitoring of systemic inflammatory parameters analogous to concepts of interstitial lung disease in autoimmune disorders (Khanna et al., 2020) and anti-S1/S2 antibody levels may improve identification of the individuals at risk of chronic pulmonary damage irrespective of the acute COVID-19 severity.
Clustering and ML have been employed for deep phenotyping and predicting acute and post-acute COVID-19 outcomes in multivariable data sets (Sahanic et al., 2021; Sudre et al., 2021a; Estiri et al., 2021; Demichev et al., 2021; Benito-León et al., 2021). We demonstrate that subsets of COVID-19 patients that significantly differ in the risk for long-term CT abnormalities may be defined by an easily accessible clinical parameter set available at the early post-COVID-19 assessment. This approach did not involve any CT or LF variables. Furthermore, the cluster classification correlated with the risk of long-term pulmonary abnormalities independently of the acute COVID-19 severity. Thus, these characteristics provide a useful tool for broad screening of convalescent populations, including individuals who experienced mild or moderate COVID-19.
We show that technically unrelated ML classifiers and their ensemble trained without CT and LF explanatory variables can predict lung CT findings independently of their grading at the 6-month follow-up with good specificity and sensitivity in the training collective and CV. By contrast, the more specific prediction of moderate-to-severe lung CT or risk estimation for LF deficits demonstrated a limited sensitivity. For the moderate-to-severe CT abnormalities, this can be primarily traced back to their low frequency resulting in a suboptimal classifier training, especially in CV. A substantial fraction of the participants (20.7%, n = 30) suffered from a preexisting respiratory condition (pulmonary disease, asthma, or COPD) likely paralleled by LF reduction, which possibly confounded the prediction of the post-COVID-19 LF deficits both by clustering and ML. Accumulating evidence suggests that post-acute COVID-19 symptoms are highly heterogeneous conditions with multiorgan, neurocognitive, and psychological manifestations (Sahanic et al., 2021; Evans et al., 2021; Davis et al., 2021), which may differ in risk factor constellations. This could explain why univariate modeling, clustering, and ML failed to estimate persistent symptom risk in our small study cohort. In general, the ML prediction quality may greatly benefit from a larger training data set and inclusion of additional explanatory variables such as cellular readouts of inflammation, in-depth medication, and broader acute symptom data. Nevertheless, the herein described cluster- and ML classifiers represent resource-effective tools that may assist in the screening of medical record data and identification of COVID-19 patients requiring systematic CT and LF monitoring. To facilitate the identification of patients at risk for protracted respiratory recovery and enable validation in an external collective, we implemented the clustering and prediction procedures in an open-source risk assessment application (https://im2-ibk.shinyapps.io/CovILD/).
Our study bears limitations primarily concerning the low sample size and the cross-sectional character of the trial. Because of the impaired availability of the patients and the prolonged inpatient rehabiliation, the 60- and 100-day follow-up visits in part showed a temporal overlap that may have impacted the accuracy of the longitudinal data. Missingness of the consecutive outcome variable record and the participant dropout, particularly of mild and moderate COVID-19 cases, may have also potentially confounded the participant clustering results and ML risk estimation for CT abnormalities and LF impairment since prolonged hospitalization was found to be a crucial cluster-defining and influential explanatory feature. Additionally, even though the reproducibility of the risk assessment algorithms was partially addressed by CV, cluster and ML classifiers call for verification in a larger, independent multicenter collective of COVID-19 convalescents.
In summary, in our CovILD study cohort we found a high frequency of CT and LF abnormalities and persistent symptoms at the 6-month follow-up, and a flattened recovery kinetics after 3 months post-COVID-19. Systematic risk modeling reveled a set of clinical variables linked to protracted pulmonary recovery apart from the severity of acute infection such as inflammatory markers, anti-S1/S2 IgG levels, multimorbidity, and male sex. We demonstrate that clustering and ML classifiers may help to identify individuals at risk of persistent lung lesions and to relocate medical resources to prevent long-term disability.
You have requested "on-the-fly" machine translation of selected content from our databases. This functionality is provided solely for your convenience and is in no way intended to replace human translation. Show full disclaimer
Neither ProQuest nor its licensors make any representations or warranties with respect to the translations. The translations are automatically generated "AS IS" and "AS AVAILABLE" and are not retained in our systems. PROQUEST AND ITS LICENSORS SPECIFICALLY DISCLAIM ANY AND ALL EXPRESS OR IMPLIED WARRANTIES, INCLUDING WITHOUT LIMITATION, ANY WARRANTIES FOR AVAILABILITY, ACCURACY, TIMELINESS, COMPLETENESS, NON-INFRINGMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Your use of the translations is subject to all use restrictions contained in your Electronic Products License Agreement and by using the translation functionality you agree to forgo any and all claims against ProQuest or its licensors for your use of the translation functionality and any output derived there from. Hide full disclaimer
© 2022, Sonnweber et al. This work is published under https://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
Background:
The optimal procedures to prevent, identify, monitor, and treat long-term pulmonary sequelae of COVID-19 are elusive. Here, we characterized the kinetics of respiratory and symptom recovery following COVID-19.
Methods:
We conducted a longitudinal, multicenter observational study in ambulatory and hospitalized COVID-19 patients recruited in early 2020 (n = 145). Pulmonary computed tomography (CT) and lung function (LF) readouts, symptom prevalence, and clinical and laboratory parameters were collected during acute COVID-19 and at 60, 100, and 180 days follow-up visits. Recovery kinetics and risk factors were investigated by logistic regression. Classification of clinical features and participants was accomplished by unsupervised and semi-supervised multiparameter clustering and machine learning.
Results:
At the 6-month follow-up, 49% of participants reported persistent symptoms. The frequency of structural lung CT abnormalities ranged from 18% in the mild outpatient cases to 76% in the intensive care unit (ICU) convalescents. Prevalence of impaired LF ranged from 14% in the mild outpatient cases to 50% in the ICU survivors. Incomplete radiological lung recovery was associated with increased anti-S1/S2 antibody titer, IL-6, and CRP levels at the early follow-up. We demonstrated that the risk of perturbed pulmonary recovery could be robustly estimated at early follow-up by clustering and machine learning classifiers employing solely non-CT and non-LF parameters.
Conclusions:
The severity of acute COVID-19 and protracted systemic inflammation is strongly linked to persistent structural and functional lung abnormality. Automated screening of multiparameter health record data may assist in the prediction of incomplete pulmonary recovery and optimize COVID-19 follow-up management.
Funding:
The State of Tyrol (GZ 71934), Boehringer Ingelheim/Investigator initiated study (IIS 1199-0424).
Clinical trial number:
ClinicalTrials.gov: NCT04416100
You have requested "on-the-fly" machine translation of selected content from our databases. This functionality is provided solely for your convenience and is in no way intended to replace human translation. Show full disclaimer
Neither ProQuest nor its licensors make any representations or warranties with respect to the translations. The translations are automatically generated "AS IS" and "AS AVAILABLE" and are not retained in our systems. PROQUEST AND ITS LICENSORS SPECIFICALLY DISCLAIM ANY AND ALL EXPRESS OR IMPLIED WARRANTIES, INCLUDING WITHOUT LIMITATION, ANY WARRANTIES FOR AVAILABILITY, ACCURACY, TIMELINESS, COMPLETENESS, NON-INFRINGMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Your use of the translations is subject to all use restrictions contained in your Electronic Products License Agreement and by using the translation functionality you agree to forgo any and all claims against ProQuest or its licensors for your use of the translation functionality and any output derived there from. Hide full disclaimer