INTRODUCTION
Cystic fibrosis (CF) is a genetic disease that results in excessive mucus production, which reduces lung function and impedes the release of pancreatic enzymes (1, 2). While digestive problems are highly prevalent among CF patients (3), approximately 80 to 95% of CF deaths are attributable to respiratory failure due to chronic airway infections and associated inflammation (1). The Cystic Fibrosis Foundation (CFF) estimates that approximately 70,000 CF patients are living worldwide and about 1,000 new CF cases are diagnosed in the United States each year (www.cff.org). Following Koch’s postulates (4), the traditional view of CF lung infections has been that specific airway pathogens are responsible for monomicrobial infections (5). CF bacterial pathogens that have been identified from patient sputum samples and commonly studied in vitro using pure culture include
With the advent of culture-independent techniques such as 16S rRNA gene amplicon library sequencing, sputum and bronchoscopy samples from CF patients can be analyzed systematically with respect to the diversity and abundance of bacterial taxa present (7, 8). Numerous studies have shown that CF airway infections are rarely monomicrobial, but rather the CF lung harbors a complex community of bacteria that originate from the mouth, skin, intestine, and the environment (7–10). 16S sequencing can reliably delineate community members down to the genus level, showing that the most common genera in adult CF patient samples are Streptococcus, Pseudomonas, Prevotella, Veillonella, Neisseria, Porphyromonas, and Catonella (7). While the identities and relative abundances of the genera present can be determined by 16S rRNA gene sequencing, different analysis techniques are required to understand the interactions between the multiple bacterial taxa and the CF lung environment, the role of the individual microbes in shaping community composition and behavior, and the impact of community composition on the efficacy of antibiotic treatment regimens. While microbiota cooccurrence networks have provided important insights into interactions between bacterial taxa colonizing the CF lung (11, 12), these methods require species abundance data as inputs and therefore are not fully predictive.
In silico metabolic modeling has emerged as a powerful approach for analyzing complex microbial communities by integrating genome-scale reconstructions of single-species metabolism within mathematical descriptions of metabolically interacting communities (13, 14). Modeled species interactions typically include competition for host-derived nutrients and cross-feeding of secreted by-products such as organic acids, alcohols, and amino acids between species (15, 16). Due to challenges in developing manually curated reconstructions of poorly studied species, including those present in the CF lung, most in silico community models have been restricted to ∼5 microbial species (17–19) and fail to adequately cover the diversity of in vivo communities. This limitation can be overcome in bacterial communities by using semicurated reconstructions developed through computational pipelines such as ModelSEED (20), AGORA (21), and other methods (22). Given the availability of suitable single-strain metabolic reconstructions, a number of alternative methods have been developed for mathematical formulation and numerical solution of microbial community models (23–26). The recently developed SteadyCom method is particularly notable because its formulation ensures proper balancing of metabolites across the species and scales to large communities (27). A properly formulated community model can yield information that is difficult to ascertain experimentally, including the effects of the host environment on community growth, species abundances, and cross-fed metabolite secretion and uptake rates.
In this paper, we utilized 16S rRNA gene amplicon library sequencing data from three published studies (28–30) to develop a 17-species bacterial community model for predicting species abundances in CF airway communities (Fig. 1). The 16S rRNA gene sequence data covers 75 distinct sputum samples from 46 adult CF patients and captures the heterogeneity of CF polymicrobial infections with respect to taxonomic diversity and the prevalence of pathogens, including Pseudomonas, Streptococcus, Burkholderia, Achromobacter, and
FIG 1
Overview of the community metabolic modeling framework driven by patient microbiota composition data. (A) 16S rRNA gene sequence data for 46 patients averaged across 75 distinct samples for the 72 highest-ranked taxonomic groups (typically genera). (B) 16S rRNA gene sequence data for the 17 highest-ranked taxonomic groups normalized to sum to unity and then averaged across the 75 samples. The error bars represent the variances of the normalized read data. (C) AGORA strain models (21) are selected for 17 species that represent each taxonomic group. (D) Definition of the nutrient environment through specification of the community uptake rate of each extracellular metabolite. (E) Species abundances predicted from a SteadyCom (27) simulation with nominal community uptake rates compared to normalized reads for a random patient sample. (F) Average species abundances predicted from an ensemble of SteadyCom simulations with randomized community uptake rates compared to normalized reads averaged across the patient samples.
RESULTS
Few taxonomic groups dominate the CF airway community samples.
Principal-component analysis (PCA) was performed on the normalized read data of the 75 samples to evaluate sample heterogeneity. The first three principal components (PCs) captured 77.8% of the data variance, with the first PC capturing 57.3% of variance and most heavily weighting the most abundant genera Pseudomonas, Streptococcus, and Prevotella as expected (see Table S1 in the supplemental material). A considerable degree of heterogeneity was evident from a plot of the 75 samples in the coordinates defined by the first three PCs (Fig. 2A). Most striking were the outlier samples from three patients infected with
FIG 2
PCA performed on the normalized read data. (A) PCA performed for all 75 samples with the normalized reads for each taxonomic group plotted using the first three principal components (PCs) that explained 57.3%, 12.3%, and 8.2%, respectively, of the data variance. Samples containing
Because each pathogen infected only a single patient among the 46 included patients, we generated a smaller data set of 67 samples by removing these 8 samples. When PCA was performed on this reduced data set, the first three PCs explained 92.6% of the data variance (Table S2), suggesting substantially reduced heterogeneity compared to the full data set. These three PCs heavily weighted only the four taxonomic groups Pseudomonas, Streptococcus, Prevotella, and Haemophilus, with the first PC representing high Pseudomonas and low Streptococcus, the second PC representing high Streptococcus and moderate Pseudomonas, and the third PC representing high Haemophilus, low Pseudomonas, and low Streptococcus. Considerable heterogeneity was evident when the 67 samples were plotted using the first two PCs accounting for 84.2% of the variance (Fig. 2B). Here the first PC represented high Pseudomonas, low Streptococcus, moderate Prevotella, and moderate Haemophilus, and the second PC represented low Pseudomonas, high Streptococcus, low Prevotella, and low Haemophilus.
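The fraction of variance explained by each PC can be illustrated with a minimal sketch. The code below is not the analysis pipeline used in this study (the software is not specified in this section). It assumes PCA on the sample covariance matrix of normalized reads and uses plain power iteration with deflation, and the read values are made up for illustration only.

```python
import math


def covariance_matrix(rows):
    """Sample covariance of the columns of `rows` (samples x genera)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def leading_eigenpair(matrix, iterations=1000):
    """Power iteration for the largest eigenvalue of a symmetric PSD matrix."""
    d = len(matrix)
    # Non-uniform start: for reads normalized to sum to 1, the all-ones
    # vector lies in the null space of the covariance matrix.
    vec = [float(j + 1) for j in range(d)]
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm < 1e-15:  # no variance left to explain
            return 0.0, vec
        vec = [x / norm for x in nxt]
    mv = [sum(matrix[i][j] * vec[j] for j in range(d)) for i in range(d)]
    return sum(v * m for v, m in zip(vec, mv)), vec


def explained_variance(rows, n_components):
    """Fraction of total variance captured by each of the first PCs."""
    cov = covariance_matrix(rows)
    d = len(cov)
    total = sum(cov[i][i] for i in range(d))
    fractions = []
    for _ in range(n_components):
        value, vec = leading_eigenpair(cov)
        fractions.append(value / total)
        # Deflate: remove the component just found before the next pass.
        cov = [[cov[i][j] - value * vec[i] * vec[j] for j in range(d)]
               for i in range(d)]
    return fractions


# Hypothetical normalized reads (columns: three genera), NOT patient data.
reads = [
    [0.80, 0.15, 0.05],
    [0.60, 0.30, 0.10],
    [0.30, 0.50, 0.20],
    [0.10, 0.60, 0.30],
]
fractions = explained_variance(reads, 2)
```

In this toy data set nearly all variance lies along a single high-first-genus/low-second-genus axis, analogous to the Pseudomonas/Streptococcus trade-off captured by the first PC above.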
Based on these results, we focused our community modeling efforts on predicting the infrequent dominance of the pathogens
TABLE 1
CF genera analyzed^a
Species no. | Species/strain name | Avg reads | Samples with reads >1% (%) |
---|---|---|---|
1 | | 0.447 | 85.3 |
2 | | 0.213 | 88.0 |
3 | | 0.098 | 74.7 |
4 | | 0.029 | 4.0 |
5 | | 0.028 | 22.7 |
6 | | 0.026 | 4.0 |
7 | | 0.026 | 48.0 |
8 | | 0.023 | 26.7 |
9 | | 0.023 | 34.7 |
10 | | 0.016 | 48.0 |
11 | | 0.014 | 2.7 |
12 | | 0.015 | 30.7 |
13 | | 0.012 | 36.0 |
14 | | 0.008 | 18.7 |
15 | | 0.009 | 21.3 |
16 | | 0.006 | 20.0 |
17 | Ralstonia sp. 5 7 47FAA | 0.004 | 6.7 |
^a Shown is a list of the 17 species/strains included in the CF airway community model, the normalized fractional reads for the associated genera averaged across the 75 samples, and the percentage of samples in which the normalized reads exceeded 1%.
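The two summary statistics reported in Table 1 can be sketched as follows. This is an illustrative reconstruction rather than the authors' script: the function names are hypothetical, the read counts are invented, and normalization is assumed to be taken over whichever taxonomic groups are supplied for each sample.

```python
def normalize_reads(sample_reads):
    """Scale one sample's read counts so that they sum to 1."""
    total = sum(sample_reads.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {genus: count / total for genus, count in sample_reads.items()}


def summarize(samples, threshold=0.01):
    """Return {genus: (average normalized reads, % of samples above threshold)}."""
    normalized = [normalize_reads(s) for s in samples]
    genera = sorted({g for s in normalized for g in s})
    summary = {}
    for genus in genera:
        # A genus absent from a sample contributes zero reads.
        values = [s.get(genus, 0.0) for s in normalized]
        average = sum(values) / len(values)
        prevalence = 100.0 * sum(v > threshold for v in values) / len(values)
        summary[genus] = (average, prevalence)
    return summary


# Made-up read counts for three hypothetical samples (not patient data).
samples = [
    {"Pseudomonas": 800, "Streptococcus": 150, "Prevotella": 50},
    {"Pseudomonas": 100, "Streptococcus": 600, "Prevotella": 300},
    {"Pseudomonas": 995, "Streptococcus": 5},
]
table = summarize(samples)
```

Because each sample is normalized before averaging, the average reads across all groups sum to 1, and a genus can have a high average yet a low prevalence when it dominates only a few samples.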
The community model can reproduce dominance of CF pathogens.
We simulated the growth of each species individually with the nominal community nutrient uptake rates (Table S3) to compare their monoculture growth rates. Interestingly, the three highest growth rates belonged to the rare pathogens Escherichia, Burkholderia, and Achromobacter, while the next three highest growth rates belonged to the common pathogens Pseudomonas, Streptococcus, and Staphylococcus (Fig. 3A; species numbered as in Table 1). These predictions were consistent with our modeling results for the gut microbiome (41), where opportunistic pathogens consistently had higher growth rates than commensal species. The other two species commonly observed in the 75 patient samples, Prevotella and Haemophilus, were predicted to have much lower in silico growth rates. The three species representing Fusobacterium, Granulicatella, and Porphyromonas did not grow individually due to their inability to meet the defined ATP maintenance demand, although they could grow when strategically combined with other modeled species. For example, Fusobacterium, Granulicatella, and Porphyromonas were predicted to grow in coculture with Ralstonia, Prevotella, and Actinomyces, respectively. The species abundances predicted for a specified nutrient condition depended on both the monoculture growth rates and the ability of each species to efficiently utilize secreted metabolites to enhance its growth rate. These emergent cross-feeding relationships allowed otherwise slower-growing species to coexist with species that exhibited high monoculture growth rates.
FIG 3
Single-species and community simulations performed with the nominal nutrient uptake rates in Table S3. (A) Single-species growth rates with the species numbered according to Table 1. (B) Comparison of predicted species abundances to the average of the normalized reads for the single patient infected with
We conducted simulations using the nominal nutrient uptake rates (Table S3) to determine if the community model could capture dominance of each rare pathogen in the absence of the other two rare pathogens. Each simulation was performed by constraining the abundances of the other two pathogens to zero, effectively producing reduced communities of 15 species. The predicted abundances from each simulation were compared to the normalized reads averaged over the patient samples which contained the associated pathogen:
We performed simulations for the remaining 43 patients by reducing the community to 14 species by constraining the abundances of all three rare pathogens to zero. The model-predicted abundances were compared to the normalized reads averaged over the 67 samples remaining when the 8 rare pathogen-containing samples were removed (Fig. 3E). The model correctly predicted that Pseudomonas, Streptococcus, and Prevotella would dominate the community, although the Prevotella abundance was overpredicted at the expense of Streptococcus as well as several less abundant genera. The only other genus present in the simulated community was Staphylococcus, while the averaged reads showed a greater amount of diversity. Compared to the averaged data, individual samples showed less diversity, which is more consistent with model predictions as discussed below.
The community model can reproduce pathogen heterogeneity across airway samples.
The CF airway communities exhibited a substantial degree of sample-to-sample heterogeneity when rare pathogens were present (Fig. 2A) or absent (Fig. 2B). We performed simulations to assess the extent to which sample-to-sample differences in taxonomic group reads could be explained by heterogeneity in the metabolic environment of the CF lung. More specifically, we randomized the community nutrient uptake rates around their nominal values (Materials and Methods; also see Table S3) to mimic heterogeneous lung environments shown to occur across CF patients (42, 43) and in longitudinal samples from a single patient (44). An objective of our future research will be to model sample-by-sample variability in individual patients as a function of disease state (e.g., clinically stable, pulmonary exacerbation, and antibiotic treatment). In this study, each simulation with a set of randomized uptake rates was termed a “simulated sample,” and we tested the hypothesis that the experimental samples could be interpreted as having been drawn from the much larger set of simulated samples we generated. Due to the relatively small number of
FIG 4
Taxonomic reads for patient samples containing rare pathogens compared to species abundances predicted from community models with randomized nutrient uptake rates. The genera Pseudomonas, Streptococcus, Prevotella, Haemophilus, and Staphylococcus and the indicated rare pathogen (
FIG 5
Taxonomic reads for patient samples without rare pathogens compared to species abundances predicted from community models with randomized nutrient uptake rates. The genera Pseudomonas, Streptococcus, Prevotella, Haemophilus, and Staphylococcus and the next most abundant genera are shown for each case. Individual models that best fit the 67 patient samples were selected from an ensemble of 1,000 14-species models without
Randomized nutrient simulations were able to generate model predictions that reproduced the major features of the 3
The lack of patient samples containing
The 22 samples which produced moderate prediction errors were characterized by lower and more variable Pseudomonas reads (48% ± 28%) as well as more variable distributions of Streptococcus and Prevotella reads (Fig. 5B). The ensemble of randomized models could capture the relative amounts of these three genera but often predicted the presence of Staphylococcus not observed in the patient samples. This discrepancy could be attributable to the unmodeled ability of Pseudomonas to secrete diffusible toxins which inhibit Staphylococcus respiration and render Staphylococcus less metabolically competitive in partially aerobic environments (49) such as the CF lung. Interestingly, the model ensemble could reproduce the relatively high Ralstonia reads in sample 1 while also predicting no Ralstonia in samples 15 and 69. The 23 samples which produced the largest prediction errors were characterized by much lower Pseudomonas reads (13%), higher reads of Streptococcus and Prevotella (34% and 19%, respectively; e.g., samples 26 and 74 in Fig. 5C), and higher representation of less common genera. These samples also produced higher Haemophilus reads, primarily due to two Haemophilus-dominated samples (e.g., sample 39 in Fig. 5C). While the model ensemble generally was able to reproduce the observed Streptococcus and Prevotella reads in these samples, the models tended to overpredict Pseudomonas and Staphylococcus at the expense of the less common genera. In particular, the ensemble underpredicted the abundances of Rothia, Fusobacterium, and Gemella, while the average reads of these three genera across the 23 samples summed to 16%. This discrepancy could suggest that these 23 samples were obtained from patients with less advanced CF lung disease, which correlates with higher-diversity communities in vivo (30, 50).
To gain further insights into the ability of the community model to mimic sample-to-sample heterogeneity in the absence of rare pathogens, we compared read data and abundance predictions in the PC space calculated from the 67 patient samples. Each of the 1,000 model simulations was mapped into the two-dimensional space defined by the first two PCs (Fig. 2B), which explained 84.2% of normalized read data variance (Table S2). The model ensemble was able to reproduce most of the observed variability as reflected by the cloud of model simulations overlapping 56 of the 67 patient samples (Fig. 6A). The patient and simulated samples covered the same range of the first PC, which was heavily weighted by Pseudomonas, Streptococcus, and Prevotella (Table S2). Importantly, this consistency shows that heterogeneity across these three dominant genera could be predicted from variations in the CF lung metabolic environment, as we hypothesized.
FIG 6
Principal-component analysis (PCA) of taxonomic reads for patient samples without rare pathogens and species abundances predicted from 14-species community models with randomized nutrient uptake rates. (A) Representation of the 67 patient samples (blue crosses labeled with sample number) in the two-dimensional space defined by the first two principal components (PCs) obtained when PCA is performed on the normalized reads of these patient samples. Predicted species abundances (red circles) from an ensemble of 1,000 models transformed into the PC space of the normalized read data. (B) Enlarged view of the lower left portion of the PCA plot in panel A. (C) Average genus reads obtained for 12 samples (samples 5, 6, 10, 39, 42, 43, 49, 57, 61, 68, 70, and 74) in panel B with elevated Prevotella representation compared to the average abundances predicted from the best-fit models for these samples with the species number as in Table 1.
The model ensemble also could reproduce variations in the second PC, which was heavily weighted by the three dominant genera and Haemophilus, for sufficiently large values of the first PC, which corresponded to relatively high Pseudomonas and low Streptococcus and Prevotella. In contrast, the model ensemble did not cover the patient samples in the lower left quadrant of the PC plot (Fig. 6B). These samples were characterized by unusual combinations of relatively high Prevotella, Haemophilus, Rothia, and/or Fusobacterium that the model could not reproduce in its present form. Of these 12 poorly modeled samples, Prevotella was highly represented in 8 samples. When the normalized reads of these 8 samples and their associated best-fit abundances were averaged, the models overpredicted Pseudomonas, Streptococcus, and Staphylococcus at the expense of the less common genera (Fig. 6C).
The community model predicts that pathogen dominance is driven by metabolite cross-feeding.
To investigate putative metabolic mechanisms by which pathogens may establish dominance in the CF lung, we used model predictions to quantify rates of metabolite cross-feeding between species. For each rare pathogen (Escherichia, Burkholderia, and Achromobacter), 100 simulations performed with randomized community uptake rates were used to calculate average exchange rates of the five most significantly cross-fed metabolites between Pseudomonas, Streptococcus, and the pathogen of interest. The overall metabolite exchange rate from one species to another species was calculated by determining the minimum uptake or secretion rate for each exchanged metabolite and then summing these minimum rates over all exchanged metabolites.
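The overall exchange rate described above can be written as a short sketch. It follows the stated rule (per-metabolite minimum of secretion and uptake, summed over exchanged metabolites) and the sign convention of Fig. 7 (positive for secretion, negative for uptake). The function name and the flux values are hypothetical, and attributing exchange pairwise in communities with more than two species is a simplifying assumption of this sketch.

```python
def overall_exchange_rate(donor_fluxes, recipient_fluxes):
    """Sum, over metabolites secreted by the donor and taken up by the
    recipient, of the smaller of the secretion and uptake magnitudes.

    Fluxes are signed: positive = secretion, negative = uptake.
    """
    total = 0.0
    for metabolite, donor_rate in donor_fluxes.items():
        recipient_rate = recipient_fluxes.get(metabolite, 0.0)
        if donor_rate > 0 and recipient_rate < 0:
            # The exchange cannot exceed what is secreted or what is consumed.
            total += min(donor_rate, -recipient_rate)
    return total


# Made-up exchange fluxes for illustration (not model output).
streptococcus = {"formate": 4.0, "acetate": 2.5, "alanine": 0.5, "serine": -0.3}
burkholderia = {"formate": -3.0, "acetate": -4.0, "alanine": -0.2, "serine": 0.6}

strep_to_burk = overall_exchange_rate(streptococcus, burkholderia)
burk_to_strep = overall_exchange_rate(burkholderia, streptococcus)
```

Taking the minimum keeps the pairwise rate conservative: a metabolite secreted at a high rate but consumed at a low rate contributes only the consumed amount.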
Escherichia was predicted to consume the organic acids acetate, formate, and
FIG 7
Predicted metabolite cross-feeding relationships for 15-species communities containing Escherichia, Burkholderia, or Achromobacter. Negative rates denote metabolite uptake, and positive rates denote metabolite secretion. The overall metabolite exchange rate from one species to another species was calculated by determining the minimum uptake or secretion rate for each exchanged metabolite and then summing these minimum rates over all exchanged metabolites. The arrow thickness is proportional to the overall metabolite exchange rate between the two species. (A) Average exchange rates of the five highest cross-fed metabolites between the three most abundant species for 100 model ensemble simulations containing Escherichia. (B) Average exchange rates of the five highest cross-fed metabolites between the three most abundant species for 100 model ensemble simulations containing Burkholderia. (C) Average exchange rates of the five highest cross-fed metabolites between the three most abundant species for 100 model ensemble simulations containing Achromobacter. (D) Schematic representation of overall metabolite exchange rates for Escherichia-containing communities corresponding to panel A. Pseudomonas was omitted due to its low exchange rates compared to the other two species. (E) Schematic representation of overall metabolite exchange rates for Burkholderia-containing communities corresponding to panel B. (F) Schematic representation of overall metabolite exchange rates for Achromobacter-containing communities corresponding to panel C.
More complex cross-feeding relationships were predicted for Burkholderia-containing communities that supported average Pseudomonas and Streptococcus abundances both exceeding 10%. The highest exchange rates were predicted for formate and acetate produced by Streptococcus and consumed by Burkholderia (Fig. 7B and E). The two species also exchanged amino acids, with Streptococcus providing alanine to Burkholderia and Burkholderia producing aspartate and serine for Streptococcus. Burkholderia provided the same two amino acids to Pseudomonas while receiving a small exchange of acetate in return. Pseudomonas also consumed formate secreted by Streptococcus. These model predictions suggested that acetate, formate, and alanine produced by Streptococcus via heterolactic fermentation (52) could promote Burkholderia growth in vivo. Indeed, in vitro experiments have shown that mucin-degrading anaerobes such as streptococci may promote the growth of CF pathogens such as
Compared to the other two pathogens, Achromobacter was predicted to be less efficient at cross-feeding, having only low uptake rates of alanine,
Similar cross-feeding analyses were performed for 1,000 simulations with randomized nutrient uptake rates in 14-species communities lacking Escherichia, Burkholderia, and Achromobacter. To investigate the possibility of differential cross-feeding patterns, the simulations were split into 500 cases with the highest Pseudomonas abundances and 500 cases with the lowest Pseudomonas abundances (Fig. 8A). For each set of 500 simulations, the average exchange rates of the five most significantly cross-fed metabolites between the four most abundant species (Pseudomonas, Streptococcus, Prevotella, and Staphylococcus) were calculated. The overall metabolite exchange rate between any two species was calculated from the individual metabolite uptake and secretion rates as before.
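The high/low split can be sketched as a ranking followed by a division into two equal halves. The helper names and the abundance values below are hypothetical, and the sketch uses four simulated samples in place of the 1,000 used here.

```python
def split_by_abundance(simulations, species):
    """Rank simulations by the abundance of `species` and return
    (upper half, lower half)."""
    ranked = sorted(simulations, key=lambda sim: sim[species], reverse=True)
    half = len(ranked) // 2
    return ranked[:half], ranked[half:]


def mean_abundance(simulations, species):
    """Average abundance of `species` across a set of simulations."""
    return sum(sim[species] for sim in simulations) / len(simulations)


# Made-up predicted abundances for four simulated samples.
simulations = [
    {"Pseudomonas": 0.70, "Streptococcus": 0.20},
    {"Pseudomonas": 0.30, "Streptococcus": 0.45},
    {"Pseudomonas": 0.55, "Streptococcus": 0.25},
    {"Pseudomonas": 0.35, "Streptococcus": 0.40},
]
high, low = split_by_abundance(simulations, "Pseudomonas")
high_mean = mean_abundance(high, "Pseudomonas")
low_mean = mean_abundance(low, "Pseudomonas")
```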
FIG 8
Predicted metabolite cross-feeding relationships for 14-species communities without Escherichia, Burkholderia, and Achromobacter. One thousand model ensemble simulations were performed and split into 500 cases with relatively high Pseudomonas abundances and 500 cases with relatively low Pseudomonas abundances. (A) Average abundances of the five most highly represented species for the high- and low-Pseudomonas-abundance cases. (B) Average exchange rates of the five highest cross-fed metabolites between the four most abundant species for high-Pseudomonas-abundance cases. (C) Average exchange rates of the five highest cross-fed metabolites between the four most abundant species for low-Pseudomonas-abundance cases. (D) Schematic representation of overall metabolite exchange rates for high-Pseudomonas-abundance cases corresponding to panel B. (E) Schematic representation of overall metabolite exchange rates for low-Pseudomonas-abundance cases corresponding to panel C.
When Pseudomonas abundances were predicted to be relatively high (average of 61%), community interactions were dominated by Pseudomonas consumption of formate, ethanol, acetate, and aspartate secreted by the other three species (Fig. 8B). Formate cross-feeding was predicted to be particularly important, which was consistent with an in vitro study showing that expression of the
When Pseudomonas abundances were predicted to be relatively low (average of 32%), metabolite cross-feeding remained dominated by Pseudomonas consumption of secreted by-products and amino acids (Fig. 8C). Pseudomonas was predicted to have high consumption rates of formate produced by all three other species and
DISCUSSION
The airways of cystic fibrosis (CF) patients are commonly infected by complex communities of interacting bacteria, fungi, and viruses which complicate disease assessment and treatment. The unique bacterial communities resident in individual patients can be longitudinally resolved to the genus level by applying 16S rRNA gene amplicon library sequencing to sputum and bronchoscopy samples (8). While 16S rRNA gene sequencing technology provides an unprecedented capability to identify bacterial pathogens in the CF lung, other analyses are required to understand how community members interact and how these interactions impede or promote disease progression. Metabolomics represents a powerful tool to interrogate the complex metabolic environment of the CF lung (63), but the number and depth of studies published to date have been limited. Metabolic modeling is a complementary tool for probing complex microbial communities and their interactions mediated through competition for host-derived nutrients and cross-feeding of secreted metabolites (13). Community metabolic models can provide information difficult to obtain by purely experimental means, such as the combined impact of nutrient environment and metabolic interactions on community composition. Metabolic models also can predict the rates of metabolite exchange between species and identify cross-feeding relationships difficult to delineate through metabolomic analyses.
We used 16S rRNA gene sequence data from three published studies (28–30) to construct and test a metabolic model for prediction of airway community compositions in adult CF patients. The assembled data set consisted of 75 distinct samples from 46 patients who were judged to be stable or recovered from treatment in the original studies. Principal-component analysis performed on 16S read data showed considerable heterogeneity of community composition across the 75 samples, including three patients infected with
The community metabolic model was constructed by ranking the identified taxa according to their total reads across the 75 samples and representing each taxonomic group with a single genome-scale metabolic reconstruction obtained from the AGORA database (www.vmh.life) (21). To limit model complexity, only the 17 top-ranked taxa (16 genera and 1 combined family/genus) were included. The resulting in silico community contained the most common CF pathogens (
The community metabolic model required specification of host-derived nutrients that mimicked the CF lung environment in terms of the nutrients available, their allowed uptake rates across the community, and their allowed uptake rates by individual species. Given that the 17-species model contained 271 community uptake rates and a total of 2,378 species-specific uptake rates, a model tuning method was developed to manage the daunting complexity. A putative list of host-derived nutrients was compiled by starting with the synthetic sputum medium SCFM2 (66) and adding other nutrients either required for monoculture growth of at least one modeled species, measured in metabolomic analyses of CF sputum samples, or identified through in silico analyses. The resulting 81 nutrients were separated into 14 distinct groups (see Table S3 in the supplemental material) to facilitate tuning of nominal community uptake rates to qualitatively match average read data for the rare pathogen samples and the Pseudomonas/Streptococcus-dominated samples. This tuning process proved to be the bottleneck of model development even under the simplifying assumption that the species uptake rates were not limiting. A more streamlined and experimentally driven tuning process would be facilitated by the availability of matched 16S and metabolomics data for large sets of CF sputum samples.
Despite the challenges associated with defining physiologically relevant nutrient uptake rates, the community model was able to predict species abundance in qualitative agreement with average read data for
The 14-species model used to simulate the rare-pathogen-free samples predicted that Pseudomonas and Streptococcus would be the dominant genera and that Prevotella and Staphylococcus also would be present in the community. These predictions provided qualitative agreement with the 16S rRNA gene sequence read data averaged across the 67 samples, although the predicted abundance of Prevotella was comparatively high and the predicted diversity was comparatively low. Given the uncertainty associated with identifying host-derived nutrients and translating these available nutrients into appropriate community uptake rates, we considered our predictions to provide satisfactory in silico recapitulation of measured community compositions across the set of four dominant CF pathogens.
A hallmark of CF lung infections is poorly understood differences in bacterial community compositions between patients and in longitudinal samples collected from a single patient (42). We performed simulations to test the hypothesis that these differences might be partially attributable to sample-to-sample variations in the nutrient environment in the CF lung. Nutrient variability was simulated by randomizing the community uptake rates around their nominal values found through manual model tuning. We performed 100 model ensemble simulations for each 15-species community containing a rare pathogen to determine if the associated patient samples could be well fitted by a simulated sample. Using comparative plots of the measured reads and predicted abundances, we found that the model ensembles could satisfactorily reproduce the community compositions of the 8 rare-pathogen-containing samples. The best-fit models tended to provide good predictions of rare pathogen reads due to their relatively large values (average of 65% across the 8 samples), while the accuracy of read predictions for less prevalent species was more variable.
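One way to generate such an ensemble is sketched below. The actual randomization scheme is given in Materials and Methods and is not reproduced in this section, so the uniform multiplicative perturbation, the `spread` parameter, and the nutrient names and rates are all assumptions made for illustration.

```python
import random


def randomize_uptake_rates(nominal, spread=0.5, rng=None):
    """Perturb each nominal uptake rate by a uniform factor in
    [1 - spread, 1 + spread] (an assumed scheme, not the paper's)."""
    rng = rng or random.Random()
    return {
        nutrient: rate * rng.uniform(1.0 - spread, 1.0 + spread)
        for nutrient, rate in nominal.items()
    }


# Placeholder nominal community uptake rates (not the values in Table S3).
nominal = {"glucose": 1.0, "lactate": 2.0, "oxygen": 5.0}

rng = random.Random(0)  # fixed seed so the ensemble is reproducible
ensemble = [randomize_uptake_rates(nominal, rng=rng) for _ in range(100)]
```

Each dictionary in `ensemble` would then define the nutrient environment for one community simulation, that is, one "simulated sample".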
Due to the availability of a much larger data set of 67 patient samples, the rare-pathogen-free model consisting of 14 species afforded an opportunity to investigate sample-to-sample heterogeneity in more depth. We performed 1,000 model ensemble simulations with randomized nutrient uptake rates to find best-fit models. Patient samples with relatively high Pseudomonas reads tended to be well fit because the model predicted Pseudomonas dominance over a wide range of nutrient conditions. Less accurate but still satisfactory fits were obtained for patient samples with moderate Pseudomonas and relatively high Streptococcus reads. The model ensemble proved somewhat deficient in fitting samples with high reads of Prevotella or of the less common genera Haemophilus, Rothia, and Fusobacterium. This deficiency could be attributable to the in silico lung environment not containing key nutrients and/or not specifying sufficiently high uptake rates of supplied nutrients to support high abundances of these genera.
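Selecting a best-fit model for a patient sample amounts to a nearest-neighbor search over the ensemble. The sketch below assumes a sum-of-squared-differences error between normalized reads and predicted abundances. The error measure actually used is defined in Materials and Methods, and all values here are invented.

```python
def squared_error(sample, prediction):
    """Sum of squared differences over all genera present in either vector."""
    genera = set(sample) | set(prediction)
    return sum(
        (sample.get(g, 0.0) - prediction.get(g, 0.0)) ** 2 for g in genera
    )


def best_fit(sample, ensemble):
    """Return (index, error) of the ensemble member closest to the sample."""
    errors = [squared_error(sample, prediction) for prediction in ensemble]
    index = min(range(len(ensemble)), key=errors.__getitem__)
    return index, errors[index]


# Made-up normalized reads for one sample and three simulated samples.
sample = {"Pseudomonas": 0.60, "Streptococcus": 0.30, "Prevotella": 0.10}
ensemble = [
    {"Pseudomonas": 0.90, "Staphylococcus": 0.10},
    {"Pseudomonas": 0.55, "Streptococcus": 0.30, "Prevotella": 0.15},
    {"Pseudomonas": 0.20, "Streptococcus": 0.50, "Prevotella": 0.30},
]
index, error = best_fit(sample, ensemble)
```

Under this measure, a genus predicted by the model but absent from the sample (Staphylococcus in the first simulated sample) is penalized in the same way as a genus that is observed but not predicted.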
The quality of sample fits also was correlated with the sample diversity, with the best fits having the lowest average diversity (inverse Simpson index of 0.10), moderate fits having an intermediate average diversity (inverse Simpson index of 0.18), and poor fits having the highest average diversity (inverse Simpson index of 0.23). For these three sets of samples, the best-fit models had average diversities of 0.10, 0.16, and 0.20, respectively. We believe that the lower predicted diversities were attributable to the modeling assumption that the CF lung community maximizes its collective growth rate. Using a community metabolic model of the human gut microbiota (41), we have shown that increased bacterial diversity (typically associated with health) can be achieved by simulating suboptimal growth rates under the hypothesis that disease progression correlates with a collective movement toward maximal growth. Therefore, the assumption of maximal community growth may inherently limit our ability to accurately reproduce more diverse samples and rather simulate conditions associated with disease, such as dominance of a single pathogen.
By optimizing cross-feeding of secreted metabolites, the community model was able to predict the coexistence of multiple species at the maximal community growth rate rather than just predicting a monoculture of the single species with the highest monoculture growth rate. Because the SteadyCom method (27) used to formulate and solve the community model does not allow direct incorporation of mechanisms by which one species could inhibit the growth of another species other than by nutrient competition, the predicted community growth rate always was higher than the highest individual growth rate of the coexisting species. Consequently, the formulated model was incapable of capturing more complex interactions such as Pseudomonas secretion of diffusible toxins that inhibit the growth of other CF pathogens (67).
Despite this limitation, the community model could be analyzed to understand the putative role of metabolite cross-feeding in shaping community composition. The model predicted that the rare pathogens Escherichia and Burkholderia were particularly efficient cross-feeders, using acetate, formate, and other secreted metabolites to establish dominance over less harmful bacteria. In contrast, the model predicted Achromobacter to be substantially less adept at exploiting secreted metabolites for growth enhancement. While we were able to simulate Achromobacter dominance through addition of four carbon sources possibly present in the CF lung, the model suggested that other nonmodeled mechanisms may be involved in promoting Achromobacter expansion. One possibility is that Achromobacter utilizes its ability to form multispecies biofilms (46, 68) to establish favorable metabolic niches for enhanced growth.
In the absence of the three rare pathogens, the model predicted that Pseudomonas would be the primary beneficiary of cross-fed metabolites, including acetate, alanine, and
Our community metabolic model generated several predictions that could be tested experimentally with an appropriately designed in vitro community. For example, a 5-species in vitro system consisting of
MATERIALS AND METHODS
Patient data.
CF airway community composition data were obtained from three published studies in which patient sputum samples were subjected to 16S rRNA gene amplicon library sequencing (28–30). The first study (28) included 30 samples from 10 clinically stable adults ranging in age from 20 to 50 years with an average age of 35 years, the second study (29) included 23 samples from 14 adults in clinically defined baseline and recovery stages ranging in age from 18 to 69 years with an average age of 34 years, and the third study (30) included 22 samples from 22 clinically stable adults ranging in age from 19 to 52 years with an average age of 28 years. Thus, in total, the assimilated data set contained 75 distinct samples from 46 patients who were clinically stable or recovered from treatment for an exacerbation event. Additional samples from these three studies corresponding to exacerbation or antibiotic treatment were not included in the modeled data set to avoid the complications of predicting these events. The top 72 taxonomic groups (typically genera) accounted for over 99.8% of total reads across the 75 samples (Fig. 1A; also see Table S6 in the supplemental material). To limit complexity, the community metabolic model described below was limited to 17 taxonomic groups that accounted for 95.6% of total reads (Fig. 1B; Table S4). Reads from the family
Community metabolic model.
For simplicity, each genus was represented by a single species commonly observed in CF airway communities (1, 6–9, 70), although we note that genera such as Streptococcus (30) can have considerable diversity with respect to species representation. As mentioned above, the combined
The genera Pseudomonas, Streptococcus, and Prevotella dominated most communities, in terms of both average reads for individual samples and the number of samples in which they exceeded 1%. Interestingly,
Model tuning and simulation.
The nutrient environment in the CF lung is complex and expected to vary between patients as well as between longitudinal samples for individual patients depending on disease state. While metabolomic analyses have been performed on CF sputum and bronchoscopy samples (42, 63, 70, 71), these studies were insufficient to define supplied nutrients for the metabolic model due to their limited metabolite coverage. Furthermore, we found that based on our model, the synthetic sputum medium SCFM2 used in previous in vitro CF microbiota studies (66, 72) would not support growth of any of the 17 modeled species due to the lack of ions (Co2+, Cu2+, Mn2+, and Zn2+), amino acids (asparagine and glutamine), and other metabolites (see below) essential for growth. While the medium likely would contain trace amounts of the missing ions, the requirement of these other metabolites for growth suggests limitations for the AGORA metabolic models with respect to biosynthetic pathways leading to biomass formation. Given the semicurated nature of the AGORA models (21), such discrepancies were expected and had to be addressed by adding the missing essential metabolites to the modeled medium. A final complication was that the community model required specification of nutrient uptake rates, which were unknown even if medium component concentrations were specified due to the lack of species-dependent uptake kinetics for each nutrient. Because such uptake information is rarely available even for highly studied model organisms such as
Supplied nutrients in the community model were defined by starting with the SCFM2 medium and adding the four ions and two amino acids listed above. We found that each species required additional metabolites in the medium to support biomass formation. These 29 additional metabolites were identified and added to the modeled medium such that all 17 species were capable of monoculture growth (see Table S3). For example, the
The community uptake rates of the 86 supplied nutrients were tuned by trial and error to produce species abundances in approximate agreement with the average reads listed in Table 1, which were derived from actual patient samples. To reduce the number of adjustable rates, the nutrients were grouped together and a single uptake rate was used for each group. These 14 groups (Table S3) were defined as follows: group 1, 16 common metals and ions; group 2, 29 essential growth metabolites; group 3, 8 CF lung metabolites; group 4, 19 amino acids; group 5, the amino acids alanine and valine, which have been reported to be elevated in the CF lung compared to other amino acids (71); groups 6 to 11, each of the 6 carbon sources available in the CF lung; group 12, O2; group 13, NO3; and group 14, 4 Achromobacter-related carbon sources. The 86 nutrients and their nominal community uptake rates determined through this tuning procedure are listed in Table S3 and depicted graphically in Fig. 1D.
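The grouping can be written down as a small data structure, which also confirms that the listed group sizes add up to the 86 supplied nutrients. This is a bookkeeping sketch only. The group labels paraphrase the text, while the nutrient identifiers, the helper `expand_group_rates`, and every rate other than the nominal O2 uptake of 5 mmol/gDW/h are hypothetical placeholders rather than values from Table S3.

```python
# Group number -> (label, number of nutrients), as enumerated in the text.
NUTRIENT_GROUPS = {
    1: ("common metals and ions", 16),
    2: ("essential growth metabolites", 29),
    3: ("CF lung metabolites", 8),
    4: ("amino acids", 19),
    5: ("alanine and valine", 2),
    12: ("O2", 1),
    13: ("NO3", 1),
    14: ("Achromobacter-related carbon sources", 4),
}
# Groups 6 to 11 each hold a single carbon source.
for group in range(6, 12):
    NUTRIENT_GROUPS[group] = (f"carbon source {group - 5}", 1)


def total_nutrients(groups):
    """Total number of supplied nutrients across all groups."""
    return sum(count for _, count in groups.values())


def expand_group_rates(group_rates, membership):
    """Assign every nutrient the single uptake rate of the group it belongs to."""
    return {nutrient: group_rates[group] for nutrient, group in membership.items()}


# Hypothetical nutrient IDs and rates (mmol/gDW/h); only the O2 value is from the text.
membership = {"o2": 12, "no3": 13, "ala": 5, "val": 5}
group_rates = {12: 5.0, 13: 1.0, 5: 0.5}
nutrient_rates = expand_group_rates(group_rates, membership)
```

Using one rate per group reduces the tuning problem from 86 adjustable rates to 14.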
Because these nutrient uptake rates were derived for the entire patient population and not an individual patient sample, a different strategy was used to simulate sample-to-sample heterogeneity based on the hypothesis that differences in nutrient availability could account for heterogeneity in measured reads. Individual patient samples were simulated by randomly perturbing the community uptake rate for each of the 14 nutrient groups listed above between 33% and 300% of its nominal value. Uniformly distributed random numbers were generated for each group such that the numbers of cases with the uptake rates in the ranges 33% to 100% and 100% to 300% were statistically equal. The bounds used for the uptake rate of each metabolite also are listed in Table S3. The CF lung is known to exhibit sharp O2 gradients such that some regions are hypoxic or even anoxic (77, 78). The community model accounted for the effects of the average O2 level through the randomized uptake rates. At the nominal oxygen uptake rate of 5 mmol/g dry weight (gDW)/h in Table S3, the 17 species had an average growth rate of 0.140 h−1. At the low oxygen uptake value of 1.67 mmol/gDW/h, the 17 species had an average growth rate of 0.096 h−1. Given that the maximum O2 uptake rate of
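One way to generate perturbations with the stated properties is sketched below. It is an interpretation rather than the authors' code. A multiplier is drawn from the lower range (one-third to 1) or the upper range (1 to 3) with equal probability, and uniformly within the chosen range. The function names and the use of exactly 1/3 for "33%" are assumptions. A log-uniform draw over the same interval would also split cases evenly around the nominal value.

```python
import random


def sample_multiplier(rng, low=1.0 / 3.0, high=3.0):
    """Draw a multiplier that is below or above 1 with equal probability."""
    if rng.random() < 0.5:
        return rng.uniform(low, 1.0)   # 33% to 100% of the nominal rate
    return rng.uniform(1.0, high)      # 100% to 300% of the nominal rate


def perturb_rates(nominal_rates, rng):
    """Apply an independent random multiplier to each nutrient group's nominal rate."""
    return {group: rate * sample_multiplier(rng) for group, rate in nominal_rates.items()}


rng = random.Random(0)                 # fixed seed so the sketch is reproducible
nominal = {12: 5.0}                    # group 12 (O2) at its nominal 5 mmol/gDW/h
ensemble_rates = [perturb_rates(nominal, rng) for _ in range(1000)]
```

With this scheme the lowest possible O2 uptake is 5/3, or about 1.67 mmol/gDW/h, matching the low-oxygen value quoted above.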
Community simulations.
We used the SteadyCom method (27) to perform steady-state community simulations as detailed in our previous study on the human gut microbiota (41). SteadyCom performs community flux balance analysis by computing the relative abundance of each species for maximal community growth while ensuring that all metabolites are properly balanced within each species and across the community. This simulation method is based on several simplifying assumptions, including that each sputum sample was obtained from a spatially homogeneous region of the CF lung, that all modeled species have an equal opportunity to colonize the airway, and that all propagating species have the same growth rate at steady state. Therefore, the community model was not capable of predicting sequential colonization by various species (45) or different growth rates of propagating species (80). Each species model used a non-growth-associated ATP maintenance (ATPM) value of 5 mmol/gDW/h, which is within the range reported for curated bacterial reconstructions. Cross-feeding of all 21 amino acids and 8 common metabolic by-products (acetate, CO2, ethanol, formate, H2,
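For orientation, the optimization problem solved by SteadyCom can be summarized as below. This is a sketch of the formulation as it is commonly stated, not a transcription from this paper, and the notation is introduced here for illustration: species index k, stoichiometric matrix S^k, abundance-weighted flux vector V^k, flux bounds LB^k and UB^k, relative abundance X^k, shared community growth rate μ, and community uptake and export rates u_i and e_i for each exchanged metabolite i.

```latex
\begin{aligned}
\max_{\mu,\,X^k,\,V^k,\,u,\,e} \quad & \mu \\
\text{s.t.} \quad
& S^k V^k = 0 && \forall k && \text{(balance within each species)} \\
& LB^k_j X^k \le V^k_j \le UB^k_j X^k && \forall j,k && \text{(bounds scale with abundance)} \\
& V^k_{\mathrm{biomass}} = \mu X^k && \forall k && \text{(equal growth rate for all species)} \\
& u_i - e_i + \sum_k V^k_{\mathrm{ex},i} = 0 && \forall i && \text{(balance across the community)} \\
& 0 \le u_i \le u_i^{\max} && \forall i && \text{(community uptake limits)} \\
& \sum_k X^k = 1, \quad X^k \ge 0,\; e_i \ge 0,\; \mu \ge 0
\end{aligned}
```

Because the biomass constraint multiplies μ by X^k, the problem is linear only when μ is fixed, so the maximal μ is typically located by solving a sequence of linear programs. In this notation, the group uptake rates described in the previous section correspond to the limits u_i^max, and the ATPM requirement enters as a lower bound on one flux in each species.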
Data availability.
All data used for metabolic model development and testing are provided in the supplemental material.
Copyright © 2019 Henson et al. This work is published under https://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
ABSTRACT
Cystic fibrosis (CF) is a fatal genetic disease characterized by chronic lung infections due to aberrant mucus production and the inability to clear invading pathogens. The traditional view that CF infections are caused by a single pathogen has been replaced by the realization that the CF lung usually is colonized by a complex community of bacteria, fungi, and viruses. To help unravel the complex interplay between the CF lung environment and the infecting microbial community, we developed a community metabolic model comprised of the 17 most abundant bacterial taxa, which account for >95% of reads across samples, from three published studies in which 75 sputum samples from 46 adult CF patients were analyzed by 16S rRNA gene sequencing. The community model was able to correctly predict high abundances of the “rare” pathogens
IMPORTANCE Cystic fibrosis (CF) is a genetic disease in which chronic airway infections and lung inflammation result in respiratory failure. CF airway infections are usually caused by bacterial communities that are difficult to eradicate with available antibiotics. Using species abundance data for clinically stable adult CF patients assimilated from three published studies, we developed a metabolic model of CF airway communities to better understand the interactions between bacterial species and between the bacterial community and the lung environment. Our model predicted that clinically observed CF pathogens could establish dominance over other community members across a range of lung nutrient conditions. Heterogeneity of species abundances across 75 patient samples could be predicted by assuming that sample-to-sample heterogeneity was attributable to random variations in the CF nutrient environment. Our model predictions provide new insights into the metabolic determinants of pathogen dominance in the CF lung and could facilitate the development of improved treatment strategies.