INTRODUCTION
The capsule, an extracellular polysaccharide matrix, is one of the most striking virulence mechanisms, considered essential for the establishment of infection and for protection against desiccation, phages, and protist predation (1). Although the capsule is an attractive target for vaccine development, the success of immunotherapeutic approaches depends on a complete understanding of the bacterial surface structures circulating in the clinical setting (3, 4). Whereas the expression of specific virulence factors and capsular (K) types (mainly K1 and K2) is known to be related to severe infections caused by HV strains, much larger variation in K types has been described among clinical MDR strains, providing a higher resolution than multilocus sequence typing (MLST) and specific capsule-lineage associations that might be useful for typing (5–7). Variation in K antigens and in other surface polysaccharides (such as the O antigen) has traditionally been used for Klebsiella typing. In fact, serotyping, established as early as 1926 (8), allowed the recognition of 77 serologically distinct K types (K1 to K82) and far fewer O types (n = 8, O1 to O12) among the reference strain collection deposited at the Statens Serum Institut, Copenhagen, Denmark (9, 10). The chemical composition and structure of capsular types were clarified mainly during the 1980s for strains from the reference collection, but correlation with genomic data is recent and not always straightforward (11). The impracticability of serotyping (it is complex and laborious), its limited availability (only at reference centers), and its insufficient coverage led to its almost complete abandonment in the past decades (1).
Given the renewed interest in capsular polysaccharides, several genotypic methods have recently been proposed to revive K typing, but several flaws prevent their universal application and limit their coverage. Molecular methods to infer K type from genomic data, such as restriction fragment length polymorphism (RFLP) of the cps locus (generating “C patterns”) or PCR targeting specific K types (e.g., K1, K2, and K57), are technically demanding, have low coverage, and/or are not suitable to detect variation in other sites of the locus (12, 13). K-type prediction based on allelic variation of loci (e.g., wzi or wzc) within the capsule biosynthetic pathway (cps locus) constitutes a more rapid and simple approach (14, 15), but the characterization of the whole cps and rfb (O-antigen biosynthesis) loci by whole-genome sequencing (WGS) improved accessibility and precision, especially through user-friendly Web-based platforms such as Kaptive (http://kaptive.holtlab.net) (7, 16). These in silico studies and others uncovered a series of novel cps loci and at least 161 presumptive phenotypically distinct capsular types (designated KL to differentiate them from the reference K types) (7, 16–18). More importantly, these data revealed the usefulness of K variation as an epidemiological marker for strain subtyping, encouraging the development of reliable, fast, and high-throughput tools for K typing (7, 17, 19–21).
Fourier transform infrared (FT-IR) spectroscopy has been shown to detect surface phenotypic differences linked to a variable composition on glycan structures that form part of the O and K antigens, depending on the bacterial species (22–25). Considering that the capsule is the outermost structure of
RESULTS
Molecular genotypic characterization of
The existing methodologies for K typing are suboptimal and there is a lack of genotype-biochemical correlation. We thus used FT-IR spectroscopy combined with molecular methods to identify known K types or predict the composition of unknown K types.
Our approach was validated on a collection of 154 well-characterized MDR
TABLE 1
Detailed characterization of the 154 international MDR
| ST/CG (no.) | K/KL type by genotypic methods^a (no.) | O type^b | FT-IR type (no.) | PFGE type(s) (no.) | Country | Yr(s) of isolation | β-Lactamase(s) conferring resistance to extended-spectrum β-lactams |
|---|---|---|---|---|---|---|---|
| ST11/CG258 (32) | K24 (12) | O1, O2 | FT5 (11), FT24 (1) | Kp13 (1), Kp14 (10), Kp15 (1) | Spain, Portugal | 2010–2012 | DHA-1, OXA-48, KPC-3 |
|  | K27 (8) | O2 | FT17 | Kp17 (6), Kp18 (1), Kp19 (1) | Brazil | 2012 | KPC-2, CTX-M-2 |
|  | K64^f (3) | O2 | FT4 | Kp28 | Brazil | 2012 | KPC-2, CTX-M-2 |
|  | KL105 (7) | O2 | FT13 (5), FT23 (2) | Kp30 | Portugal | 2006–2013 | DHA-1, DHA-6 |
|  | KL127 (2) | —^c | FT16 | Kp31 | Brazil | 2009–2012 | KPC-2 |
| ST14/CG14 (9) | K2 (3) | O1 | FT14 | Kp16 | Portugal | 2002–2003 | TEM-24 |
|  | K16 (6) | O1 | FT15 | Kp32 | Portugal | 2010 | SHV-55, SHV-106 |
| ST15/CG15 (33) | K19 (6) | O1 | FT10 | Kp1 | Portugal | 2012–2015 | KPC-3, CTX-M-15 |
|  | KL112 (11) | O1 | FT12 (10), FT27 (1) | Kp5 | Portugal | 2010–2012 | CTX-M-15 |
|  | K24 (13) | O1 | FT5 (11), FT21 (1), FT26 (1) | Kp12 | Portugal, Brazil | 2006–2014 | CTX-M-15, OXA-48, SHV-2 |
|  | KL110 (2) | O1 | FT19 | Kp21 | Portugal | 2011–2012 | VIM-34, SHV-12, OXA-17 |
|  | KL48 (1) | O1 | NI^d | Kp12 | Portugal | 2010 | SHV-12 |
| ST17/CG17 (4) | KL112 (3) | O2, O5 | FT12 (2), FT22 (1) | Kp6 (2), Kp7 (1) | Brazil, Portugal | 2012 | DHA-1, SHV-2, SHV-12 |
|  | (wzi200)^e (1) | — | NI | Kp34 | Spain | 2012 | VIM-1 |
| ST39/CG39 (7) | K23 (7) | O1 | FT1 | Kp33 | Spain | 2008–2010 | VIM-1 |
| ST54/— (3) | K14^f (3) | O3 | FT2 | Kp29 | Spain | 2009–2010 | VIM-1 |
| ST101/CG101 (11) | K17 (11) | O1 | FT8 (8), FT20 (3) | Kp8 (6), Kp9 (2), Kp10 (2), Kp11 (1) | Romania, Brazil | 2012 | OXA-48, OXA-181, NDM-1, KPC-2, CTX-M-2, CTX-M-15 |
| ST147/CG147 (15) | K64^f (15) | O2, O1 | FT4 | Kp26 (14), Kp27 (1) | Spain, Portugal | 2006–2015 | KPC-3, SHV-12, VIM-1 |
| ST253/— (3) | K60 (3) | O1 | FT3 | Kp20 | Spain | 2009–2010 | VIM-1 |
| ST258/CG258 (16) | KL106 (14) | O2 | FT6 (12), FT25 (2) | Kp24 (6), Kp2 (4), Kp25 (4) | Greece, Poland, Brazil | 2007–2009 | KPC-2, CTX-M-2, SHV-12 |
|  | KL107 (2) | O2 | FT9 | Kp2 | Poland | 2008–2009 | KPC-3, CTX-M-3 |
| ST336/CG17 (12) | (wzi150)^e (12) | — | FT24 | Kp23 | Portugal | 2010 | CTX-M-15 |
| ST348/— (4) | K62 (4) | O1 | FT11 | Kp22 | Portugal | 2012 | KPC-3, CTX-M-15 |
| ST405/— (5) | KL151 (5) | O4 | FT7 | Kp3 (4), Kp4 (1) | Portugal, Spain | 2012–2013 | OXA-48, CTX-M-15 |
^a Capsular (K) type defined according to Brisse et al. (14) and capsular locus (KL) type according to Wick et al. (7).
^b O types defined according to Fang et al. (10).
^c —, not defined.
^d NI, not included.
^e wzi allele associated with no or multiple K/KL types.
^f Cross-reaction between K14 and K64, solved by sequencing of wzy.
(i) Capsular assignment based on the genotypic marker wzi.
First, K types were inferred by sequence comparison of a discriminatory molecular marker (wzi) (14). Twenty-two different wzi alleles were identified, four of which (wzi89 and wzi200 to -202) were new and deposited at the BIGSdb-Kp Pasteur database (http://bigsdb.pasteur.fr/klebsiella/klebsiella.html), and described meanwhile in other studies (16). According to this database, 13 wzi alleles were unequivocally associated with one unique K type (positive reaction with the sera from reference K types) and/or KL type (predicted based on cps locus obtained by WGS data, when available) (Table 2). However, prediction of K type was not always straightforward, since (i) 6 wzi alleles were linked to more than one K type/KL type, (ii) 2 wzi alleles (wzi29 and wzi93) were linked to discordant K type/KL type, and (iii) 1 wzi allele (wzi200) has no K/KL type attributed (Table 2).
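The four-way split of wzi alleles described above can be reproduced from the associations listed in Table 2. The following is a minimal sketch, not the authors' procedure: the dictionary is transcribed from Table 2, while the function name `classify` and the rule used to call an association "discordant" (a single K type and a single KL type whose numbers differ) are assumptions made for illustration.

```python
from collections import Counter

# wzi allele -> (serological K types, KL types), transcribed from Table 2.
# An empty tuple stands for "not defined" in the table.
WZI_ASSOCIATIONS = {
    "wzi2": (("K2",), ("KL2", "KL30")),
    "wzi14": (("K14",), ("KL14",)),
    "wzi16": (("K16",), ("KL16", "KL143")),
    "wzi19": (("K19",), ("KL19",)),
    "wzi24": (("K24",), ("KL24", "KL54", "KL55")),
    "wzi27": (("K27",), ("KL27",)),
    "wzi29": (("K41",), ("KL106",)),
    "wzi64": (("K14", "K64"), ("KL64",)),
    "wzi75": ((), ("KL105",)),
    "wzi83": (("K23",), ("KL23",)),
    "wzi89": ((), ("KL110",)),
    "wzi93": (("K60",), ("KL112",)),
    "wzi94": ((), ("KL62",)),
    "wzi101": (("K24",), ("KL24",)),
    "wzi137": (("K17",), ("KL17",)),
    "wzi143": ((), ("KL151",)),
    "wzi150": ((), ("KL163", "KL27", "KL46")),
    "wzi151": ((), ("KL48",)),
    "wzi154": ((), ("KL107",)),
    "wzi200": ((), ()),
    "wzi201": ((), ("KL60",)),
    "wzi202": ((), ("KL127", "KL155")),
}


def classify(k_types, kl_types):
    """Label one allele as 'none', 'multiple', 'discordant', or 'unique'.

    The rule is an illustrative assumption, not taken from the article.
    """
    if not k_types and not kl_types:
        return "none"
    if len(k_types) > 1 or len(kl_types) > 1:
        return "multiple"
    # Compare the numeric part of "K41" with that of "KL106".
    if k_types and kl_types and k_types[0][1:] != kl_types[0][2:]:
        return "discordant"
    return "unique"


counts = Counter(classify(k, kl) for k, kl in WZI_ASSOCIATIONS.values())
for category in ("unique", "multiple", "discordant", "none"):
    print(category, counts[category])
```

Under this rule the 22 alleles fall into 13 unique, 6 multiple, 2 discordant (wzi29 and wzi93), and 1 unassigned (wzi200), matching the counts given in the text.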
TABLE 2
K/KL type prediction by the different methods tested
| wzi allele | K type^a | KL type^b | Epidemiological data^c | FT-IR |
|---|---|---|---|---|
| 2 | K2 | KL2.KL30 | KL2 | K2 |
| 14 | K14 | KL14 | KL14 | K14 |
| 16 | K16 | KL16.KL143 | KL16 | K16 |
| 19 | K19 | KL19 | KL19 | K19 |
| 24 | K24 | KL24.KL54.KL55 | KL24 | K24^d |
| 27 | K27 | KL27 | KL27 | K27 |
| 29 | K41 | KL106 | KL106 | KL106^d |
| 64 | K14.K64 | KL64 | KL64 | KL64 |
| 75 | —^e | KL105 | KL105 | KL105^d |
| 83 | K23 | KL23 | KL23 | K23 |
| 89^f | — | KL110 | KL110 | KL110 |
| 93 | K60 | KL112 | KL112 | KL112^d |
| 94 | — | KL62 | KL62 | K62 |
| 101 | K24 | KL24 | KL24 | K24 |
| 137 | K17 | KL17 | KL17 | K17^d |
| 143 | — | KL151 | KL151 | KL151 |
| 150 | — | KL163.KL27.KL46 | — | —^g |
| 151 | — | KL48 | KL48 | — |
| 154 | — | KL107 | KL107 | KL107 |
| 200^f | — | — | — | — |
| 201^f | — | KL60 | KL60 | KL60 |
| 202^f | — | KL127.KL155 | KL127 | KL127 |
^a Known associated K types; serological reaction tested by Brisse et al. (14).
^b KL, K locus; associated cps cluster type.
^c The KL types confirmed by WGS results are in boldface font; the remaining ones are based on frequency distribution (KL127).
^d Presence of outliers outside the main cluster (referred to as exception isolates in the manuscript).
^e —, not defined.
^f New wzi alleles submitted to BIGSdb.
^g According to FT-IR results, we can discard the possibility of being KL27.
(ii) Capsular prediction based on wzy or epidemiological data.
Some of these uncertain K types were additionally defined by sequencing of another molecular marker (wzy), by analysis of available epidemiological data (where the most frequently reported K type for a given sequence type [ST] was considered), and by WGS and Kaptive (see below) (Table 2) (7, 15, 16). Sequencing of wzy allowed distinguishing K14/K64 predicted by wzi64, whereas the epidemiological information (coupled with WGS in most of the cases) supported the prediction of capsular types K2, K16, K24, KL106, KL112, and KL127. With this approach, a total of 19 different K types were predicted by wzi/wzy sequencing, with frequencies ranging from 0.7% to 17.7% (Table 1). Twelve of them belong to serologically defined K types (K2, K14, K16, K17, K19, K23, K24, K27, KL48, KL60, KL62, and K64) and 7 are K types presumptively associated with a new composition/structure (KL105, KL106, KL107, KL110, KL112, KL127, and KL151) (Table 1).
It is of interest to highlight that most K types identified in this collection were specifically and uniquely associated with evolutionarily related strains from different countries and recovered over extended periods of time (Table 1). Some of them correspond to well-established clades from ST11/CG258 (7, 21, 28), CG15, CG14 (7, 20, 29), and ST258/CG258 (19, 30) identified in previous studies. Occasionally, the same K type was observed in different clones (e.g., K24 in ST11 and ST15, or K64 in ST11 and ST147) (Table 1).
Molecular genotypic characterization of
Considering that the O antigen can in some isolates protrude to the bacterial cell surface depending on the amount and type of the capsule, we cannot disregard its potential contribution to the biochemical makeup of the bacterial cell surface. In this sense, a molecular genotypic PCR-based approach was used to identify the most frequent O types previously recognized among
Whole cps-based K-type assignments.
The whole cps cluster of the 19 wzi-defined K/KL types provided full resolution and supported the assignment of KL types for which the composition/structure is still unreported. Furthermore, it allowed us to detect changes in sites of the cps locus other than wzi or wzy that may influence the final capsule composition. We used cps of the reference
FIG 1
Representation of the cps loci identified in this study. Arrows indicate the direction, proportional length, and function (colored as per legend) of protein-coding genes. cps genetic clusters are labeled by the KL type; wzi alleles, size in base pairs, and GenBank accession numbers are indicated. 1, this study; HP, hypothetical protein; GT, glycosyltransferase; AcT, acyltransferase; GMD, GDP-mannose-4,6-dehydratase; GMH, GDP-mannose mannosyl hydrolase; HAS, hyaluronan synthase; UGM, UDP-galactopyranose mutase.
cps clusters represented in Fig. 1 presented a variable size (20 to 30 kb) and were delimited by the conserved galF (encodes a UTP-glucose-1-phosphate uridylyltransferase responsible for the synthesis of UDP-
(i) Analysis of cps genes involved in sugar synthesis.
In a close analysis of all cps clusters, special attention was paid to the presence of genes associated with the synthesis of particular sugars: (i) initial glycosyltransferases responsible for triggering capsule synthesis. The wbaP (encoding an undecaprenyl phosphate galactose transferase) and wcaJ (encoding an undecaprenyl-phosphate glucose-1-phosphate transferase) genes were detected in 8 and 9 of the cps clusters, respectively. The corresponding proteins revealed a high degree of homology (∼70% identity) and are, respectively, predictive of the presence of galactose or glucose on the repeat unit (11). (ii) Genes responsible for the synthesis of
Differentiation of K types by FT-IR spectroscopy.
FT-IR spectroscopy detects variation of the vibrational modes of chemical bonds that are exposed to infrared radiation, and when applied to bacterial cells, it provides a highly specific whole-organism fingerprint that reflects their biochemical composition (22). The methodology we used is simple and inexpensive, since one bacterial colony is directly applied to an instrument with small amounts of consumables and low maintenance (see Materials and Methods for further details). Moreover, the time to result is very short, since one isolate can be typed in ca. 5 to 10 min at a lower cost (from 30%) than with competing DNA-based methods (22). Hence, we evaluated the ability of this methodology to differentiate the 19
(i) General features of FT-IR spectral data. FT-IR spectra of all
FIG 2
PLSDA model 1. (A) Score plot of the PLSDA regression model 1 according to K types corresponding to the first three latent variables (LVs). (B) Confusion matrix for
FIG 3
PLSDA model 2. (A) Score plot of the PLSDA regression model 2 according to K types corresponding to the first three latent variables (LVs). The red circle includes the isolates that have a different phenotypic behavior from their main class, referred to as exception isolates in the manuscript. (B) Confusion matrix for
(ii) Full K-type resolution in two PLSDA models. In model 1 (19 classes modeled) (Fig. 2A), 12 clusters of isolates exhibiting 12 different K types were perfectly distinguished with 100% of total correct K-type predictions (Fig. 2B). These clusters included isolates belonging to O1 or O2 (e.g., K64 isolates). In fact, O1 and O2 have highly similar structures that are most probably indistinguishable by FT-IR spectroscopy. They are both composed of galactose homopolymers (alternating β-
Thus, the FT-IR-based typing method discriminated the 19 different K types tested, supporting differences in their final capsule composition or structure, including the biochemically uncharacterized KL types. It provided a resolution identical to that of whole cps sequencing for discriminating closely related K types (K14 and K64) or discrepant K/KL types (KL60 and KL112) (Table 2). Moreover, not only were precise biochemical-genotypic correlations established, but this methodology also revealed differences in a few exception isolates (7.9% [12/152]) that were not predicted by molecular genotypic data. These isolates presented changes in sites of the locus that were not detected or could be overlooked by genotypic approaches (Tables 1 and 2; see also below).
(iii) Exploring capsular discrepancies between genotypic methods and FT-IR. First, one KL112 isolate (ST17) incorrectly predicted might represent one of the few cases where differences in the O type might impact on FT-IR spectra. This isolate was classified as O5, which is composed of a homopolymer of mannose (instead of
FIG 4
Spectral and genetic representation of KL105
Thus, FT-IR spectroscopy can reliably detect differences in capsule composition of main
Statistical analysis.
The discriminatory power of each typing method considered (FT-IR, MLST, wzi sequencing, and epidemiological data) was calculated using Simpson’s index of diversity (SID) applied to the test population. The SID for FT-IR was 0.932, a higher value than those obtained for wzi sequencing (0.918) or epidemiological data (0.916) (see Table S2). To assess the congruence between the typing methods, we calculated the Wallace coefficient (see Table S3). This coefficient reflects the likelihood of two isolates assigned to the same type by one method (e.g., FT-IR) being classified together using another typing method (e.g., wzi sequencing). The high coefficients for FT-IR and epidemiological data (1.000) and for FT-IR and wzi (0.966) indicate that combining either epidemiological data or wzi-based K-type predictions with FT-IR adds little or no additional strain discrimination. Furthermore, the chance that two isolates sharing the same FT-IR type also share the same ST is lower (67.5%), reflecting the lower discriminatory power of MLST.
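To make the two statistics concrete, the sketch below computes Simpson's index of diversity and the (unadjusted) Wallace coefficient from per-isolate type labels, using their standard pair-counting definitions. The function names and the toy labels are hypothetical and are not data from this study; the values reported above were obtained with the Comparing Partitions website, not with this code.

```python
from collections import Counter


def same_type_pairs(labels):
    """Number of ordered isolate pairs that share a label."""
    return sum(c * (c - 1) for c in Counter(labels).values())


def simpson_diversity(labels):
    """Probability that two isolates drawn without replacement differ in type."""
    n = len(labels)
    return 1.0 - same_type_pairs(labels) / (n * (n - 1))


def wallace(labels_a, labels_b):
    """Probability that a pair grouped by method A is also grouped by method B."""
    pairs_a = same_type_pairs(labels_a)
    pairs_both = same_type_pairs(list(zip(labels_a, labels_b)))
    return pairs_both / pairs_a


# Hypothetical labels for six isolates typed by two methods.
method_a = ["A", "A", "A", "B", "B", "C"]
method_b = ["x", "x", "x", "y", "z", "w"]

print(round(simpson_diversity(method_a), 3))  # diversity of method A
print(wallace(method_a, method_b))            # A -> B
print(wallace(method_b, method_a))            # B -> A
```

The coefficient is directional: in the toy data every pair grouped by method B is also grouped by method A, but not the reverse, which is the same asymmetry that underlies the comparison between FT-IR types and STs.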
Correlation between FT-IR K types and capsule biochemical composition.
To unequivocally settle the basis for FT-IR-based K-type discrimination, we represented the similarity of the spectra in a dendrogram generated by hierarchical cluster analysis (HCA) and correlated the FT-IR-based assignments with the biochemical composition of the different known K types (Fig. 5). In this figure, we can see that the same 19 K types were also discriminated in clusters defined at distances of <0.4 (Fig. 5A). In parallel, we represented in Fig. 5B the composition and structure of 12 of 19 known K types, the source information for which is included in Table S1. We observed that these capsular types exhibit a marked diversity of patterns based on the size of the polysaccharide polymer, the number and type of monosaccharides, the type of linkages or the presence of side chains, or modifications of the lateral sugars, which underlie correct FT-IR-based discrimination. They vary between tetra- and heptasaccharides made up of glucose, glucuronic acid, mannose, rhamnose, fucose, galactofuranose, or galacturonic acid in different proportions and orders, though some appear to have similar structures (Fig. 5B).
FIG 5
Clustering and biochemical composition of K/KL types detected in this study. (A) Known CPS biochemical structures of
(i) Analysis of similar K types inferred from FT-IR spectra. A high similarity between K types K19 and KL107 (Fig. 5A, branch A), K17 and K24 (Fig. 5A, branch C), K14 and K64 (Fig. 5A, branch G), and K2 and K23 (Fig. 5A, branch J) is inferred from the HCA (distances of <0.2), which is supported by their closely related K-type structures, as explained below (Fig. 5B).
(ii) K19 and KL107. These capsular types are both composed of hexasaccharides that have in common a high number of rhamnose residues (3 and 5, respectively) and vary slightly in the compositions of other sugars. Whereas K19 contains a polymer of
(iii) K17 and K24. These capsular types consist of similar pentasaccharide structures, containing
(iv) K14 and K64. They are composed of highly similar hexasaccharides composed of
(v) K2 and K23. These two capsular types are characterized by tetrasaccharides in different configurations, composed of
Additionally, KL60, KL62, and K27 were all grouped in branch F from Fig. 5A, which also included a few K types for which the structure is not known, and for this reason, any comparison lacks robustness. We observed that KL60, KL62, and K27 are diverse in structure (penta-heptasaccharides) and composition (variable but especially enriched in glucose). K16 appears in a separate branch (E) from the dendrogram and is clearly distinguished from all the others, since it is formed by a tetrasaccharide containing
The correlations established strengthen FT-IR-based K-type assignments and highlight the need to both characterize the structure/composition of new KL types and increase the reliabilities of the clustering and the comparisons with a higher number of isolates from certain K types.
Prediction of the capsular composition based on FT-IR spectroscopy assignments.
Several K types included in this study are observed in worldwide-spread
Since spectra obtained from isolates exhibiting KL105 and KL127 clustered with K2 and K23 types (distance < 0.3), we predict a tetrasaccharide structure composed of
Thus, using our FT-IR-based framework, we predicted for the first time the presumptive structure/composition of new KL types, which was supported by cps genotypic data. Further studies are needed to validate these predictions and potentiate the use of FT-IR spectroscopy for K-type identification and characterization.
DISCUSSION
In this study, we establish for the first time a framework to support FT-IR spectroscopy as an accurate, simple, quick, and inexpensive method for the characterization and identification of
In fact, the importance of surface structures (and especially the capsule) on evolution, pathogenesis, and host adaptation of bacterial pathogens is well known. However, full understanding on K-type variation in
Currently, the correct prediction of K types in
The ability of FT-IR spectroscopy to discriminate and identify
Our in-house FT-IR
IR spectra can also be used for bacterial differentiation at the species level, but there are not yet reliable databases (23). Thus, in clinical microbiology laboratory routines, we envision that FT-IR spectroscopy can be used downstream of matrix-assisted laser desorption ionization–time of flight mass spectrometry (MALDI-TOF MS) species identification for quick detection of
We recognize that the coverage of our database needs to be enlarged to represent as much K-type diversity as possible toward a clinical application in a wider epidemiological context. Also, strain typing will always depend on the stability of strain capsular traits and the establishment of reliable genotypic-biochemical correlations. For routine clinical applications, the method needs to be adapted for a nonspecialist user, which depends on the creation of judicious databases under standardized conditions and automation of data analysis. This problem was partially solved by Bruker, who launched in June 2016 a dedicated FT-IR-based instrument (IR Biotyper) for routine outbreak detection using a simple and automated process.
MATERIALS AND METHODS
Bacterial strains.
One-hundred fifty-four well-characterized MDR
Genotypic and phenotypic characterization of surface polysaccharide structures.
In all isolates, PCR and sequencing of specific genetic markers were used for genotyping of K and O types. For the genotypic-based prediction of K types, we sequenced a 447-bp fragment from a highly variable region of wzi and, occasionally, specific wzy fragments (14, 15). Regarding O genotyping, specific regions of the wzm and wzt genes from the rfb cluster were amplified for O1/O2, O3, and O5 identification. Furthermore, an additional PCR was performed to distinguish O1 and O2 and its variants (designed in the wbbY locus, unlinked to the rfb cluster) (10, 17). Additionally, WGS was performed for 9 isolates for which discrepancies between genotypic and biochemical data were observed. WGS was performed by Illumina MiSeq (2× 300-bp paired-end runs, ∼6 Gb per genome, coverage 100×), and reads were assembled using SPAdes version 3.9.0 (cab.spbu.ru/software/spades/); the full cps locus was further annotated with Geneious R10 software (Biomatters Ltd., Auckland, New Zealand) considering the nomenclature proposed by Reeves et al. (43).
Biochemical characterization of surface bacterial components was performed using FT-IR spectroscopy with attenuated total reflectance (ATR) mode, as previously described (35, 36). Briefly, isolates were grown on Mueller-Hinton agar at 37°C for 18 h, and colonies were directly transferred from the agar plates to the ATR crystal and air-dried in a thin film. Spectra were acquired using a Perkin Elmer Spectrum BX FT-IR system spectrophotometer in the ATR mode with a PIKE Technologies Gladi ATR accessory from 4,000 to 600 cm−1, a resolution of 4 cm−1, and 32 scan coadditions. For each isolate, at least three instrumental replicates (obtained from the same agar plate in the same day) and three biological replicates (obtained in three independent days) were acquired and analyzed, corresponding to a minimum of nine spectra per strain (36, 44).
Spectral data analysis.
All chemometric analyses were performed using Matlab R2015a version 8.5 (MathWorks, Natick, MA) and PLS Toolbox version 8.5 for Matlab (Eigenvector Research, Manson, WA, USA). Original FT-IR spectra were processed with standard normal variate (SNV) followed by the application of a Savitzky-Golay filter (9 smoothing points, second-order polynomial, and second derivative) (45, 46). Prior to modeling with PLSDA, spectra were mean centered. Due to the amount of generated data and for simplification of the visualization, a mean spectrum of each isolate (resulting from at least nine congruent replicates validated by a principal-component analysis [PCA]-based internal script) was used. Spectra were analyzed by a supervised chemometric model (partial least-squares discriminant analysis [PLSDA]) using, for discriminatory purposes, the region of the spectra corresponding to the carbohydrate vibrations (W4, 1,200 to 900 cm−1) (31). PLSDA is based on the PLS regression method. In PLSDA models, we assign to each isolate spectrum (xi) a vector of zeros with the value of 1 at the position corresponding to its class (yi, ST or K type) in such a way that categorical variable values (yi) can be predicted for samples of unknown origin. Model loadings and the corresponding scores were obtained by sequentially extracting the components or latent variables (LVs) from matrices X (spectrum) and Y (matrix codifying K types). In PLSDA, a probability value for each assignment is estimated for each sample. The number of LVs was optimized using the leave-one-sample-out cross-validation procedure in order to prevent overfitting, considering only 70% of the available data (randomly selected). After optimization of the number of LVs, the model was tested on the remaining 30% of the samples in order to assess the proportion (%) of correct predictions for each class (36, 44, 47). We used a 1,000× bootstrap for this procedure to ensure the robustness of this internal validation.
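The preprocessing chain described above (SNV, Savitzky-Golay second derivative, selection of the 1,200 to 900 cm−1 window, mean centering, and class coding for PLSDA) can be illustrated with a short, dependency-free sketch. This is not the Matlab/PLS Toolbox pipeline used in the study: all function names are hypothetical, the spectrum is synthetic, the wavenumber axis is assumed to be evenly spaced, and the filter weights are the tabulated Savitzky-Golay coefficients for a 9-point window, second-order polynomial, and second derivative.

```python
from statistics import mean, stdev

# Tabulated Savitzky-Golay weights: 9 points, 2nd-order polynomial,
# 2nd derivative (normalization factor 462).
SG_D2_9PT = [w / 462 for w in (28, 7, -8, -17, -20, -17, -8, 7, 28)]
HALF = len(SG_D2_9PT) // 2


def snv(spectrum):
    """Standard normal variate: center and scale one spectrum."""
    m, s = mean(spectrum), stdev(spectrum)
    return [(v - m) / s for v in spectrum]


def sg_second_derivative(spectrum, step=1.0):
    """Second derivative; the 4 points at each edge are dropped."""
    out = []
    for i in range(HALF, len(spectrum) - HALF):
        window = spectrum[i - HALF:i + HALF + 1]
        out.append(sum(w * v for w, v in zip(SG_D2_9PT, window)) / step ** 2)
    return out


def preprocess(wavenumbers, spectrum, low=900.0, high=1200.0):
    """SNV, then second derivative, then keep the carbohydrate window."""
    step = abs(wavenumbers[1] - wavenumbers[0])
    deriv = sg_second_derivative(snv(spectrum), step)
    trimmed = wavenumbers[HALF:-HALF]
    return [d for wn, d in zip(trimmed, deriv) if low <= wn <= high]


def mean_center(rows):
    """Subtract the column mean from each variable (rows are isolates)."""
    col_means = [mean(col) for col in zip(*rows)]
    return [[v - m for v, m in zip(row, col_means)] for row in rows]


def one_hot(labels):
    """Y matrix for PLSDA: one column per class, 1 marks the class."""
    classes = sorted(set(labels))
    return classes, [[1 if lab == c else 0 for c in classes] for lab in labels]


# Synthetic check: a parabola has a constant second derivative.
wavenumbers = [600.0 + 2.0 * i for i in range(1701)]  # 600 to 4000 cm-1
spectrum = [1e-6 * (wn - 1000.0) ** 2 for wn in wavenumbers]
features = preprocess(wavenumbers, spectrum)
classes, y = one_hot(["K2", "K24", "K2"])
print(len(features), classes, y)
```

The resulting mean-centered feature matrix and the one-hot Y matrix are the inputs a PLS regression would take; fitting the PLS model itself and optimizing the number of latent variables are outside the scope of this sketch.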
The unsupervised method hierarchical cluster analysis (HCA) was also applied to evaluate the spectral similarity between isolates (and eventually to correlate clusters with K-type structures). The dendrograms were obtained using Ward’s algorithm, as previously described (36). Thirteen components were retained with a total accumulated variance of 96.32%. The same preprocessing and scaling used for PLSDA was used for HCA.
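As an illustration of the clustering step, the sketch below implements agglomerative clustering with Ward's criterion on a handful of points and cuts the resulting tree at a distance threshold. It is a didactic, cubic-time implementation under stated assumptions: the input is taken to be the retained component scores of each isolate, the function names are hypothetical, and the merge height is reported as sqrt(2Δ), where Δ is the increase in within-cluster sum of squares (so that two single points merge at their Euclidean distance). The distance scale of the published dendrogram may differ.

```python
from itertools import combinations
from math import sqrt


def ward_linkage(points):
    """Return merges as (members_a, members_b, height), in merge order."""
    clusters = {i: ([i], list(p)) for i, p in enumerate(points)}
    merges = []
    next_id = len(points)
    while len(clusters) > 1:
        best = None
        for a, b in combinations(clusters, 2):
            (ma, ca), (mb, cb) = clusters[a], clusters[b]
            na, nb = len(ma), len(mb)
            sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
            # Increase in within-cluster sum of squares if a and b merge.
            delta = na * nb / (na + nb) * sq_dist
            if best is None or delta < best[0]:
                best = (delta, a, b)
        delta, a, b = best
        ma, ca = clusters.pop(a)
        mb, cb = clusters.pop(b)
        na, nb = len(ma), len(mb)
        centroid = [(na * x + nb * y) / (na + nb) for x, y in zip(ca, cb)]
        clusters[next_id] = (ma + mb, centroid)
        merges.append((sorted(ma), sorted(mb), sqrt(2 * delta)))
        next_id += 1
    return merges


def cut(merges, n_points, threshold):
    """Flat clusters formed by applying only merges below `threshold`."""
    parent = list(range(n_points))

    def find(i):
        while parent[i] != i:
            i = parent[i]
        return i

    for members_a, members_b, height in merges:
        if height < threshold:
            parent[find(members_b[0])] = find(members_a[0])
    groups = {}
    for i in range(n_points):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical two-dimensional scores for four isolates.
scores = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
merges = ward_linkage(scores)
print(cut(merges, len(scores), threshold=2.0))
```

Cutting at a fixed height is the same operation as reading clusters off the dendrogram at a chosen distance, as done in Fig. 5A.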
The discriminatory ability of FT-IR spectroscopy compared to that of MLST, PFGE, wzi sequencing, and epidemiological data was measured using the Simpson’s index of diversity (SID) (Table S2). The congruence between the typing methods was calculated using the adjusted Wallace coefficient (Table S3). All calculations were conducted using the Comparing Partitions website (http://www.comparingpartitions.info/index.php?link=Tool). Pairwise comparisons were performed on data sets in which missing data (e.g., a K type could not be determined by one of the methods) were not considered.
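For readers who want to see how the adjusted coefficient relates to the plain Wallace coefficient, the sketch below applies the commonly used correction AW = (W − Wi)/(1 − Wi), where Wi = 1 − SID of the second method is the agreement expected if the two partitions were independent. The formula is stated here as an assumption, since the article does not spell it out; the function names and labels are hypothetical. Isolates with a missing type in either method are excluded pairwise, as described above.

```python
from collections import Counter


def same_type_pairs(labels):
    """Number of ordered isolate pairs that share a label."""
    return sum(c * (c - 1) for c in Counter(labels).values())


def adjusted_wallace(labels_a, labels_b):
    """Adjusted Wallace coefficient for the direction A -> B.

    Isolates with None in either method are dropped. Raises
    ZeroDivisionError if method A groups no pair of isolates or if
    method B assigns every isolate to a single type.
    """
    kept = [(a, b) for a, b in zip(labels_a, labels_b)
            if a is not None and b is not None]
    a_kept = [a for a, _ in kept]
    b_kept = [b for _, b in kept]
    n = len(kept)
    w = same_type_pairs(kept) / same_type_pairs(a_kept)
    sid_b = 1.0 - same_type_pairs(b_kept) / (n * (n - 1))
    w_independent = 1.0 - sid_b
    return (w - w_independent) / (1.0 - w_independent)


# Hypothetical labels; the last isolate is untyped by method B.
method_a = ["A", "A", "A", "B", "B", "C", "C"]
method_b = ["x", "x", "x", "y", "z", "w", None]

print(adjusted_wallace(method_a, method_b))
```

In this toy case the unadjusted coefficient is 0.75, and the adjustment lowers it because some agreement would be expected by chance alone.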
Data availability.
The sequences of the complete cps operon were deposited in the GenBank database under the accession numbers MG602975 to MG602982 and under the BioProject PRJNA408270. The sequence for K24 isolate (H1119) predicted by wzi sequencing with a recombinant K24/K39 cps locus was deposited in the GenBank database under accession number NXBK00000000.
b LAQV/REQUIMTE, Departamento de Ciências Químicas, Faculdade de Farmácia, Universidade do Porto, Porto, Portugal
c Research Institute for Medicines (iMed.ULisboa), Faculdade de Farmácia, Universidade de Lisboa, Lisbon, Portugal
University of California, San Diego
Copyright © 2020 Rodrigues et al. This work is published under https://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
ABSTRACT
Genomics-based population analysis of multidrug-resistant (MDR)
IMPORTANCE




