Introduction
The implementation of advanced technologies and algorithms for diagnosis and genome analysis has made a fundamental contribution to pathogens’ identification and investigation.
Methods
The study of non-tuberculous mycobacteria (NTM) has benefited from next-generation sequencing (NGS), which makes it possible to describe the sequences of rare pathogens. This study identified 20 diagnostically unidentified isolates as Mycobacterium saskatchewanense ST 691, an environmental NTM. The isolates were sequenced on two different platforms to compare their throughput and to investigate shared and unique single nucleotide polymorphism (SNP) counts, phylogeny based on concatenated 16S, hsp65, and rpoB genes, and core-genome multilocus sequence typing (cgMLST), in order to broaden the current knowledge of Mycobacterium saskatchewanense.
Results
Principal component analysis on the three genes combined with the mutations’ annotation suggests that rpoB may serve as a suitable marker to distinguish M. saskatchewanense from other NTM.
Discussion
Our results show that frontier studies performed using NGS can help in overcoming the limits of traditional diagnostic assays and deepen the knowledge on rare and uncommon NTM that are raising clinical concern.
1 Introduction
The term non-tuberculous mycobacteria (NTM) refers to a heterogeneous group that includes all mycobacterial species other than those classified in the Mycobacterium tuberculosis complex and Mycobacterium leprae. NTM are generally ubiquitous (Jiang et al., 2024) and can be isolated from different environmental reservoirs such as water and water systems, dust, soil, and related matrices (Bryant et al., 2016; Falkinham, 2021). Given this wide dissemination and the continuous expansion of potential sources of infection, the incidence of NTM infection is growing significantly worldwide. This trend may jeopardize fragile patients, in whom NTM bloodstream dissemination leads to organ dysfunction and diverse clinical conditions, posing a challenge for treatment and patient management (Abad and Razonable, 2016; Nagata et al., 2020). Despite the significant impact on human health, NTM diagnosis still requires well-defined criteria for identification when microbial culture is insufficient (Tomishima et al., 2022). Considering the ease of contracting the infection, an appropriate and accurate diagnostic approach is needed to identify uncommon environmental NTM (Khosravi et al., 2024). Indeed, to achieve a complete diagnosis, clinical, radiographic, and microbiological data must be combined. To date, the microbiological gold standard for detecting NTM is bacterial culture, which has shortcomings: in vitro isolation can be time-consuming and is influenced by low bacterial burden and by the wide and diverse range of growth temperatures among NTM species (Wang et al., 2022). These shortcomings are mainly reported for recently discovered NTM infections, when microbial culture has a low positivity rate and the microbial agent cannot be clearly distinguished by microscopy.
PCR-based techniques and serological assays could be considered alternative tests, but they require preliminary knowledge of the target microorganism (Huang et al., 2023). These limitations have been partly overcome by genotyping methods relying on pulsed-field gel electrophoresis, commercial DNA probes, polymerase chain reaction amplification, and restriction-enzyme analysis. In the same manner, early sequencing methods were based on the investigation of specific genes or regions, such as 16S rRNA, rpoB, hsp65, dnaJ, sodA, and the 16S-23S internal transcribed spacer (ITS) (Tenover et al., 1995; Maleki et al., 2017). The next step forward in NTM analysis was the application of multilocus sequence typing (MLST), core genome MLST (cgMLST), and next-generation sequencing (NGS) approaches that characterize a set of concatenated gene sequences to properly identify NTM species (Khosravi et al., 2024). More recently, next-generation whole-genome sequencing (WGS) has emerged as the method of choice, as it allows the evaluation of single-nucleotide polymorphisms (SNPs) and global diversity by defining SNPs associated with drug resistance, phylogenetic diversity, and differentiation of mixed infections (Dohál et al., 2021; Inkster et al., 2021; Lecorche et al., 2021; Davidson et al., 2022). Additionally, WGS data provide detailed insight into molecular epidemiology and the definition of clusters of infection (Dohál et al., 2021). The accuracy and reliability of the results push WGS toward a potential diagnostic application that is still under evaluation but could be pivotal in many clinical settings (Wang and Xing, 2023). A genome-driven approach provides a broader overview of the pathogen of interest, especially in the case of uncommon and rare infections.
In this scenario, bacterial isolates obtained from 20 positive blood cultures belonging to 20 individuals were investigated, leading to the identification of an environmental NTM, Mycobacterium saskatchewanense (Turenne et al., 2004). This study aims to highlight the importance of including M. saskatchewanense as a target in diagnostic assays and NTM surveillance in order to avoid misdiagnosing or underestimating the infection.
2 Materials and methods
2.1 Sampling
In 2023, 20 isolates of diagnostically unidentified NTM were recovered from positive blood cultures and sequenced at the Operative Unit of Microbiology of the Greater Romagna Hub Laboratory, Italy. Each isolate was recovered from an anonymized leftover clinical sample from one of 20 individuals. Initially, samples were processed during the daily diagnostic routine. Since no precise identification was obtained, the 20 isolates were processed using NGS.
2.2 Identification by hybridization
A first identification was performed with GenoType® Mycobacterium CM (Hain Lifescience, Nehren, Germany) according to the manufacturer’s instructions. Because the manufacturer instructs users to disregard bands that appear faint or show a lower color intensity than the control bands, and given the possible cross-reactions within the Mycobacterium avium complex (MAC) described for the GenoType® Mycobacterium CM test, the GenoType® Mycobacterium NTM-DR test (Hain Lifescience, Nehren, Germany) was performed as a second-level test.
2.3 MiSeq whole-genome next-generation sequencing and molecular identification
DNA extraction was performed using the NucleoMag® Pathogen Assay (Macherey-Nagel GmbH & Co. KG., Düren, Germany), designed for nucleic acid extraction and purification through paramagnetic beads. DNA was quantified on a Qubit™ fluorometer (Thermo Fisher Scientific Inc., Waltham, United States), and libraries were prepared using the Illumina DNA Prep kit (Illumina Inc., San Diego, United States). All the samples were sequenced on a MiSeq platform (Illumina Inc., San Diego, United States). After the FASTQ files were downloaded, read quality was checked using FastQC (Babraham Institute, Cambridge, United Kingdom). Given the dubious band pattern from the first identification level, reads were trimmed and mapped to the Mycobacterium tuberculosis reference genome (NCBI accession number: NC_000962.3) using DNASTAR Lasergene software (DNASTAR Inc., Madison, United States) to confirm the identification as NTM. When identifying the species, the reference genome of Mycobacterium saskatchewanense (accession number: GCF_010729105.1) was included in the analysis to detect mutations in the 16S and hsp65 genes and to calculate the Average Nucleotide Identity (ANI).
2.4 NextSeq whole-genome next-generation sequencing
Given the rarity of M. saskatchewanense, and to avoid errors due to the sequencing method of choice, 15 specimens (M02-23-03, M02-23-04, M03-23-03, M03-23-04, M04-23-01, M04-23-02, M04-23-03, M04-23-04, M05-23-01, M05-23-02, M05-23-03, M05-23-04, M06-23-01, M06-23-02, and M06-23-03) were also loaded on a NextSeq 2000 Illumina sequencer (Illumina Inc., San Diego, United States).
2.5 De novo genome assembly, genome annotation, and data acquisition
After trimming, reads were de novo assembled using SPAdes v.4.0.0 with the --careful option (Prjibelski et al., 2020). The quality of the assemblies was assessed both statistically, using QUAST v.5.2.0 (Mikheenko et al., 2023), and functionally, by evaluating genome completeness and the number of genes and their evolutionary relatedness to similar organisms, using BUSCO v.5.8.2 (Manni et al., 2021a). Only assemblies with a completeness of at least 95% were used for further analysis. To achieve chromosome-level genome assemblies, scaffolds were reference-based assembled using RagTag v.2.1.0 (Alonge et al., 2022). All assemblies were annotated with Prokka v.5.2.0 (Seemann, 2014). Once the 16S and hsp65 genes were identified in the isolates’ genomes, bacterial identification was performed with the Basic Local Alignment Search Tool (BLAST). MLST was performed on loci L16, L19, L35, S12, S14Z, S19, S8, and S7, and the Sequence Type (ST) was defined on the PubMLST.org website (Jolley et al., 2018; Dinh-Hung et al., 2024). The MiSeq and NextSeq 2000 genome assemblies were checked with ResFinder v.4.6.0 for antibiotic resistance mutations.
2.6 Cluster and phylogenetic analysis
Since no information is reported in the literature regarding the SNP limit to differentiate two clones of M. saskatchewanense, all samples sequenced on MiSeq and 10 samples (samples M02-23-01, M02-23-03, M02-23-04, M03-23-01, M03-23-02, M03-23-03, M03-23-04, M04-23-01, M04-23-02, and M05-23-04) sequenced on both MiSeq and NextSeq 2000 platforms were selected to define the error rate associated with the sequencing method of choice. Therefore, a cluster cgMLST analysis was performed on 5,711 genes using SeqSphere+ software (Ridom Bioinformatics GmbH, Germany).
To provide a comprehensive analysis, mutations across the complete genome were annotated in 15 samples sequenced on two Illumina platforms (samples M02-23-03, M02-23-04, M03-23-03, M03-23-04, M04-23-01, M04-23-02, M04-23-03, M04-23-04, M05-23-01, M05-23-02, M05-23-03, M05-23-04, M06-23-01, M06-23-02, and M06-23-03). Nucleotide mutation profiles were determined by aligning all sequences against the reference genome of M. saskatchewanense and subsequently annotated using Snipit v.1.6 (O’Toole et al., 2024).
The phylogenetic tree was constructed by concatenating the 16S, hsp65, and rpoB genes, and SNPs were annotated. Marker genes obtained from the genome assemblies were aligned using MAFFT v.7.525 (Katoh and Standley, 2013) with the --auto option. The resulting multiple-sequence alignment was examined and visualized with the Molecular Evolutionary Genetics Analysis (MEGA) software v.11.0.13 (Tamura et al., 2021). trimAl v.1.5rev0 (Capella-Gutiérrez et al., 2009) was then applied to refine the alignment using the gappyout option. Finally, phylogenetic analysis was performed using IQ-TREE v.2.1.4 (Nguyen et al., 2015), employing the maximum likelihood method with ultrafast bootstrap set to 1,000 replicates and automatic selection of the optimal nucleotide substitution model. The resulting tree was visualized with iTOL v.1.0 (Letunic and Bork, 2024). The rapid-growing Mycobacterium abscessus was selected as the outgroup for constructing the phylogenetic tree.
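The concatenation step described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name concatenate_alignments, the input layout (one dictionary of aligned sequences per gene), and the choice to pad a missing gene with gap characters are assumptions made for illustration only.

```python
def concatenate_alignments(gene_alignments, gap="-"):
    """Join per-gene alignments into one concatenated alignment.

    gene_alignments maps gene name -> {sample name: aligned sequence}.
    Genes are joined in insertion order; a sample lacking a gene is
    padded with gap characters (an assumption of this sketch).
    """
    samples = sorted({s for aln in gene_alignments.values() for s in aln})
    parts = {s: [] for s in samples}
    for gene, aln in gene_alignments.items():
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: sequences are not aligned to equal length")
        width = lengths.pop()
        for s in samples:
            parts[s].append(aln.get(s, gap * width))
    return {s: "".join(chunks) for s, chunks in parts.items()}


# Toy, hypothetical data: three short "genes" for two samples.
toy = {
    "16S": {"A": "ACGT", "B": "ACGA"},
    "hsp65": {"A": "GG", "B": "GC"},
    "rpoB": {"A": "TTT"},
}
concatenated = concatenate_alignments(toy)
```

The concatenated alignment could then be passed to the trimming and tree-building steps; keeping the gene order fixed makes the resulting column coordinates comparable between samples.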
2.7 Principal component analysis
The principal component analysis (PCA) was performed on the 16S, hsp65, and rpoB genes from the sequenced samples and from Mycobacterium intracellulare, Mycobacterium lentiflavum, Mycobacterium simiae, Mycobacterium palustre, Mycobacterium heidelbergense, Mycobacterium avium, Mycobacterium scrofulaceum, Mycobacterium intermedium, M. abscessus, and M. saskatchewanense. The sequences were divided into k-mers 5 nucleotides long using FastK v.1.1.0 (Myers, 2025), freely available on GitHub. PCA was performed using Stata Statistical Software: Release 18 (STATA, College Station, TX, StataCorp LLC).
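As an illustration of the idea behind a k-mer based PCA, the following Python sketch counts 5-mers in each sequence, converts them to relative frequencies, and projects the sequences onto the first principal component. It is a simplified stand-in, not the FastK and Stata workflow used in the study: the function names, the power-iteration approach, and the toy sequences are all assumptions.

```python
from itertools import product

K = 5
KMERS = ["".join(p) for p in product("ACGT", repeat=K)]  # 4**5 = 1024 k-mers
INDEX = {kmer: i for i, kmer in enumerate(KMERS)}


def kmer_frequencies(seq):
    """Relative frequency of every 5-mer; windows with non-ACGT bases are skipped."""
    counts = [0] * len(KMERS)
    seq = seq.upper()
    for i in range(len(seq) - K + 1):
        idx = INDEX.get(seq[i:i + K])
        if idx is not None:
            counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(KMERS)


def first_pc_scores(rows, iterations=100):
    """Scores of each row on the first principal component (power iteration)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Start from the centered row with the largest norm.
    v = list(max(centered, key=lambda r: sum(x * x for x in r)))
    for _ in range(iterations):
        scores = [sum(x * w for x, w in zip(row, v)) for row in centered]
        v = [sum(scores[i] * centered[i][j] for i in range(n)) for j in range(d)]
        norm = sum(x * x for x in v) ** 0.5
        if norm == 0:  # all rows identical: no variance to explain
            return [0.0] * n
        v = [x / norm for x in v]
    return [sum(x * w for x, w in zip(row, v)) for row in centered]


# Hypothetical toy sequences: two near-identical, one clearly different.
seq_a = "ACGTACGTACGTACGTACGTACGT"
seq_b = "ACGTACGTACGAACGTACGTACGT"
seq_c = "GGGGGCCCCCGGGGGCCCCCGGGG"
pc1 = first_pc_scores([kmer_frequencies(s) for s in (seq_a, seq_b, seq_c)])
```

In this toy case the two similar sequences receive close scores on the first component while the divergent one is separated, which is the kind of pattern the PCA plots are meant to reveal. The sign of the component is arbitrary.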
2.8 Statistics
The total number of contigs, N50, the total assembled length, the average coverage, the percentage of C-G bases, and the number of undefined bases obtained from the MiSeq and NextSeq 2000 assemblies were compared to determine which platform performed better for investigating unknown bacterial agents. The results of the cluster analysis were organized in a hierarchical cluster analysis dendrogram using STATA.
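For readers less familiar with these assembly statistics, a minimal Python sketch of how they can be derived from a list of contig sequences follows. The function name assembly_metrics is hypothetical, and computing the GC percentage over defined (non-N) bases only is an assumption of this sketch; dedicated tools such as QUAST may differ in the details. Average coverage is omitted because it requires read-mapping data.

```python
def assembly_metrics(contigs):
    """Basic assembly statistics from a list of contig sequences."""
    lengths = sorted((len(c) for c in contigs), reverse=True)
    total = sum(lengths)

    # N50: length of the contig at which the running sum of lengths,
    # taken from longest to shortest, first reaches half of the total.
    running, n50 = 0, 0
    for length in lengths:
        running += length
        if running * 2 >= total:
            n50 = length
            break

    joined = "".join(contigs).upper()
    gc = joined.count("G") + joined.count("C")
    undefined = joined.count("N")
    defined = total - undefined
    return {
        "contigs": len(lengths),
        "total_length": total,
        "n50": n50,
        "gc_percent": 100 * gc / defined if defined else 0.0,
        "undefined_bases": undefined,
    }


# Hypothetical toy assembly with three contigs.
metrics = assembly_metrics(["ACGTACGTAC", "GGCC", "ATN"])
```

Fewer contigs and a larger N50 for the same total length indicate a more contiguous assembly, which is the basis of the platform comparison reported in the Results.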
3 Results
3.1 Identification by hybridization
The identification performed with the GenoType® Mycobacterium CM assay revealed a peculiar banding pattern: band 9, typically associated with Mycobacterium intracellulare, was clearly present; bands 11 and 13 were weakly visible and did not match with any other species included in the assay (Figure 1). Similarly, the results of GenoType® Mycobacterium NTM-DR confirmed the identification of M. intracellulare species.
3.2 Identification by 16S and hsp65
The 16S sequence was identified in the assembled genomes and confirmed with BLAST: the comparison showed 100% identity between the reference M. saskatchewanense 16S rRNA sequence and the query genomes. The second-level identification inferred from the hsp65 gene confirmed this result.
3.3 Whole genome
In the MLST analysis, the locus pattern (L16 369; L19 331; L35 310; S12 318; S14Z 279; S19 278; S8 352; S7 missing) and ST 691 were the same for all the samples. Additionally, no resistance mutations were found in the genome assemblies from the MiSeq and NextSeq 2000 Illumina platforms. According to the BUSCO analysis, the overall gene score with respect to the M. saskatchewanense species was 99.3% for isolates sequenced on the NextSeq 2000 platform and 99.2% for those sequenced on MiSeq.
Because the MLST results were identical, the analysis was extended to the whole genome using an ANI approach between GCF_010729105.1 and the genome assemblies. ANI results showed that the samples’ diversity from the reference genome ranged from 0.035% to 0.04%, whereas the differences among specimens varied from 0.001% to 0.006%. Overall, the identity between the reference sequence and the query genomes was approximately 96%, whereas the identity among samples was approximately 99.9%. Interestingly, sample M03-23-03 had a lower percentage of identity when compared with the other samples (99.5%) (Supplementary Figure S1).
According to the phylogenetic analysis, the 20 isolates sequenced on the MiSeq platform clustered together as closely related (Figure 2).
Since no phylogenetic distance is reported in the literature to distinguish different clones of M. saskatchewanense, a cluster analysis graph was generated from the cgMLST for the 20 samples sequenced on the MiSeq Illumina platform. A total of 592 alleles were excluded from the study because they were not represented in the genome assemblies. The cgMLST showed that the sample pairs differing by one allele were M06-23-03/M06-23-02, M02-23-03/M03-23-04, M05-23-03/M06-23-01, and M05-23-01/M01-23-04. Conversely, samples M02-23-01 and M03-23-03 showed an average of 18.5 and 25.6 allele differences, respectively, when compared with the other isolates, making them the most divergent within the dataset. Considering the whole tested population, the average number of different alleles between isolates was 12.3 (Supplementary Figure S2).
3.4 MiSeq and NextSeq chemistry comparison
To evaluate the impact of the two Illumina platforms used in the study, the assembly metrics of the 15 samples were analyzed to define which platform is more suitable for sequencing uncommon mycobacteria. The parameters considered were N50, the total number of contigs, the total assembled length, the average coverage, the percentage of C-G bases, and the number of undefined bases (Supplementary Table S1). Given the different sequencing chemistry of MiSeq and NextSeq 2000, the number of mutations in the complete genome was compared, resulting in around 160,000 SNPs for both platforms. Although NextSeq 2000 generally annotated a higher number of SNPs, in three cases (samples M02-23-03, M04-23-01, and M05-23-02) the MiSeq assemblies contained more mutations. For samples M04-23-02, M04-23-03, M04-23-04, M05-23-01, and M05-23-03, the percentages of reported SNPs were comparable between platforms (Figure 3).
When examining SNP count differences, NextSeq 2000 annotated more mutations overall, with an average discrepancy of approximately 95 SNPs across the whole genome. The unique mutations for each sample were also considered. Most were AG, CG, CT, GA, and GC substitutions, whereas AT and TA were less frequent. Non-shared SNPs were distributed evenly across the genome in samples M02-23-01, M03-23-01, M03-23-04, M04-23-01, M04-23-02, M05-23-04, M06-23-01, and M06-23-03, whereas they were located at the ending positions (from 10^5 to 10^6 genome locations) in samples M02-23-03, M04-23-03, M04-23-04, M05-23-01, M05-23-02, M05-23-03, and M06-23-02. Notably, the mutations reported in MiSeq assemblies were located mainly in the ending genome positions (Supplementary Figure S3). Complete cgMLST concordance was obtained for sample M05-23-04. Samples M04-23-02 and M03-23-01 differed by one allele between platforms, whereas M02-23-04 sequenced on the MiSeq platform differed by two alleles from its counterpart on NextSeq 2000. Interestingly, sample M02-23-03 differed by one allele from M03-23-04 sequenced on MiSeq, by two alleles from M03-23-04 sequenced on NextSeq 2000, and by three alleles from M02-23-03 sequenced on NextSeq 2000. In the same manner, sample M03-23-04 differed by one allele from M02-23-03 sequenced on MiSeq, by two alleles from its NextSeq 2000 counterpart, and by four alleles from M03-23-01 and M04-23-02 sequenced on MiSeq (Supplementary Figure S4).
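The two SNP summaries used here, the substitution type (for example AG or CT) and the split into shared and platform-specific positions, can be sketched in Python as follows. This is only an illustration on aligned toy sequences: the function names and the input format are hypothetical, and the study itself annotated mutations with Snipit against the M. saskatchewanense reference.

```python
from collections import Counter


def substitution_spectrum(reference, sample):
    """Count substitution types (reference base + sample base) and record
    the 1-based positions of the SNPs. Positions where either sequence has
    a gap or an undefined base are ignored."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    spectrum, positions = Counter(), []
    for pos, (r, s) in enumerate(zip(reference.upper(), sample.upper()), start=1):
        if r != s and r in "ACGT" and s in "ACGT":
            spectrum[r + s] += 1
            positions.append(pos)
    return spectrum, positions


def shared_and_unique(miseq_positions, nextseq_positions):
    """Split SNP positions into shared and platform-specific sets."""
    miseq, nextseq = set(miseq_positions), set(nextseq_positions)
    return {
        "shared": sorted(miseq & nextseq),
        "miseq_only": sorted(miseq - nextseq),
        "nextseq_only": sorted(nextseq - miseq),
    }


# Hypothetical toy alignment of one sample against a reference.
spectrum, snp_positions = substitution_spectrum("ACGTACGT", "GCGTATGN")
split = shared_and_unique([10, 20, 30], [20, 30, 40])
```

Plotting the platform-specific positions along the genome coordinate is one way to see whether they are spread evenly or concentrated in a particular region.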
From the cgMLST, the mean number of different alleles detected in each sample was calculated for both sequencers. The samples were compared pairwise, resulting in 45 comparisons in total (10 samples, each compared with the other nine, with every pair counted once). The mean allele difference per sample was 15.55 for MiSeq and 15.73 for NextSeq 2000. Among the 45 comparisons, NextSeq 2000 identified more allele differences in 25 cases, whereas MiSeq reported more differences in 16 cases. Complete concordance was observed in four comparisons (samples M04-23-01/M02-23-03, M02-23-01/M03-23-02, M02-23-01/M03-23-03, and M03-23-04/M03-23-03). Therefore, considering the difference between the sequencers, the error rate associated with the technique is +/− 1.8 alleles. The use of two different sequencing platforms did not affect the inferred phylogenetic diversity (Supplementary Figure S5).
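The pairwise comparison logic can be made concrete with a short Python sketch. It assumes that each cgMLST profile is a dictionary mapping locus to allele number, with None for a missing allele, and that missing alleles are ignored pairwise; these conventions, the function names, and the toy profiles are assumptions and may differ from how SeqSphere+ handles missing data.

```python
from itertools import combinations


def allele_differences(profile_a, profile_b):
    """Number of loci with different alleles, ignoring loci missing in either profile."""
    shared_loci = profile_a.keys() & profile_b.keys()
    return sum(
        1
        for locus in shared_loci
        if profile_a[locus] is not None
        and profile_b[locus] is not None
        and profile_a[locus] != profile_b[locus]
    )


def pairwise_differences(profiles):
    """Allele differences for every unordered pair of samples."""
    return {
        (a, b): allele_differences(profiles[a], profiles[b])
        for a, b in combinations(sorted(profiles), 2)
    }


def compare_platforms(miseq_pairs, nextseq_pairs):
    """Tally, pair by pair, which platform reported more allele differences."""
    tally = {"nextseq_more": 0, "miseq_more": 0, "concordant": 0}
    for pair in miseq_pairs.keys() & nextseq_pairs.keys():
        if nextseq_pairs[pair] > miseq_pairs[pair]:
            tally["nextseq_more"] += 1
        elif nextseq_pairs[pair] < miseq_pairs[pair]:
            tally["miseq_more"] += 1
        else:
            tally["concordant"] += 1
    return tally


# Ten hypothetical samples give 10 * 9 / 2 = 45 unordered pairs.
toy_profiles = {f"S{i:02d}": {"locus1": i % 3, "locus2": 1} for i in range(10)}
toy_pairs = pairwise_differences(toy_profiles)
```

With 10 samples, combinations yields exactly the 45 unordered pairs mentioned in the text, and compare_platforms produces the kind of three-way tally reported above (more on NextSeq 2000, more on MiSeq, concordant).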
3.5 Principal component analysis and SNP annotation
The PCA performed on 16S, hsp65, and rpoB highlighted that the latter gene can be used to differentiate M. saskatchewanense from other NTM species. Although both 16S and rpoB obtained from the 20 samples clustered together and closely to the M. saskatchewanense reference genome, rpoB showed a tighter association and a clear separation from other NTM sequences (Figure 4).
SNP analysis of the three genes confirmed that rpoB offers a more distinct clustering of M. saskatchewanense compared with 16S and hsp65 (Figure 5; Supplementary Figure S6).
4 Discussion
Accurate identification of infectious agents is crucial for diagnosis, epidemiological investigations, and clinical intervention. The presence of multiple species belonging to the same genus or complex can further complicate diagnostic interpretation and hinder the reconstruction of infection patterns (Moghaddam et al., 2022). In the field of mycobacteria research, NTMs were long considered environmental contaminants and remain largely neglected (Gcebe and Hlokwe, 2017). More recently, NTMs have increasingly been reported as infectious agents in immunocompromised patients, and novel Mycobacterium strains have been discovered (Iversen et al., 2025). Traditional microbial culture has several limitations; therefore, the introduction of PCR-based techniques, NGS, and analysis algorithms has expanded the knowledge of uncommon NTMs, overcoming these limits for clinical and laboratory applications (Ghodousi et al., 2023; Huang et al., 2023). Indeed, technical challenges in NTM identification are well reported in the literature and often delay diagnosis and increase costs due to prolonged hospitalization and treatment (Lee et al., 2023). Sequencing of conserved regions such as the 16S and hsp65 genes has proven valuable in supporting the interpretation of diagnostic tests, particularly in ambiguous cases (Tortoli, 2014; Solaghani et al., 2024). More recently, NGS and WGS have improved laboratory and diagnostic capabilities, enabling the analysis of hundreds of NTM genomes and providing a comprehensive approach for investigating genetic diversity and antibiotic resistance. The application of WGS in clinical settings may therefore help overcome current diagnostic and culture-associated limitations (He et al., 2020). Given the previously uncertain diagnostic results, the authors sought to investigate the genomic diversity of 20 M. saskatchewanense isolates to expand the scientific knowledge about this rare and recently described NTM species (Turenne et al., 2004).
To achieve this, an MLST approach was used as a first-level method to characterize M. saskatchewanense and investigate evolution patterns (Wuzinski et al., 2019). All isolates were classified in ST 691. To the authors’ knowledge, no specific MLST pattern or ST had been previously reported for M. saskatchewanense. Interestingly, allele S7 was absent in all the isolates, suggesting the need for further studies on the genomic organization of M. saskatchewanense to define possible alternative locus sequences. This represents the first report describing the MLST pattern and ST of M. saskatchewanense. Nevertheless, the analysis of multiple genes through WGS remains the preferred approach for resolving the phylogeny of unknown or poorly studied NTM (Fedrizzi et al., 2017). For environmental NTMs, a precise identification and characterization of closely related species is not always possible, and genomic studies often lack a standardized methodology. Therefore, the investigation of genetic diversity and the impact of NGS technology are still open to debate (Fedrizzi et al., 2017; Mugetti et al., 2021). To distinguish clonal populations, cgMLST has been applied in mycobacteria studies, complementing the MLST approach and achieving highly consistent results in phylogenetic analysis (Menghwar et al., 2022). In our study, the cgMLST cluster analysis identified approximately 97.8% of M. saskatchewanense-specific genes across the 20 genome assemblies (Manni et al., 2021b). This result also correlated with the percentage of whole-genome identity between the isolates and the reference sequence. Unlike the 98% identity threshold used for other NTM (Behra et al., 2022), M. saskatchewanense isolates can be classified as the same species even when the percentage of identity is nearly 96% (Turenne, 2019).
The concatenated analysis of the 16S, hsp65, and rpoB genes performed on MiSeq and NextSeq 2000 genome assemblies revealed no differences in phylogenetic tree arrangement, confirming the robustness of the method. The application of PCA to mass spectrometry spectra has already been used in mycobacteria identification, proving to be a valid method to differentiate between species and improve classification accuracy (Kehrmann et al., 2016; Hanson et al., 2017). Following this rationale, in our study, PCA was applied to genome sequences to further investigate the clustering patterns observed through concatenated-gene phylogenetics. PCA based on the 16S and rpoB genes successfully clustered different NTM species, whereas hsp65 resulted in a more scattered distribution. Interestingly, the PCA results were supported by the SNP analysis, which confirmed rpoB as the most suitable gene for distinguishing M. saskatchewanense. This suggests that, in case of ambiguous diagnostic results for M. intracellulare, a molecular confirmation test targeting rpoB can be implemented. The sequence diversity between the reference genome and the isolates was approximately 3%, consistent with the literature (Turenne, 2019). The cgMLST results aligned with the phylogenetic data, suggesting high concordance between the two approaches. Since the molecular surveillance of M. tuberculosis is commonly performed using cgMLST to ensure inter-laboratory comparability (Diricks et al., 2022), our findings suggest that the same molecular method can be applied to M. saskatchewanense. Assessing the number and distribution of SNPs across the whole genome is necessary for proper genomic characterization. This study provides a proof of concept of how a sequencer’s chemistry can affect the final molecular epidemiological results. NextSeq 2000 generally reveals mutations along the whole genome, whereas MiSeq concentrates SNPs at the end of the sequence.
NextSeq 2000 also recorded a higher overall SNP count, with an estimated average technical variation of approximately 1.8 alleles between platforms. The higher efficiency of NextSeq 2000 was confirmed by comparing its assembly outputs with those obtained from MiSeq (Browne et al., 2020). Despite the similar C-G content, the NextSeq 2000 assemblies presented fewer contigs and a higher N50 (Gurevich et al., 2013). The SNP analysis of the 15 samples sequenced on both platforms corroborated the previous data: NextSeq 2000 revealed approximately 95 more SNPs per sample than MiSeq across the whole genome. Focusing on the unique SNPs, mutations were evenly distributed across the genome with the NextSeq 2000 approach. This study is relevant to the Italian context, where environmental and clinical occurrences of M. saskatchewanense have been reported in recent years. In 2019, the first Italian infection with M. saskatchewanense was documented in a solid-organ transplant patient with chronic renal disease (Mento et al., 2019). Subsequently, M. saskatchewanense was identified at the environmental level in 17 dialysis fluids from the Emilia-Romagna region (Bisognin et al., 2023). This draws attention to the lack of proper surveillance for environmental contamination. A second Italian study screened 722 ultrapure dialysis fluid samples in the same region and detected 35 positive cases. The isolates were then sequenced and analyzed, highlighting the importance of dedicated environmental screening and disinfection (Cannas et al., 2024). Our findings are consistent with these reports and strongly support the inclusion of M. saskatchewanense in diagnostic and surveillance panels. Moreover, the growing interest in NTM, and in M. saskatchewanense in particular, is reflected by the inclusion of the pathogen in a genomic comparison analysis for NTM identification (van Ingen et al., 2009).
The methodology and threshold proposed in this analysis may improve diagnostic interpretation. The application of WGS in genomic epidemiology remains under discussion. Collaborative efforts between laboratories are recommended to integrate genomic data for European surveillance and preparedness strategies (Struelens and Brisse, 2013). Meehan et al. highlighted the need to validate and standardize WGS protocols, bioinformatic pipelines, and phylogenetic analysis for M. tuberculosis investigation in both clinical and public health settings (Meehan et al., 2019). Similar standardization is also essential for NTM. Our study can contribute to this effort by defining technical limitations associated with current sequencing chemistries. Finally, given the limited availability of complete M. saskatchewanense genomes in public databases, our study contributes by depositing the raw data of 20 isolates, which can be reassembled and compared in further investigations. Continued pathogen discovery and genome characterization expand knowledge and improve the application of molecular techniques. To date, scientific literature on NTMs remains limited; therefore, this study attempted to introduce a new perspective on the diagnostics and genomic analysis of M. saskatchewanense. Library preparation protocols and bioinformatic pipelines should be implemented and standardized to advance NTM research. Genomic analysis provides a foundation for the development of diagnostic assays aimed at preventing misdiagnosis and strengthening the surveillance of rare and atypical NTM species.
Data availability statement
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: https://www.ncbi.nlm.nih.gov/, PRJNA970246.
Author contributions
GG: Software, Methodology, Investigation, Conceptualization, Writing – original draft, Formal analysis. LI: Conceptualization, Investigation, Software, Writing – original draft, Formal analysis. GD: Conceptualization, Validation, Methodology, Writing – review & editing. SZ: Conceptualization, Writing – review & editing, Validation, Methodology. FT: Writing – review & editing, Resources. CC: Resources, Writing – review & editing. LD: Resources, Writing – review & editing. AM: Visualization, Writing – review & editing. MM: Writing – review & editing, Visualization. AD: Investigation, Writing – review & editing. FC: Investigation, Writing – review & editing. LG: Writing – review & editing, Data curation. MB: Visualization, Writing – review & editing. MG: Writing – review & editing, Visualization. AD: Data curation, Writing – review & editing. AS: Writing – review & editing. MC: Funding acquisition, Supervision, Writing – review & editing. VS: Supervision, Funding acquisition, Writing – review & editing, Project administration.
Acknowledgments
The authors thank Dr. Michele Proietto of the Department of Civil, Chemical, Environmental, and Materials Engineering at University of Bologna, for his invaluable support throughout the research process. The HPC resources offered were extremely helpful and have been instrumental in completing this research project successfully. The authors acknowledge the CINECA award under the ISCRA initiative, for the availability of high-performance computing resources and support, which enabled part of the analyses conducted in this study.
Conflict of interest
The authors declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The author(s) declared that generative AI was not used in the creation of this manuscript.
Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Supplementary material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fcimb.2025.1685898/full#supplementary-material
Supplementary Table 1
Assembly parameters of the samples sequenced on the MiSeq and NextSeq 2000 Illumina platforms. The parameters considered in the analysis are the number of contigs, N50, the total assembled length, the average coverage, the percentage of C-G bases, and the number of undefined bases. #: number.
Abad C. L., Razonable R. R. (2016). Non-tuberculous mycobacterial infections in solid organ transplant recipients: An update. J. Clin. Tuberc Other Mycobact Dis. 4, 1–8. doi: 10.1016/j.jctube.2016.04.001, PMID: 31723683
Alonge M., Lebeigle L., Kirsche M., Jenike K., Ou S., Aganezov S., et al. (2022). Automated assembly scaffolding using RagTag elevates a new tomato system for high-throughput genome editing. Genome Biol. 23, 258. doi: 10.1186/s13059-022-02823-7, PMID: 36522651
Behra P. R. K., Pettersson B. M. F., Ramesh M., Das S., Dasgupta S., Kirsebom L. A. (2022). Comparative genome analysis of mycobacteria focusing on tRNA and non-coding RNA. BMC Genomics. 23, 704. doi: 10.1186/s12864-022-08927-5, PMID: 36243697
Bisognin F., Ferraro V., Sorella F., Lombardi G., Lazzarotto T., Dal Monte P. (2023). First isolation of Mycobacterium saskatchewanense from medical devices. Sci. Rep. 13, 21628. doi: 10.1038/s41598-023-48974-w, PMID: 38062133
Browne P. D., Nielsen T. K., Kot W., Aggerholm A., Gilbert M. T. P., Puetz L., et al. (2020). GC bias affects genomic and metagenomic reconstructions, underrepresenting GC-poor organisms. Gigascience. 9, giaa008. doi: 10.1093/gigascience/giaa008, PMID: 32052832
Bryant J. M., Grogono D. M., Rodriguez-Rincon D., Everall I., Brown K. P., Moreno P., et al. (2016). Emergence and spread of a human-transmissible multidrug-resistant nontuberculous mycobacterium. Science. 354, 751–757. doi: 10.1126/science.aaf8156, PMID: 27846606
Cannas A., Messina F., Dal Monte P., Bisognin F., Dirani G., Zannoli S., et al. (2024). Sanitary waters: is it worth looking for Mycobacteria? Microorganisms. 12, 1953. doi: 10.3390/microorganisms12101953, PMID: 39458263
Capella-Gutiérrez S., Silla-Martínez J. M., Gabaldón T. (2009). trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics. 25, 1972–1973. doi: 10.1093/bioinformatics/btp348, PMID: 19505945
Davidson R. M., Nick S. E., Kammlade S. M., Vasireddy S., Weakly N., Hasan N. A., et al. (2022). Genomic analysis of a hospital-associated outbreak of Mycobacterium abscessus: implications on transmission. J. Clin. Microbiol. 60, e0154721. doi: 10.1128/JCM.01547-21, PMID: 34705540
Dinh-Hung N., Mwamburi S. M., Dong H. T., Rodkhum C., Meemetta W., Linh N. V., et al. (2024). Unveiling Insights into the Whole Genome Sequencing of Mycobacterium spp. Isolated from Siamese Fighting Fish (Betta splendens). Animals. 14, 2833. doi: 10.3390/ani14192833, PMID: 39409782
Diricks M., Merker M., Wetzstein N., Kohl T. A., Niemann S., Maurer F. P. (2022). Delineating Mycobacterium abscessus population structure and transmission employing high-resolution core genome multilocus sequence typing. Nat. Commun. 13, 4936. doi: 10.1038/s41467-022-32122-5, PMID: 35999208
Dohál M., Porvazník I., Solovič I., Mokrý J. (2021). Whole genome sequencing in the management of non-tuberculous mycobacterial infections. Microorganisms. 9, 2237. doi: 10.3390/microorganisms9112237, PMID: 34835363
Falkinham J. O. (2021). Ecology of nontuberculous mycobacteria. Microorganisms. 9, 2262. doi: 10.3390/microorganisms9112262, PMID: 34835388
Fedrizzi T., Meehan C. J., Grottola A., Giacobazzi E., Fregni Serpini G., Tagliazucchi S., et al. (2017). Genomic characterization of nontuberculous mycobacteria. Sci. Rep. 7, 45258. doi: 10.1038/srep45258, PMID: 28345639
Gcebe N., Hlokwe T. M. (2017). Non-tuberculous Mycobacteria in South African wildlife: neglected pathogens and potential impediments for bovine tuberculosis diagnosis. Front. Cell Infect. Microbiol. 7, 15. doi: 10.3389/fcimb.2017.00015, PMID: 28194371
Ghodousi A., Darban-Sarokhalil D., Shahraki A. H., Shanmugam S. (2023). Editorial: Genomics-based strategies for advanced drug resistance and epidemiological surveillance of Mycobacterium tuberculosis and other non-tuberculous mycobacteria. Front. Microbiol. 14, 1264285. doi: 10.3389/fmicb.2023.1264285, PMID: 37680533
Gurevich A., Saveliev V., Vyahhi N., Tesler G. (2013). QUAST: quality assessment tool for genome assemblies. Bioinformatics. 29, 1072–1075. doi: 10.1093/bioinformatics/btt086, PMID: 23422339
Hanson C., Sieverts M., Vargis E. (2017). Effect of principal component analysis centering and scaling on classification of Mycobacteria from Raman spectra. Appl. Spectrosc. 71, 1249–1255. doi: 10.1177/0003702816678867, PMID: 27888200
He Y., Gong Z., Zhao X., Zhang D., Zhang Z. (2020). Comprehensive determination of Mycobacterium tuberculosis and nontuberculous Mycobacteria from targeted capture sequencing. Front. Cell Infect. Microbiol. 10, 449. doi: 10.3389/fcimb.2020.00449, PMID: 32984073
Huang Y. Y., Li Q. S., Li Z. D., Sun A. H., Hu S. P. (2023). Rapid diagnosis of Mycobacterium marinum infection using targeted nanopore sequencing: a case report. Front. Cell Infect. Microbiol. 13, 1238872. doi: 10.3389/fcimb.2023.1238872, PMID: 37965260
Inkster T., Peters C., Seagar A. L., Holden M. T. G., Laurenson I. F. (2021). Investigation of two cases of Mycobacterium chelonae infection in haemato-oncology patients using whole-genome sequencing and a potential link to the hospital water supply. J. Hosp. Infection. 114, 111–116. doi: 10.1016/j.jhin.2021.04.028, PMID: 33945838
Iversen X. E. S., Rasmussen E. M., Folkvardsen D. B., Svensson E., Meehan C. J., Jørgensen R., et al. (2025). Four novel nontuberculous mycobacteria species: Mycobacterium wendilense sp. nov., Mycobacterium burgundiense sp. nov., Mycobacterium kokjensenii sp. nov. and Mycobacterium holstebronense sp. nov. revived from a historical Danish strain collection. Int. J. Syst. Evol. Microbiol. 75, 006620. doi: 10.1099/ijsem.0.006620, PMID: 39773688
Jiang X., Xue Y., Men P., Zhao L., Jia J., Yu X., et al. (2024). Nontuberculous mycobacterial disease in children: A systematic review and meta-analysis. Heliyon. 10, e31757. doi: 10.1016/j.heliyon.2024.e31757, PMID: 38845977
Jolley K. A., Bray J. E., Maiden M. C. J. (2018). Open-access bacterial population genomics: BIGSdb software, the PubMLST.org website and their applications. Wellcome Open Res. 3, 124. doi: 10.12688/wellcomeopenres.14826.1, PMID: 30345391
Katoh K., Standley D. M. (2013). MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. 30, 772–780. doi: 10.1093/molbev/mst010, PMID: 23329690
Kehrmann J., Wessel S., Murali R., Hampel A., Bange F. C., Buer J., et al. (2016). Principal component analysis of MALDI TOF MS mass spectra separates M. abscessus (sensu stricto) from M. massiliense isolates. BMC Microbiol. 16, 24. doi: 10.1186/s12866-016-0636-4, PMID: 26926762
Khosravi A. D., Meghdadi H., Hashemzadeh M., Alami A., Tabandeh M. R. (2024). Application of a new designed high resolution melting analysis for mycobacterial species identification. BMC Microbiol. 24, 205. doi: 10.1186/s12866-024-03361-x, PMID: 38851713
Lecorche E., Daniau C., La K., Mougari F., Benmansour H., Kumanski S., et al. (2021). Mycobacterium chimaera Genomics With Regard to Epidemiological and Clinical Investigations Conducted for an Open Chest Postsurgical Mycobacterium chimaera Infection Outbreak. Open Forum Infect. Dis. 8, ofab192. doi: 10.1093/ofid/ofab192, PMID: 34189167
Lee S. W., Chang S., Park Y., Kim S., Sohn H., Kang Y. A. (2023). Healthcare use and medical cost before and after diagnosis of nontuberculous mycobacterial infection in Korea: the National Health Insurance Service-National Sample Cohort Study. Ther. Adv. Respir. Dis. 17, 17534666221148660. doi: 10.1177/17534666221148660, PMID: 36800913
Letunic I., Bork P. (2024). Interactive Tree of Life (iTOL) v6: recent updates to the phylogenetic tree display and annotation tool. Nucleic Acids Res. 52, W78–W82. doi: 10.1093/nar/gkae268, PMID: 38613393
Maleki M. R., Kafil H. S., Harzandi N., Moaddab S. R. (2017). Identification of nontuberculous mycobacteria isolated from hospital water by sequence analysis of the hsp65 and 16S rRNA genes. J. Water Health. 15, 766–774. doi: 10.2166/wh.2017.046, PMID: 29040079
Manni M., Berkeley M. R., Seppey M., Simão F. A., Zdobnov E. M. (2021a). BUSCO update: novel and streamlined workflows along with broader and deeper phylogenetic coverage for scoring of eukaryotic, prokaryotic, and viral genomes. Mol. Biol. Evol. 38, 4647–4654. doi: 10.1093/molbev/msab199, PMID: 34320186
Manni M., Berkeley M. R., Seppey M., Zdobnov E. M. (2021b). BUSCO: assessing genomic data quality and beyond. Curr. Protoc. 1, e323. doi: 10.1002/cpz1.323, PMID: 34936221
Meehan C. J., Goig G. A., Kohl T. A., Verboven L., Dippenaar A., Ezewudo M., et al. (2019). Whole genome sequencing of Mycobacterium tuberculosis: current standards and open issues. Nat. Rev. Microbiol. 17, 533–545. doi: 10.1038/s41579-019-0214-5, PMID: 31209399
Menghwar H., Guo A., Chen Y., Lysnyansky I., Parker A. M., Prysliak T., et al. (2022). A core genome multilocus sequence typing (cgMLST) analysis of Mycoplasma bovis isolates. Vet. Microbiol. 273, 109532. doi: 10.1016/j.vetmic.2022.109532, PMID: 35987183
Mento G. D., Carreca A. P., Monaco F., Cuscino N., Cardinale F., Conaldi P. G., et al. (2019). Mycobacterium saskatchewanense strain associated with a chronic kidney disease patient in an Italian transplantation hospital and almost misdiagnosed as Mycobacterium tuberculosis. Infection Control Hosp. Epidemiol. 40, 496–497. doi: 10.1017/ice.2019.6, PMID: 30767828
Mikheenko A., Saveliev V., Hirsch P., Gurevich A. (2023). WebQUAST: online evaluation of genome assemblies. Nucleic Acids Res. 51, W601–W606. doi: 10.1093/nar/gkad406, PMID: 37194696
Moghaddam S., Nojoomi F., Dabbagh Moghaddam A., Mohammadimehr M., Sakhaee F., Masoumi M., et al. (2022). Isolation of nontuberculous mycobacteria species from different water sources: a study of six hospitals in Tehran, Iran. BMC Microbiol. 22, 261. doi: 10.1186/s12866-022-02674-z, PMID: 36309645
Mugetti D., Tomasoni M., Pastorino P., Esposito G., Menconi V., Dondo A., et al. (2021). Gene sequencing and phylogenetic analysis: powerful tools for an improved diagnosis of fish mycobacteriosis caused by Mycobacterium fortuitum group members. Microorganisms. 9, 797. doi: 10.3390/microorganisms9040797, PMID: 33920196
Myers E. W. Jr. (2025). thegenemyers/FASTK. Available online at: https://github.com/thegenemyers/FASTK (Accessed May 20, 2025).
Nagata A., Sekiya N., Najima Y., Horiuchi M., Fukushima K., Toya T., et al. (2020). Nontuberculous mycobacterial bloodstream infections after allogeneic hematopoietic stem cell transplantation. Int. J. Infect. Dis. 97, 131–134. doi: 10.1016/j.ijid.2020.05.079, PMID: 32474198
Nguyen L. T., Schmidt H. A., von Haeseler A., Minh B. Q. (2015). IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol. Biol. Evol. 32, 268–274. doi: 10.1093/molbev/msu300, PMID: 25371430
O’Toole Á., Aziz A., Maloney D. (2024). Publication-ready single nucleotide polymorphism visualization with snipit. Bioinformatics. 40, btae510. doi: 10.1093/bioinformatics/btae510, PMID: 39137137
Prjibelski A., Antipov D., Meleshko D., Lapidus A., Korobeynikov A. (2020). Using SPAdes de novo assembler. Curr. Protoc. Bioinf. 70, e102. doi: 10.1002/cpbi.102, PMID: 32559359
Seemann T. (2014). Prokka: rapid prokaryotic genome annotation. Bioinformatics. 30, 2068–2069. doi: 10.1093/bioinformatics/btu153, PMID: 24642063
Solaghani T. H., Nazari R., Mosavari N., Tadayon K., Zolfaghari M. R. (2024). Isolation and identification of nontuberculous mycobacteria from raw milk and traditional cheese based on the 16S rRNA and hsp65 genes, Tehran, Iran. Folia Microbiol. 69, 81–89. doi: 10.1007/s12223-023-01073-9, PMID: 37507582
Struelens M. J., Brisse S. (2013). From molecular to genomic epidemiology: transforming surveillance and control of infectious diseases. Eurosurveillance. 18, 20386. doi: 10.2807/ese.18.04.20386-en, PMID: 23369387
Tamura K., Stecher G., Kumar S. (2021). MEGA11: molecular evolutionary genetics analysis version 11. Mol. Biol. Evol. 38, 3022–3027. doi: 10.1093/molbev/msab120, PMID: 33892491
Tenover F. C., Arbeit R. D., Goering R. V., Mickelsen P. A., Murray B. E., Persing D. H., et al. (1995). Interpreting chromosomal DNA restriction patterns produced by pulsed-field gel electrophoresis: criteria for bacterial strain typing. J. Clin. Microbiol. 33, 2233–2239. doi: 10.1128/jcm.33.9.2233-2239.1995, PMID: 7494007
Tomishima Y., Urayama K. Y., Kitamura A., Okafuji K., Jinta T., Nishimura N., et al. (2022). Bronchoscopy for the diagnosis of nontuberculous mycobacterial pulmonary disease: Specificity and diagnostic yield in a retrospective cohort study. Respir. Invest. 60, 355–363. doi: 10.1016/j.resinv.2021.11.012, PMID: 34998716
Tortoli E. (2014). Microbiological features and clinical relevance of new species of the genus Mycobacterium. Clin. Microbiol. Rev. 27, 727–752. doi: 10.1128/CMR.00035-14, PMID: 25278573
Turenne C. Y. (2019). Nontuberculous mycobacteria: Insights on taxonomy and evolution. Infection Genet. Evolution. 72, 159–168. doi: 10.1016/j.meegid.2019.01.017, PMID: 30654178
Turenne C. Y., Thibert L., Williams K., Burdz T. V., Cook V. J., Wolfe J. N., et al. (2004). Mycobacterium saskatchewanense sp. nov., a novel slowly growing scotochromogenic species from human clinical isolates related to Mycobacterium interjectum and Accuprobe-positive for Mycobacterium avium complex. Int. J. Syst. Evol. Microbiol. 54, 659–667. doi: 10.1099/ijs.0.02739-0, PMID: 15143004
van Ingen J., Lindeboom J. A., Hartwig N. G., de Zwaan R., Tortoli E., Dekhuijzen P. N. R., et al. (2009). Mycobacterium mantenii sp. nov., a pathogenic, slowly growing, scotochromogenic species. Int. J. Syst. Evol. Microbiol. 59, 2782–2787. doi: 10.1099/ijs.0.010405-0, PMID: 19625425
Wang S., Xing L. (2023). Metagenomic next-generation sequencing assistance in identifying non-tuberculous mycobacterial infections. Front. Cell Infect. Microbiol. 13, 1253020. doi: 10.3389/fcimb.2023.1253020, PMID: 37719673
Wang J., Xu H., Wang X., Lan J. (2022). Rapid diagnosis of non-tuberculous mycobacterial pulmonary diseases by metagenomic next-generation sequencing in non-referral hospitals. Front. Cell Infect. Microbiol. 12, 1083497. doi: 10.3389/fcimb.2022.1083497, PMID: 36760234
Wuzinski M., Bak A. K., Petkau A., Demczuk W. H. B., Soualhine H., Sharma M. K. (2019). A multilocus sequence typing scheme for Mycobacterium abscessus complex (MAB-multilocus sequence typing) using whole-genome sequencing data. Int. J. Mycobacteriol. 8, 273–280. doi: 10.4103/ijmy.ijmy_106_19, PMID: 31512604
© 2025. This work is licensed under https://creativecommons.org/licenses/by/4.0/.