Amplicon sequencing detects, identifies, and

INTRODUCTION

Species of Cryptosporidium, a genus of enteric protozoan pathogens, are leading causes of waterborne diarrheal diseases in humans and domesticated animals worldwide (1–4). Both the Global Enteric Multicenter Study (GEMS) and the Study of Malnutrition and Enteric Diseases showed that cryptosporidiosis is a major cause of morbidity and mortality in children under five globally, leading to an estimated 48,000 deaths annually (1, 5, 6). Infection typically occurs by consuming contaminated food or water. In low- and middle-income countries, endemic exposure derives from poor sanitation and hygiene (7, 8). Cryptosporidium also continues to pose a substantial public health challenge in affluent nations, causing waterborne or foodborne outbreaks and direct transmission within daycare facilities, hospitals, and other institutional settings (9–11). Cryptosporidium oocysts are highly resistant to the chlorine-based disinfectants commonly used in water treatment and pose a challenge for standard filtration and water management (12). As a result, even high-income countries experience outbreaks from contaminated recreational and drinking water (13).

There are over 45 recognized species and more than 100 genotypes of Cryptosporidium (14, 15). While nearly 20 species have been reported in humans, most infections are caused by C. parvum and C. hominis (15, 16). Species identification typically employs the small subunit ribosomal RNA (SSU, 18S rRNA) gene (17, 18). This locus includes both highly conserved regions amenable to universal primer design and variable regions capable of distinguishing between species. However, 18S rDNA copies in some genomes evolve through a birth-and-death process, leading to sequence divergence among copies (paralogs). Some genomes, including that of Cryptosporidium parvum, possess more than one divergent 18S rDNA copy distributed across multiple chromosomes. Other loci, such as actin, heat shock protein 70 (HSP70), and cell outer wall protein, have also been utilized for species identification. Nevertheless, even though various species of Cryptosporidium harbor multiple 18S rDNA paralogs, 18S rDNA remains the most widely used marker for Cryptosporidium species identification. Its multiple copies enhance the sensitivity of detection and enrich the database and the diversity of available gene sequences, thereby aiding the resolution of closely related species (19, 20).

Although C. parvum (zoonotic) and C. hominis (anthroponotic) differ in their modes of transmission, these species are closely related, sharing approximately 95% DNA sequence identity. They also share high sequence conservation with other human-infecting species in the C. parvum cluster, including C. meleagridis and C. cuniculus (16). This similarity can make species identification difficult, particularly when it is based solely on 18S rRNA genotyping by Sanger sequencing. Compounding this problem, targeted Sanger sequencing may entirely miss low-abundance genotypes in mixed infections (21).

Co-infections with multiple species of Cryptosporidium commonly occur, especially in highly endemic regions (22–24). Co-infections frequently occur in livestock. Mixed C. parvum infections were widely reported in Uganda (25). Subsequently, Grinberg et al. showed an unprecedented intra-host genetic diversity, using HSP70 and gp60 typing of two Cryptosporidium-positive fecal specimens collected in New Zealand. Next-generation sequencing (NGS) identified these mixtures in samples from which only one allele per locus had been identified using Sanger sequencing (26, 27).

Early Cryptosporidium genotyping relied on restriction fragment length polymorphism digestion of 18S rRNA, producing species-specific banding patterns (28–30). This time-consuming method often required additional sequencing steps to distinguish among closely related species. Other methods utilize 18S rRNA amplification in conjunction with Sanger sequencing or quantitative PCR (qPCR) assays targeted at species-specific genes or regions within the 18S rRNA gene (14, 19). These methods only poorly resolve mixed infections of closely related species.

NGS platforms generate a vast number of DNA sequence reads from any given sequencing reaction, improving the ability to identify mixtures when they occur and facilitating high-throughput parallelization of sequencing (31–33). As a result, NGS provides a means to accurately estimate the genetic diversity of Cryptosporidium species within a host. Here, we propose an amplicon sequencing approach targeting the V3/V4 region of the 18S rRNA gene. We utilize the DADA2 analysis pipeline in conjunction with a custom, curated Cryptosporidium 18S database for comprehensive species identification (34). We demonstrate this approach’s sensitivity and its ability to accurately identify a broad range of species, as well as to identify novel species and characterize the relative abundance of mixture constituents. We then demonstrate the application of this approach on veterinary and human samples, illustrating distinct patterns of transmission thereby resolved.

MATERIALS AND METHODS

Sample collection

Rabbit stool sample collection

Between November 2022 and April 2023, a total of 230 fresh fecal samples were collected from 12 rabbit farms randomly selected in the Dakahlia governorate in Northern Egypt. These small farms each housed 300–600 rabbits. Fecal samples were randomly collected from at least 25% of the rabbit cages on each farm, each sample consisting of five to seven fresh fecal pellets from a given cage housing four to six rabbits. Fresh fecal pellets were then placed in sterile disposable plastic bags labeled with the age and breed of the animals (Flander, Baladi, Rex, Chinchilla, New Zealand, and Hi-Plus), as well as the date of sample collection and location. At the time of sample collection, the animals appeared to be healthy. The rabbits ranged in age from 2 to 12 months and were divided into three age groups: ≤3–6 months old, >6–9 months old, and >9–12 months old.

Clinical stool sample collection

From May to August 2023, 330 stool samples were gathered from children in northern Egypt experiencing diarrhea. Each person contributed one sample. The samples were obtained from four governorates: Dakahlia (n = 185), Damietta (n = 24), Gharbia (n = 71), and Sharqia (n = 50), reflecting regional variation in volunteer participation. These governorates were chosen for their proximity to the Mansoura University to facilitate sample collection and storage. Before participating in the study, the parents or guardians of the children provided informed consent after being fully briefed on the study’s objectives. Once they consented, each parent or guardian completed a comprehensive questionnaire to gather epidemiological data, including gender, age, source of stool sample (hospital or clinic laboratory), contact with animals, and residency.

In a sterile tube, approximately 2 g of fresh stool from each patient was mixed with 2 mL of 2.5% potassium dichromate. The mixture was homogenized thoroughly before storage at 4°C. Subsequently, samples were shipped for further processing to the Animal Parasitic Diseases Laboratory in Maryland, USA (Permit 20230214-0695A).

Molecular detection of Cryptosporidium

Total fecal DNA was isolated from 200 mg of stool using the DNeasy Powersoil Pro Kit (Qiagen, Hilden, Germany), according to the manufacturer’s instructions. Extracted total genomic DNA was screened for Cryptosporidium by real-time qPCR targeting the 18S rRNA gene, using a modified primer and probe set from Stroup et al. (GEMS Crypto 18S; Table S1) (35). Amplifications were carried out with 2 µL of extracted DNA in a total volume of 20 µL on a QuantStudio 7 Flex. Reaction setup and cycling conditions can be found in Supplemental methods. Fluorescence data were collected in the red channel (cyanine 5) during the annealing/extension phase. The autothreshold function of the QuantStudio Real-Time PCR software (version 1.7.2) was used to calculate the threshold cycle (C_T) for each reaction. Samples were considered positive if their C_T value was less than 40. Positive samples were tested in triplicate.

Primer design for 18S Cryptosporidium species identification

Positive samples were selected for Cryptosporidium species identification by 18S amplicon sequencing. Forward and reverse primers were designed to target a 431-base variable region that spans the V3 and V4 regions (ILU Crypto 18S; Table S1). Primers were modified to make them compatible with the iTru Adapterama indexes (ILU iTru Crypto 18S; Table S1).

Cryptosporidium 18S database

To identify Cryptosporidium species, we created a custom Cryptosporidium 18S reference data set. We curated the available 18S rRNA sequences on CryptoDB (36, 37) and added other Cryptosporidium species sequences from NCBI to expand species representation. The resulting custom reference data set consisted of 110 18S rRNA sequences, representing all 44 recognized Cryptosporidium species and various genotypes and environmental samples. Where available, multiple sequences, including paralogs, were included to represent intra-species diversity (38). The database can be found at . Details on database curation can be found in the Supplemental methods.

Species identification and abundance estimation with DADA2

Paired-end Illumina reads were processed to identify Cryptosporidium species and estimate their relative abundance using DADA2 (v1.26.0) in R (v4.2.2) (34). Two rounds of taxonomy assignment were performed, beginning with genus determination by reference to the SILVA 132 NR99 SSU reference data set with a minimum bootstrap (minBoot) value of 50 (39). A minBoot of 50 is the default value and has been shown to be as accurate at genus-level identifications as a minBoot of 80 with the SILVA database (40). For species identification with our custom database, we optimized the minBoot value based on simulated sequencing from known specimens. For species of Cryptosporidium, we then performed a second round by reference to our custom 18S rRNA reference data set of known Cryptosporidium sequences, requiring a minimum bootstrap value of 55. To accommodate the possibility of a sequence representing a novel or highly divergent Cryptosporidium absent from the reference data set, we designated sequences assigned to the Cryptosporidium genus by reference to the SILVA database but lacking an exact match in our custom database as “Cryptosporidium sp.” Sequences unclassifiable at any taxonomic level were labeled as “unassigned.” Details on read processing and abundance estimation can be found in Supplemental methods.
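The two-round labeling logic described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation (which runs in DADA2 in R): the function name and its four arguments are hypothetical stand-ins for the best genus and species hits and their bootstrap support as exported from a classifier. Only the two thresholds (50 and 55) and the three label conventions come from the text, and the sketch uses insufficient bootstrap support as a simplified proxy for "no match in the custom database."

```python
from typing import Optional

GENUS_MIN_BOOT = 50    # minBoot for the SILVA genus-level round (from the text)
SPECIES_MIN_BOOT = 55  # minBoot for the custom-database species round (from the text)


def label_sequence(
    silva_genus: Optional[str],
    silva_boot: float,
    custom_species: Optional[str],
    custom_boot: float,
) -> str:
    """Combine two rounds of classifier output into one final label.

    All four arguments are hypothetical inputs: the best genus hit and
    species hit, each with its bootstrap support (0-100).
    """
    # Round 1: keep the genus call only if its bootstrap support is sufficient.
    if silva_genus is None or silva_boot < GENUS_MIN_BOOT:
        return "unassigned"
    if silva_genus != "Cryptosporidium":
        # Non-Cryptosporidium sequences keep their genus-level label.
        return silva_genus
    # Round 2: accept a species call only with sufficient support.
    if custom_species is not None and custom_boot >= SPECIES_MIN_BOOT:
        return custom_species
    # Genus is Cryptosporidium but no supported species: possible novel taxon.
    return "Cryptosporidium sp."


if __name__ == "__main__":
    print(label_sequence("Cryptosporidium", 98, "Cryptosporidium parvum", 90))
    print(label_sequence("Cryptosporidium", 98, "Cryptosporidium parvum", 40))
    print(label_sequence(None, 0, None, 0))
```

Keeping the genus and species thresholds as separate constants mirrors the separate tuning of the two rounds described in the text.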

In silico simulated Cryptosporidium infections

Representative 18S rRNA sequences from various Cryptosporidium species were selected from NCBI. A “novel” Cryptosporidium species was created by splicing two variable regions from C. andersoni (OQ001483) into a C. parvum (AF112572) backbone (Fig. S3). For mixed infections, simulated reads from each species were pooled at defined frequencies, specified in the abundance file, to generate mock communities. Illumina reads were simulated using InSilicoSeq (v 2.0.1) (41) and processed in the DADA2 pipeline. Details on representative sequences and InSilicoSeq settings can be found in Supplemental methods.

Cryptosporidium DNA sources

Cryptosporidium parvum DNA was isolated from purified oocysts (Sterling Parasitology Laboratory, University of Arizona, Tucson, AZ). DNA from C. hominis (NR-2520) and C. meleagridis (NR-2521) was obtained from BEI Resources. DNA was amplified using the Repli-G Mini Kit (Qiagen, Hilden, Germany). Details on DNA isolation and amplification can be found in Supplemental methods.

Dilution assay and mock infections

For the dilution assay, 1 µL of C. parvum DNA, ranging in concentration from 10 to 0.001 ng/µL, was spiked into 1 µL of extracted DNA from a rabbit stool sample confirmed as negative for Cryptosporidium by qPCR.

Single-species mock infections were created using 2 ng of DNA from C. parvum, C. hominis, or C. meleagridis. In addition, DNA from C. parvum, C. hominis, and C. meleagridis was combined in prescribed compositions, totaling 2 ng, to create mock mixed infections.

Illumina sequencing of 18S amplicon

18S amplicons were generated using Illumina iTru Crypto 18S primers (Table S1) (42). Amplifications were carried out with 5 µL extracted DNA and 2× Platinum II Hot-Start PCR Master Mix (Invitrogen, Massachusetts, USA) according to manufacturer’s instructions. Amplicons were then barcoded using Adapterama I iTru indexes (43). Samples were then pooled equally into a library for a final concentration of 4 nM and then prepared for sequencing using either Illumina V2 500 cycle Reagent Kit or Illumina V3 600 cycle Reagent Kit (Illumina, San Diego, CA), following manufacturer guidelines. Amplicons were sequenced using a MiSeq, and the resulting paired-end reads were analyzed using the DADA2 pipeline. Additional details can be found in Supplemental methods.

Sanger sequencing of Cryptosporidium-positive samples

Additional sample characterization was achieved by Sanger sequencing of amplicons from qPCR assays for 18S rDNA using external primers developed by Xiao et al. (Xiao Crypto 18S external; Table S1) (44). BigDye cycle sequencing in each direction employed internal 18S primers (Xiao Crypto 18S internal; Table S1) at a commercial sequencing service (Psomagen, Maryland, USA).

Hierarchical clustering

Forward and reverse reads were merged to create a consensus sequence using Geneious Prime 2020.2.4 software (Biomatters Ltd., Auckland, New Zealand). Sequences were aligned with the curated 18S Cryptosporidium database using Clustal X/W, and hierarchical clustering was performed using the neighbor-joining method in Geneious Prime (2024.0.2) with 1,000 bootstrap replicates (45). The consensus tree was generated with a midpoint root and a 50% majority rule.

RESULTS

Developing amplicon sequencing for species-level identification of Cryptosporidium species

We leveraged known variation in the 18S rRNA gene, the most widely used marker for Cryptosporidium species identification (1), to improve the detection of constituents in mixed-species infections and the estimation of their relative abundance. We sought a method that takes less time and achieves better characterization of mixed infections (22–24). To do so, we developed an amplicon sequencing approach targeting a 431-base region of the 18S rRNA gene that encompasses the V4 region. This area contains sufficient sequence diversity to distinguish recognized Cryptosporidium species while also fitting within the size constraints of Illumina sequencing (Fig. S1).

After designing sequencing primers for this amplicon, we performed NCBI Primer-BLAST analysis and discovered that the primer sequences are identical to the corresponding sites in other pathogens (Blastocystis, Toxoplasma, Neospora, Theileria, Babesia, Eimeria, Hammondia hammondi, Cystoisospora belli, Cyclospora, and Isospora). These other protozoan parasites also impose significant costs on human and animal health worldwide. Thus, the tools described here have broad potential applications (Fig. S2).

Cryptosporidium species identification workflow

Our workflow extracts DNA from stool samples and then screens for Cryptosporidium by real-time qPCR (Fig. 1A). We considered samples with a C_T value of less than 40 positive. These were subjected to the described procedures for identifying and estimating the relative frequency of each species of Cryptosporidium.

FIG 1 Cryptosporidium species identification using 18S amplicon sequencing. (A) Schematic illustration of the amplicon sequencing and analysis pipeline for species identification of Cryptosporidium from stool samples. DNA was extracted from stools and screened for Cryptosporidium by qPCR. 18S amplicons were generated from positive samples and sequenced using Illumina MiSeq. Reads were processed using the DADA2 pipeline. Taxonomy was determined using the DADA2 assignTaxonomy function in two steps: first at a genus level using the SILVA 132 reference database with a minimum bootstrap value of 50, then Cryptosporidium sequences were identified at a species level using a curated custom database and a minimum bootstrap value of 55. (B) Bar diagram of relative abundance of identified species in simulated single-species infections. Illumina amplicon reads were simulated using InSilicoSeq from selected Cryptosporidium species and a “novel” Cryptosporidium species made by splicing a variable region of C. andersoni into C. parvum. Simulated reads were processed using the DADA2 pipeline, and taxonomy was assigned using the two-step taxonomy assignment. Each color indicates a single species. (C) Relative abundance of species in simulated mixed infections. Illumina amplicon reads were simulated for mixed-species infections in defined proportions using InSilicoSeq. Reads were then processed using the DADA2 pipeline, and species were identified using the two-step taxonomy assignment. Each color represents a single species. The percentage of simulated reads contributed by each species in the mixed infection is shown in the table at the bottom of the graph. Blank space indicates no contribution of simulated reads.

DADA2 uses a naïve Bayesian classifier algorithm to assign sequence taxonomy, referencing a broad data set (34). Curating the contents of any such reference set lessens ambiguities or errors introduced by erroneous taxon assignments.

Therefore, we first used the SILVA 132 SSU NR99 database to determine the genus associated with each sequence (39). This database consists of over 690,000 sequences from all three domains of life, providing a powerful means to distinguish among highly divergent organisms. However, it lacks sufficient detail to differentiate among all species of Cryptosporidium, as only representative sequences from C. parvum are present.

To fill this gap, we created a curated Cryptosporidium 18S rRNA database from sequences available at CryptoDB (36, 37), adding selected sequences from NCBI. Where available, we included multiple representative sequences, including paralogs, to capture intra-species diversity (38). We removed sequences with ambiguous species origin based on BLAST and hierarchical clustering, resulting in a database consisting of 110 sequences representing 44 recognized Cryptosporidium species and a wide variety of genotypes and environmental samples. We then performed a second taxonomy assignment using this custom database to determine the Cryptosporidium species present.

In silico simulated Cryptosporidium infections

We tested our identification pipeline by simulating both single-species and mixed infections of Cryptosporidium in silico. Illumina reads of the 18S rRNA gene from C. cuniculus, C. hominis, C. parvum, C. meleagridis, C. muris, and C. andersoni were subjected to our DADA2 identification pipeline after being simulated using InSilicoSeq (41). When the simulated reads were mapped to both the SILVA 132 NR99 SSU reference data set and our curated Cryptosporidium 18S rRNA reference database, all reads for each sample aligned with the corresponding 18S rRNA sequences of each tested species, demonstrating accurate identification of the simulated species across all test cases (Fig. 1B). Importantly, the pipeline distinguished between closely related species, such as the most common species that infect humans (C. parvum, C. hominis, and C. cuniculus), which share >95% DNA sequence identity (Fig. 1B).

We then evaluated, in silico, our ability to estimate the composition of mixed infections employing 18S rRNA sequences from two or three human-infecting species in defined proportions (Fig. 1C). The 18S rRNA-based amplicon sequencing method correctly identified all species simulated in a sample and accurately estimated the simulated abundance in high-abundance mixtures (>10%) at 30,000 reads (Fig. 1C). Accurate resolution of minor constituents required greater sequencing depth (Fig. S4). Constituents making up less than 0.3% of a mixed infection were not detectable in this study, owing to the sequence filtering thresholds. To assess whether the method could identify a new species of Cryptosporidium, we simulated a “novel” Cryptosporidium species by inserting variable regions of C. andersoni into a C. parvum backbone (Fig. S3). Our two-step identification method successfully classified the sequence as belonging to an unnamed species in the genus Cryptosporidium (Fig. 1B).
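A short calculation helps show why minor constituents demand deeper sequencing. The sketch below computes the expected number of reads each species contributes at a given depth, assuming reads are drawn in proportion to the mixture composition. The function name and the example mixture are hypothetical; only the 30,000-read depth and the 0.3% figure are taken from the text. At 30,000 reads, a 0.3% constituent is expected to contribute only about 90 reads before any quality or abundance filtering is applied.

```python
from typing import Dict


def expected_reads(depth: int, proportions: Dict[str, float]) -> Dict[str, float]:
    """Expected read count per species at a given sequencing depth.

    Proportions are normalized, so they need not sum exactly to 1.
    """
    total = sum(proportions.values())
    if total <= 0:
        raise ValueError("proportions must sum to a positive value")
    return {species: depth * p / total for species, p in proportions.items()}


if __name__ == "__main__":
    # Hypothetical three-species mixture with one very minor constituent.
    mixture = {"C. parvum": 0.897, "C. hominis": 0.100, "C. meleagridis": 0.003}
    for depth in (30_000, 100_000):
        print(f"depth = {depth}")
        for species, n in expected_reads(depth, mixture).items():
            print(f"  {species}: ~{n:.0f} reads")
```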

In vitro mock infections

To further assess our 18S rRNA-based detection system, we created in vitro mock infections. These included single-species “infections” (adding 2 ng C. parvum, C. hominis, or C. meleagridis gDNA to rabbit stool). All ensuing reads were accurately assigned to the known species (Fig. 2A). When rabbit stool was seeded with DNA mixtures, the process identified only those constituents as present (Fig. 2C), in proportions that resembled (but did not perfectly reproduce) their target proportions. We discuss an apparent bias toward C. parvum below (“In vitro mock mixed infection”).

FIG 2 Estimation of sensitivity and specificity of 18S-based amplicon sequencing and the calculation of relative abundance of Cryptosporidium species identified in mock infections. (A) The specificity of Cryptosporidium detection using three different species. The specificity of the amplicon sequencing to identify different Cryptosporidium species is depicted in the relative abundance bar plots after assigning short reads of the 18S gene of C. parvum, C. hominis, and C. meleagridis in mock infections. (B) Sensitivity of Cryptosporidium detection in a complex stool background. Varying amounts of C. parvum DNA were diluted with DNA extracted from rabbit stools that was negative for Cryptosporidium. Samples were then amplified for 18S using iTru Crypto 18S primers and sequenced, and species identifications were performed using our DADA2 pipeline. The Y-axis indicates the relative abundance, whereas the X-axis indicates the amount of C. parvum gDNA (ng) pooled into rabbit gDNA. CT value represents the qPCR CT value corresponding to each dilution. (C) Precise detection of mixed infections of three closely related Cryptosporidium species in mock experiments. A total of 2 ng of C. parvum, C. hominis, and C. meleagridis DNA was combined in prescribed compositions to create mock infections. Samples were then amplified for 18S using iTru Crypto 18S primers and sequenced, and species identifications were performed using our DADA2 pipeline. The Y-axis indicates the relative abundance of short reads sequenced from each species. The X-axis indicates the prescribed composition used to create each mock infection. The percentage contributed by each species to each mock mixed infection is shown at the bottom of the figure.

In vitro sensitivity assessment of Cryptosporidium detection

To test the sensitivity of our approach, we spiked between 10 and 0.001 ng of C. parvum DNA into DNA from a complex stool background (Fig. 2B). All or nearly all such reads attributed to Cryptosporidium were correctly assigned to C. parvum, providing one measure of process validity. From stool samples containing no natural or added Cryptosporidium, no 18S reads were falsely ascribed to the genus (providing further assurance of method specificity).

Using qPCR standardized in the Global Enteric Multicenter Study (5, 6), we found that a C_T value of 28.1 (corresponding to 0.001 ng of C. parvum gDNA) was optimal for detecting constituents when present at a relative abundance of 25%. Interestingly, the proportion of non-Cryptosporidium reads decreased significantly as the amount of input C. parvum gDNA increased.
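Relative abundance, as used throughout these results, is the fraction of a sample's 18S reads assigned to a given label. A minimal sketch of that calculation follows. The function name and the example read counts are hypothetical and are not data from this study; the label names follow the conventions used in the text.

```python
from typing import Dict


def relative_abundance(read_counts: Dict[str, int]) -> Dict[str, float]:
    """Convert per-label read counts for one sample into fractions of the total."""
    total = sum(read_counts.values())
    if total == 0:
        # A sample with no reads has no defined composition; report zeros.
        return {label: 0.0 for label in read_counts}
    return {label: count / total for label, count in read_counts.items()}


if __name__ == "__main__":
    # Hypothetical sample: counts are illustrative only.
    sample = {"C. parvum": 2500, "non-Cryptosporidium": 7400, "unassigned": 100}
    for label, fraction in relative_abundance(sample).items():
        print(f"{label}: {fraction:.1%}")
```

Because the denominator includes non-Cryptosporidium and unassigned reads, the Cryptosporidium fraction falls when background templates dominate the amplification, which is consistent with the trend reported across the dilution series.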

In vitro mock mixed infection

To determine whether the method can effectively differentiate mixed infections and calculate the relative abundance of each species in a sample, we created artificial mixtures by combining gDNA from C. parvum, C. hominis, and C. meleagridis in defined amounts totaling 2 ng. The estimated composition of the mixed infections revealed an overrepresentation of C. parvum in all mixtures (Fig. 2C). This was particularly noticeable in mixed infections with C. meleagridis, where the method overestimated C. parvum abundance by up to twofold relative to the intended input concentration (Fig. 2C). We therefore asked whether such discrepancies stemmed from preferential amplification of C. parvum or from error in estimating the gDNA concentrations added to the mixtures. To investigate, we assessed the C_T values from qPCR using 2 ng of total gDNA from each of C. parvum, C. hominis, and C. meleagridis. The qPCR results confirmed that we had added more C. parvum to the mixtures than intended: the C_T values for C. parvum, C. hominis, and C. meleagridis were 14.36, 15.24, and 15.56, respectively. This suggests that C. parvum enjoyed a roughly twofold concentration advantage over C. meleagridis. Although this does not formally exclude preferential amplification of C. parvum, it does suggest that read abundance is an indicator of true mixture composition.
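The link between a C_T difference and a template ratio can be made explicit. Assuming perfect amplification efficiency (template doubles every cycle, an idealization not stated in the text), a difference of ΔC_T cycles corresponds to a 2^ΔC_T ratio in starting template. The sketch below applies this to the three reported C_T values; the function name and the `efficiency` parameter are illustrative. Under that assumption, the 1.20-cycle gap between C. parvum and C. meleagridis corresponds to about a 2.3-fold difference, in line with the roughly twofold advantage described.

```python
def fold_difference(ct_reference: float, ct_other: float, efficiency: float = 2.0) -> float:
    """Starting-template ratio (reference / other) implied by two C_T values.

    Assumes the same amplification efficiency for both templates;
    efficiency = 2.0 means perfect doubling each cycle (an idealization).
    """
    return efficiency ** (ct_other - ct_reference)


if __name__ == "__main__":
    # C_T values reported in the text for 2 ng of gDNA from each species.
    ct = {"C. parvum": 14.36, "C. hominis": 15.24, "C. meleagridis": 15.56}
    for species in ("C. hominis", "C. meleagridis"):
        ratio = fold_difference(ct["C. parvum"], ct[species])
        print(f"C. parvum vs {species}: {ratio:.2f}-fold more template")
```

If the true efficiency were below 2.0, the implied ratios would be somewhat smaller, so these values are best read as upper-end estimates.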

Prevalence of Cryptosporidium in Egyptian rabbits

We tested our entire pipeline using rabbit stool samples from the Dakahlia governorate in Northern Egypt. Previous studies estimated that 11.9% of this region’s rabbits harbor infections with C. cuniculus (46). We collected 224 rabbit stool samples from 12 different farms; animal age ranged from 2 to 12 months. Total fecal DNA was extracted from the stool samples and screened for Cryptosporidium using quantitative PCR. Of these, 17 samples tested positive by qPCR, exhibiting high C_T values (>35) suggesting low parasite load. Repeat testing did not yield consistent results, likely owing to low target abundance (Table S2). Of these 17 presumptively positive samples, our tool identified Cryptosporidium in 8 samples. Most of these were ascribed to C. parvum (Fig. 3B). In four of these samples, Cryptosporidium reads accounted for fewer than 2% of all 18S reads, likely because these samples contained very few Cryptosporidium oocysts (C_T values ranged from 38.61 to 41.30) and because the 18S primers also amplified other organisms, as noted above. A minor presence of C. hominis, C. meleagridis, and an undetermined Cryptosporidium sp. was detected in six samples. Of these, four showed evidence of mixed infection (R034, R154, R186, and R195). Additionally, reads from unspecified species were detected in five of these rabbit fecal samples (black bars, Fig. 3B).

FIG 3 Detection and identification of Cryptosporidium species present in rabbit and clinical samples. (A) Map of Egypt with the sample origin governorates colored by the number of clinical samples from each governorate. (B) Relative abundance of Cryptosporidium species identified in rabbit and human clinical samples from Egypt. Samples were processed according to the pipeline shown in Fig. 1A. Briefly, stool samples were amplified for 18S using iTru Crypto 18S primers and sequenced using Illumina MiSeq. Species identifications were performed utilizing our DADA2 pipeline with the SILVA 132 reference database, followed by the curated custom Cryptosporidium database as described in Fig. 1. Each color indicates one species, except black, representing unassigned, and gray, representing non-Cryptosporidium species. Each bar represents the relative distribution of Cryptosporidium species in one sample. (C) Confirmation of 18S amplicon sequencing-based species identification with conventional genotyping methodology using Sanger sequencing of an amplicon amplified and sequenced by Crypto 18S primer sets. A neighbor-joining phylogenetic tree with 1,000 bootstrap replicates was constructed using Geneious Prime (version 2024.0.2; Biomatters Ltd., Auckland, New Zealand) after aligning the sequences using Clustal X/W, and was formatted using iTOL. Cryptosporidium species names followed by NCBI accession numbers are provided in the phylogenetic tree. Sequenced samples from the current study are depicted in red. The circle size at each node indicates the bootstrap value.

To confirm species identifications and to compare the sensitivity of the Illumina-based detection system, we used the well-established Cryptosporidium species identification primers developed by Xiao et al. (44) to amplify an ~1,300-base pair fragment of the 18S rRNA gene for Sanger sequencing. We successfully amplified fragments from only four samples (R034, R122, R124, and R186). Sanger sequences of these four samples clustered with C. parvum in an unrooted phylogenetic tree (Fig. 3C; Fig. S5), reinforcing the identifications made through Illumina sequencing. Notably, none of the samples exhibited the ambiguous bases expected when Sanger sequencing is applied directly, without cloning, to amplifications derived from mixed templates. However, the methods introduced here identified two samples (R034 and R124) with faint signatures of mixed infections involving C. meleagridis, C. hominis, and unidentified Cryptosporidium species. Thus, our new methods appear more sensitive for identifying mixed infections of Cryptosporidium.

Prevalence of Cryptosporidium in Egyptian children

To understand whether our Illumina-based detection system can accurately identify the species of Cryptosporidium present in children, determine the role of zoonotic transmission, and accurately calculate the rate of mixed infections, we collected 330 clinical samples between May and August 2023 from four Egyptian governorates located in the Nile Delta: Dakahlia, Damietta, Gharbia, and Sharqia (Fig. 3A). The samples were taken from children aged between 6 months and 10 years.

From 330 samples derived from Egyptian children, our NGS method identified 22 (6.6%) positive samples for one or more species of Cryptosporidium. Of these, 14 came from 221 hospitalized patients and 8 from 109 persons visiting clinics (Table S3). We did not find any significant correlation of infections with the distribution of age, sex, or urban vs rural sampling regions (Table S3). The oocyst burden in the clinical samples varied extensively, indicated by a wide range of C_T values in qPCR from 16 to 38 cycles.
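The prevalence arithmetic above can be reproduced with a short sketch. The counts are taken from the text; the Wilson score interval is an illustrative addition that is not reported in the study, and the function and variable names are hypothetical.

```python
from math import sqrt


def wilson_interval(positives, total, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 is roughly 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = positives / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half


# (positives, samples) per group, as reported in the text above.
groups = {"hospitalized": (14, 221), "clinic": (8, 109)}
overall_pos = sum(pos for pos, _ in groups.values())
overall_n = sum(n for _, n in groups.values())


def summarize():
    """Return {group: (prevalence, (ci_low, ci_high))} for each group and overall."""
    rows = {"overall": (overall_pos, overall_n), **groups}
    return {name: (pos / n, wilson_interval(pos, n)) for name, (pos, n) in rows.items()}


if __name__ == "__main__":
    for name, (prev, (low, high)) in summarize().items():
        print(f"{name}: {prev:.1%} (95% CI {low:.1%} to {high:.1%})")
```

With so few positives in each group, the two intervals are likely to overlap substantially, which would be in line with the absence of a significant association reported above.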

To identify the Cryptosporidium species present in these clinical samples, we used two approaches: (i) conventional species detection utilizing Sanger sequencing with commonly used PCR primers and (ii) species-level detection using 18S rRNA-based amplicon sequencing. For the conventional detection via Sanger sequencing, we attempted to amplify the 18S rRNA gene using all 22 qPCR-positive samples (Table S3; Fig. 3B). Only 6 of these 22 samples yielded a positive amplicon for Sanger sequencing. This sequencing identified C. hominis in all such positive samples (Table S3; Fig. 3B). Five of these samples (and no other sample) tested positive using our new amplicon sequencing method. Amplicon-based Sanger sequencing confirmed C. hominis in all these samples, with no indications of mixed infections (Table S3; Fig. 3B and C; Fig. S5). These results reaffirm C. hominis as a major causative agent of cryptosporidiosis in Egyptian children.

DISCUSSION

Although recent developments in molecular detection and genotyping methodology identified many cryptic species of Cryptosporidium, existing genotyping tools fail to accurately assess the relative abundance of the constituents of mixed infections. Therefore, we developed a sensitive and specific 18S amplicon-based short-read sequencing pipeline, followed by a two-step species assignment using the SILVA SSU database and a curated Cryptosporidium 18S rRNA gene database. We determined this process to be specific, sensitive, and accurate in estimating the contributions of even closely related species of Cryptosporidium comprising mixed infections. Furthermore, our amplicon-based detection system proved capable of identifying novel species of Cryptosporidium.

We then applied this procedure to real veterinary and human fecal samples, estimating prevalence rates and species compositions in a sample of rabbits and a sample of children. In the rabbit samples, we identified C. parvum and an undescribed species of Cryptosporidium. Four of eight rabbit samples appeared to comprise mixed-species infections. We detected Cryptosporidium in 6.6% of the sampled Egyptian children. Most Cryptosporidium-specific reads derived from these children corresponded to C. hominis, consistent with prior estimates derived from existing genotyping methods. The amplicon-based system afforded greater means to identify mixed infections and unclassifiable sequences, which deserve further study as possibly undescribed parasite species. Thus, amplicon sequencing enables new insights concerning the dynamics of cryptosporidiosis, which should enhance disease control strategies employing targeted interventions.

The molecular detection of different Cryptosporidium species is still in its infancy, and a cost-effective diagnostic tool for Cryptosporidium that offers improved sensitivity and specificity and detects mixed infections is urgently needed. By enhancing sensitivity and reducing assay time, quantitative PCR has become an integral tool for the screening of environmental pathogens (47, 48). Recently, notable progress has been made in the validation of fluorescence in situ hybridization probes for the identification of the human-infectious Cryptosporidium species, specifically C. parvum and C. hominis (49). Probes Cpar677 (targeting C. parvum) and Chrom253 (targeting C. hominis) were validated by Alagappan et al. (50). However, the specificity and cross-reactivity of these probes were assessed against only eight Cryptosporidium species, including C. andersoni, C. muris, C. meleagridis, and C. felis. Furthermore, through careful primer design and melting curve analysis, it has now been established that pathogenic Cryptosporidium species can be differentiated from non-pathogenic ones (47, 48). An important illustration of this capability was provided by Li et al. (51), who employed fluorescence resonance energy transfer probes and melting curve analysis in the 18S-LC1 and hsp90 assays to distinguish common human-pathogenic species, such as C. parvum, C. hominis, and C. meleagridis. The 18S-LC2 assay was also effective in differentiating non-pathogenic species, such as C. andersoni, from pathogenic species frequently detected in source water (51). To overcome the remaining challenges and to identify not only the known species but also the relative abundance of species in mixed infections and the presence of new species, we developed a highly sensitive (0.001 ng target DNA, C_T = 28) 18S-based amplicon sequencing method for Cryptosporidium.

Over the past decades, the emergence of readily accessible high-throughput sequencing of amplicons derived from the 18S rRNA gene has fundamentally transformed the fields of clinical and public health microbiology (52–54). The advancement of Cryptosporidium detection using amplicon sequencing has not only accelerated the process of pathogen identification but has also enhanced the accuracy of these identifications. Moreover, the integration of high-throughput methodologies with sophisticated bioinformatics tools will allow researchers to gain deeper insights into various critical aspects of infectious diseases, including how they are transmitted and the role of mixed infection in disease susceptibility. This approach is particularly vital for evaluating the diversity of Cryptosporidium in different ecosystems.

Previously, metabarcoding assays enabled the successful detection of protozoan parasites, including Cryptosporidium (54–56). However, none of those methodologies provided the tools to identify all known species, particularly closely related and novel species of Cryptosporidium. Taxonomy can only be assigned if a representative sequence is present in the reference data set. A Cryptosporidium sequence could therefore originate from a novel, highly divergent, or unrepresented species that is not found in the reference data set and thus would not be assigned a species. To counteract this, we leveraged the genus assignment from the SILVA database so that any sequences identified as originating from the Cryptosporidium genus are considered unspecified Cryptosporidium sp. Our two-step detection system using the SILVA database and a curated Cryptosporidium 18S rRNA gene database provides the tools to detect all known Cryptosporidium species in addition to unknown novel species.
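The fallback logic of the two-step assignment can be sketched as follows. This is a minimal illustration of the decision rule described above, not the pipeline's code: it assumes the genus call (from the general SSU reference) and the species call (from the curated Cryptosporidium database) have already been made for each amplicon sequence variant, and the function name and labels are hypothetical.

```python
from typing import Optional


def assign_taxon(silva_genus: Optional[str], custom_species: Optional[str]) -> str:
    """Combine a genus call (step 1) with a species call (step 2).

    silva_genus: genus from the general SSU reference, or None if unassigned.
    custom_species: species from the curated Cryptosporidium database, or
        None if no species-level match was found.
    """
    if silva_genus is None:
        return "unassigned"
    if silva_genus != "Cryptosporidium":
        return "non-Cryptosporidium"
    if custom_species is None:
        # Genus-level hit without a species match: candidate novel,
        # divergent, or unrepresented species.
        return "Cryptosporidium sp."
    return custom_species


if __name__ == "__main__":
    calls = [
        ("Cryptosporidium", "Cryptosporidium hominis"),
        ("Cryptosporidium", None),
        ("Giardia", None),
        (None, None),
    ]
    for genus, species in calls:
        print(genus, species, "->", assign_taxon(genus, species))
```

Making the genus call first means that a sequence absent from the species-level database is still retained as Cryptosporidium sp. rather than being discarded as unassigned.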

The phenomenon of co-infections, where multiple species of eukaryotic pathogens simultaneously infect a single host, can have profound implications not only for the parasites themselves but also for their hosts. These interactions can manifest in various ways, including synergistic effects that enhance the virulence or spread of infections or antagonistic effects that may hinder the establishment and impact of one or both parasites (57). Thus, it is critical to determine the frequency of mixed infections and the relative abundance of the mixed infected species present in animal and clinical samples. It has been documented that mixed infection of Cryptosporidium at the species level is common in animal hosts. Few reports are available on clinical samples, which could be due to multiple factors, including host immune response and low resolution of the current Sanger sequencing-based detection systems.

Our amplicon sequencing method correctly assigned and determined the relative frequency of each species, including species closely and distantly related to C. parvum, in both in silico and mock experiments. A notable study that employed amplicon-based next-generation sequencing to identify various Cryptosporidium species found that 30% of the infected animals studied had mixed infections. Detection of mixed infections was made possible through NGS because Illumina technology sequences individual strands in a massively parallel fashion (58).
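As a rough illustration of how per-species relative frequencies and a mixed-infection call might be derived from assigned read counts, consider the sketch below. The read counts are hypothetical, and the 1% minimum fraction is an assumed cutoff chosen only for the example; neither is taken from the study.

```python
def relative_abundance(read_counts):
    """Convert per-species read counts into fractions of the sample total."""
    total = sum(read_counts.values())
    if total == 0:
        return {}
    return {species: n / total for species, n in read_counts.items()}


def is_mixed(read_counts, min_fraction=0.01):
    """Flag a sample as mixed if two or more species reach min_fraction.

    min_fraction is an assumed cutoff for this example, not a value
    reported by the authors.
    """
    fractions = relative_abundance(read_counts)
    present = [sp for sp, frac in fractions.items() if frac >= min_fraction]
    return len(present) > 1


if __name__ == "__main__":
    # Hypothetical read counts for a single sample.
    sample = {"C. parvum": 9500, "C. meleagridis": 300, "C. hominis": 200}
    for species, frac in relative_abundance(sample).items():
        print(f"{species}: {frac:.1%}")
    print("mixed infection:", is_mixed(sample))
```

Any such cutoff trades sensitivity to minor variants against the risk of counting index hopping or low-level contamination as a co-infecting species, so it would need to be validated against mock communities.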

Cryptosporidium species are widely distributed among humans and animals, exhibiting both anthroponotic and zoonotic transmission cycles in Egypt (59). Several species infecting humans and animals have been reported in the country, including C. parvum, C. hominis, C. meleagridis, C. ryanae, C. andersoni, C. xiaoi, and C. bovis. Among these, C. parvum is the most prevalent, followed by C. hominis in human infections in individuals under 10 years old (59). To gain a deeper understanding of the epidemiology, including transmission dynamics and differences in clinical presentations, as well as the frequency of mixed infections, it is essential to conduct more extensive molecular typing studies in Egypt. In this context, we carried out a surveillance study involving clinical and rabbit samples. Our findings revealed a prevalence rate of 7.6% in rabbit samples and 6.6% in clinical samples, which aligns with previously published prevalence rates of Cryptosporidium in animals and clinical cases (46, 60–62).

The comparative analysis of sensitivity between qPCR and amplicon sequencing revealed that samples with a C_T value above 31 are generally negative in amplicon sequencing, limiting the number of samples entered into the amplicon-based species detection pipeline. This finding confirms that the real-time PCR assay has higher sensitivity compared to the metabarcoding assay. It is crucial to note that the accuracy of quantitative detection via qPCR is contingent on the reliability of the standard curve, which can be influenced by pipetting errors and potential DNA losses. Amplicon sequencing showed that the clinical samples were primarily infected with C. hominis, whereas C. parvum was the major species detected in rabbit samples. This finding was also supported by the conventional method of species detection with the Sanger sequencing of the 18S gene. Notably, amplicon-based sequencing identified trace amounts of short reads from other Cryptosporidium species in several rabbit samples alongside C. parvum reads, indicating the occurrence of mixed infections in Egyptian rabbits.

Taxonomic profiling of complex microbial communities using 16S rRNA marker gene surveys has garnered significant interest, providing valuable insights into the bacterial composition of these communities and their links to health and disease (63, 64). To deepen our understanding of species diversity and the relative abundance of mixed infections in both animal and clinical samples, we developed a highly specific and sensitive stepwise protocol that employs amplicon sequencing of the hypervariable regions of the 18S rRNA gene. While this protocol allows us to determine species composition and the relative abundance of parasite burden in clinical samples, we will need high-resolution and sensitive genotyping tools, such as capture enrichment (65, 66) or single-cell sequencing (67), for a more comprehensive investigation. These advanced methods could prove instrumental in unraveling the genetic complexity of this parasite, thereby enhancing the diagnosis and understanding of the genetic basis of cryptosporidiosis. The insights obtained from these findings can be applied to refine parasite detection methodologies, ultimately strengthening efforts to control and prevent intestinal parasitic infections.

ACKNOWLEDGMENTS

We acknowledge CryptoDB (37, 38) for providing a publicly available repository for Cryptosporidium genomic data. CryptoDB is part of the Eukaryotic Pathogen, Vector, and Host Informatics Resources (VEuPathDB). VEuPathDB receives funding from the Wellcome Trust (UK) to support informatics efforts focusing on kinetoplastida and fungal organisms with special emphasis on improving functional annotation of genomes. Additional computing resources were provided by the SciNet HPC Consortium.

This research was supported in part by an appointment of Randi Turner and Doaa Naguib to the Agricultural Research Service (ARS) Research Participation Program administered by the Oak Ridge Institute for Science and Education (ORISE) through an interagency agreement between the U.S. Department of Energy (DOE) and the U.S. Department of Agriculture (USDA). ORISE is managed by Oak Ridge Associated Universities (ORAU) under DOE contract number DE-SC 0014664. This work was financially supported by USDA CRIS Project 8042-32420-007-00D to R.T., D.N., M.V., B.M.R., and A.K., National Institute for Allergy and Infectious Diseases, NIAID, R01 AI148667 to R.T., T.C.G., and J.C.K.

Conceptualization: A.K. and R.T.; Methodology: A.K., R.T., D.N., E.P., A.L., and M.V.; Software: A.K. and R.T.; Analysis: A.K., R.T., and D.N.; Resources: A.K., D.N., T.C.G., B.M.R., and J.C.K.; Data curation: R.T. and D.N.; Original draft preparation: A.K. and R.T.; Review and editing: A.K., R.T., B.M.R., and J.C.K.; Supervision: A.K.; Project administration: A.K.; Funding acquisition: A.K., T.C.G., B.M.R., and J.C.K. All authors have read and agreed to the published version of the manuscript.

References

Kotloff KL, Nataro JP, Blackwelder WC, Nasrin D, Farag TH, Panchalingam S, Wu Y, Sow SO, Sur D, Breiman RF, et al. 2013. Burden and aetiology of diarrhoeal disease in infants and young children in developing countries (the Global Enteric Multicenter Study, GEMS): a prospective, case-control study. Lancet 382:209–222.

Shirley D-A, Moonah SN, Kotloff KL. 2012. Burden of disease from cryptosporidiosis. Curr Opin Infect Dis 25:555–563.

Samie A, Hlungwani AH, Mbati PA. 2017. Prevalence and risk factors of Cryptosporidium species among domestic animals in rural communities in Northern South Africa. Trop Biomed 34:636–647.

Samie A, Bessong PO, Obi CL, Sevilleja JEAD, Stroup S, Houpt E, Guerrant RL. 2006. Cryptosporidium species: preliminary descriptions of the prevalence and genotype distribution among school children and hospital patients in the Venda region, Limpopo Province, South Africa. Exp Parasitol 114:314–322.

Sow SO, Muhsen K, Nasrin D, Blackwelder WC, Wu Y, Farag TH, Panchalingam S, Sur D, Zaidi AKM, Faruque ASG, et al. 2016. The burden of Cryptosporidium diarrheal disease among children < 24 months of age in moderate/high mortality regions of Sub-Saharan Africa and South Asia, utilizing data from the global enteric multicenter study (GEMS). PLoS Negl Trop Dis 10:e0004729.

Troeger C, Forouzanfar M, Rao PC, Khalil I, Brown A, Reiner RC Jr, Fullman N, Thompson RL, Abajobir A, Ahmed M, et al. 2017. Estimates of global, regional, and national morbidity, mortality, and aetiologies of diarrhoeal diseases: a systematic analysis for the global burden of disease study 2015. Lancet Infect Dis 17:909–948.

Dong S, Yang Y, Wang Y, Yang D, Yang Y, Shi Y, Li C, Li L, Chen Y, Jiang Q, Zhou Y. 2020. Prevalence of Cryptosporidium infection in the global population: a systematic review and meta-analysis. Acta Parasitol 65:882–889.

Wang W, Wan M, Yang F, Li N, Xiao L, Feng Y, Guo Y. 2021. Development and Application of a gp60-Based Subtyping Tool for Cryptosporidium bovis. Microorganisms 9:2067.

Putignani L, Menichella D. 2010. Global distribution, public health and clinical impact of the protozoan pathogen Cryptosporidium. Interdiscip Perspect Infect Dis 2010:753512.

Risby H, Robinson G, Chandra N, King G, Vivancos R, Smith R, Thomas D, Fox A, McCarthy N, Chalmers RM. 2023. Application of a new multi-locus variable number tandem repeat analysis (MLVA) scheme for the seasonal investigation of Cryptosporidium parvum cases in Wales and the northwest of England, spring 2022. Curr Res Parasitol Vector-Borne Dis 4:100151.

Cacciò SM, Chalmers RM. 2016. Human cryptosporidiosis in Europe. Clin Microbiol Infect 22:471–480.

Hlavsa MC, Cikesh BL, Roberts VA, Kahler AM, Vigar M, Hilborn ED, Wade TJ, Roellig DM, Murphy JL, Xiao L, Yates KM, Kunz JM, Arduino MJ, Reddy SC, Fullerton KE, Cooley LA, Beach MJ, Hill VR, Yoder JS. 2018. Outbreaks associated with treated recreational water — United States, 2000–2014. Morbid Mortal Wkly Rep 67:547–551.

European Centre for Disease Prevention and Control. 2019. Cryptosporidiosis—annual epidemiological report for 2017. ECDC, Stockholm.

O’Leary JK, Sleator RD, Lucey B. 2021. Cryptosporidium spp. diagnosis and research in the 21st Century. Food and Waterborne Parasitology 24:e00131.

Ryan U, Zahedi A, Feng Y, Xiao L. 2021. An update on zoonotic Cryptosporidium species and genotypes in humans. Animals (Basel) 11:3307.

Ryan UM, Feng Y, Fayer R, Xiao L. 2021. Taxonomy and molecular epidemiology of Cryptosporidium and Giardia - a 50 year perspective (1971-2021). Int J Parasitol 51:1099–1119.

Xiao L, Feng Y. 2017. Molecular epidemiologic tools for waterborne pathogens Cryptosporidium spp. and Giardia duodenalis. Food and Waterborne Parasitology 8–9:14–32.

Roellig DM, Xiao L. 2020. Cryptosporidium genotyping for epidemiology tracking. Methods Mol Biol 2052:103–116.

Elwin K, Robinson G, Pérez-Cordón G, Chalmers RM. 2022. Development and evaluation of a real-time PCR for genotyping of Cryptosporidium spp. from water monitoring slides. Exp Parasitol 242:108366.

Hassan EM, Örmeci B, DeRosa MC, Dixon BR, Sattar SA, Iqbal A. 2021. A review of Cryptosporidium spp. and their detection in water. Water Sci Technol 83:1–25.

Barbosa AD, Gofton AW, Paparini A, Codello A, Greay T, Gillett A, Warren K, Irwin P, Ryan U. 2017. Increased genetic diversity and prevalence of co-infection with Trypanosoma spp. in koalas (Phascolarctos cinereus) and their ticks identified using next-generation sequencing (NGS). PLoS One 12:e0181279.

Baptista RP, Cooper GW, Kissinger JC. 2021. Challenges for Cryptosporidium population studies. Genes (Basel) 12:894.

Dettwiler I, Troell K, Robinson G, Chalmers RM, Basso W, Rentería-Solís ZM, Daugschies A, Mühlethaler K, Dale MI, Basapathi Raghavendra J, Ruf M-T, Poppert S, Meylan M, Olias P. 2022. TIDE analysis of Cryptosporidium infections by gp60 typing reveals obscured mixed infections. J Infect Dis 225:686–695.

Cama V, et al. 2006. Mixed Cryptosporidium infections and HIV. Emerging Infect Dis 12:1025–1028.

Tanriverdi S, Grinberg A, Chalmers RM, Hunter PR, Petrovic Z, Akiyoshi DE, London E, Zhang L, Tzipori S, Tumwine JK, Widmer G. 2008. Inferences about the global population structures of Cryptosporidium parvum and Cryptosporidium hominis. Appl Environ Microbiol 74:7227–7234.

Grinberg A, Biggs PJ, Dukkipati VSR, George TT. 2013. Extensive intra-host genetic diversity uncovered in Cryptosporidium parvum using next generation sequencing. Infect Genet Evol 15:18–24.

Zahedi A, Gofton AW, Jian F, Paparini A, Oskam C, Ball A, Robertson I, Ryan U. 2017. Next generation sequencing uncovers within-host differences in the genetic diversity of Cryptosporidium gp60 subtypes. Int J Parasitol 47:601–607.

Awad-el-Kariem FM, Warhurst DC, McDonald V. 1994. Detection and species identification of Cryptosporidium oocysts using a system based on PCR and endonuclease restriction. Parasitology 109 (Pt 1):19–22.

Leng X, Mosier DA, Oberst RD. 1996. Differentiation of Cryptosporidium parvum, C. muris, and C. baileyi by PCR-RFLP analysis of the 18S rRNA gene. Vet Parasitol 62:1–7.

Xiao L, Escalante L, Yang C, Sulaiman I, Escalante AA, Montali RJ, Fayer R, Lal AA. 1999. Phylogenetic analysis of Cryptosporidium parasites based on the small-subunit rRNA gene locus. Appl Environ Microbiol 65:1578–1583.

Harbuzov Z, Farberova V, Tom M, Pallavicini A, Stanković D, Lotan T, Lubinevsky H. 2022. Amplicon sequence variant-based meiofaunal community composition revealed by DADA2 tool is compatible with species composition. Mar Genomics 65:100980.

Nasereddin A, Ereqat S, Al-Jawabreh A, Taradeh M, Abbasi I, Al-Jawabreh H, Sawalha S, Abdeen Z. 2022. Concurrent molecular characterization of sand flies and Leishmania parasites by amplicon-based next-generation sequencing. Parasit Vectors 15:262.

Yang R, Palermo C, Chen L, Edwards A, Paparini A, Tong K, Gibson-Kueh S, Lymbery A, Ryan U. 2015. Genetic diversity of Cryptosporidium in fish at the 18S and actin loci and high levels of mixed infections. Vet Parasitol 214:255–263.

Callahan BJ, McMurdie PJ, Rosen MJ, Han AW, Johnson AJA, Holmes SP. 2016. DADA2: High-resolution sample inference from Illumina amplicon data. Nat Methods 13:581–583.

Stroup SE, Roy S, Mchele J, Maro V, Ntabaguzi S, Siddique A, Kang G, Guerrant RL, Kirkpatrick BD, Fayer R, Herbein J, Ward H, Haque R, Houpt ER. 2006. Real-time PCR detection and speciation of Cryptosporidium infection using scorpion probes. J Med Microbiol 55:1217–1222.

Puiu D, Enomoto S, Buck GA, Abrahamsen MS, Kissinger JC. 2004. CryptoDB: the Cryptosporidium genome resource. Nucleic Acids Res 32:D329–31.

Heiges M, Wang H, Robinson E, Aurrecoechea C, Gao X, Kaluskar N, Rhodes P, Wang S, He C-Z, Su Y, Miller J, Kraemer E, Kissinger JC. 2006. CryptoDB: a Cryptosporidium bioinformatics resource update. Nucleic Acids Res 34:D419–22.

Le Blancq SM, Khramtsov NV, Zamani F, Upton SJ, Wu TW. 1997. Ribosomal RNA gene organization in Cryptosporidium parvum. Mol Biochem Parasitol 90:463–478.

Yilmaz P, Parfrey LW, Yarza P, Gerken J, Pruesse E, Quast C, Schweer T, Peplies J, Ludwig W, Glöckner FO. 2014. The SILVA and “all-species living tree project (LTP)” taxonomic frameworks. Nucleic Acids Res 42:D643–8.

Smith PE, Waters SM, Gómez Expósito R, Smidt H, Carberry CA, McCabe MS. 2020. Synthetic sequencing standards: a guide to database choice for rumen microbiota amplicon sequencing analysis. Front Microbiol 11:606825.

Gourlé H, Karlsson-Lindsjö O, Hayer J, Bongcam-Rudloff E. 2019. Simulating Illumina metagenomic data with InSilicoSeq. Bioinformatics 35:521–522.

Glenn TC, Pierson TW, Bayona-Vásquez NJ, Kieran TJ, Hoffberg SL, Thomas Iv JC, Lefever DE, Finger JW, Gao B, Bian X, et al. 2019. Adapterama II: universal amplicon sequencing on Illumina platforms (TaggiMatrix). PeerJ 7:e7786.

Glenn TC, Nilsen RA, Kieran TJ, Sanders JG, Bayona-Vásquez NJ, Finger JW, Pierson TW, Bentley KE, Hoffberg SL, Louha S, Garcia-De Leon FJ, Del Rio Portilla MA, Reed KD, Anderson JL, Meece JK, Aggrey SE, Rekaya R, Alabady M, Belanger M, Winker K, Faircloth BC. 2019. Adapterama I: universal stubs and primers for 384 unique dual-indexed or 147,456 combinatorially-indexed Illumina libraries (iTru & iNext). PeerJ 7:e7755.

Xiao L, Morgan UM, Limor J, Escalante A, Arrowood M, Shulaw W, Thompson RC, Fayer R, Lal AA. 1999. Genetic diversity within Cryptosporidium parvum and related Cryptosporidium species. Appl Environ Microbiol 65:3386–3391.

Larkin MA, Blackshields G, Brown NP, Chenna R, McGettigan PA, McWilliam H, Valentin F, Wallace IM, Wilm A, Lopez R, Thompson JD, Gibson TJ, Higgins DG. 2007. Clustal W and Clustal X version 2.0. Bioinformatics 23:2947–2948.

Naguib D, Roellig DM, Arafat N, Xiao L. 2021. Genetic characterization of Cryptosporidium cuniculus from Rabbits in Egypt. Pathogens 10:775.

Burnet JB, Ogorzaly L, Tissier A, Penny C, Cauchie HM. 2013. Novel quantitative TaqMan real-time PCR assays for detection of Cryptosporidium at the genus level and genotyping of major human and cattle-infecting species. J Appl Microbiol 114:1211–1222.

Abrahamsen MS, Templeton TJ, Enomoto S, Abrahante JE, Zhu G, Lancto CA, Deng M, Liu C, Widmer G, Tzipori S, Buck GA, Xu P, Bankier AT, Dear PH, Konfortov BA, Spriggs HF, Iyer L, Anantharaman V, Aravind L, Kapur V. 2004. Complete genome sequence of the apicomplexan, Cryptosporidium parvum. Science 304:441–445.

DeLong EF, Wickham GS, Pace NR. 1989. Phylogenetic stains: ribosomal RNA-based probes for the identification of single cells. Science 243:1360–1363.

Alagappan A, Bergquist PL, Ferrari BC. 2009. Development of a two-color fluorescence in situ hybridization technique for species-level identification of human-infectious Cryptosporidium spp. Appl Environ Microbiol 75:5996–5998.

Li N, Neumann NF, Ruecker N, Alderisio KA, Sturbaum GD, Villegas EN, Chalmers R, Monis P, Feng Y, Xiao L. 2015. Development and evaluation of three real-time PCR assays for genotyping and source tracking Cryptosporidium spp. in water. Appl Environ Microbiol 81:5845–5854.

Chihi A, O’Brien Andersen L, Aoun K, Bouratbine A, Stensvold CR. 2022. Amplicon-based next-generation sequencing of eukaryotic nuclear ribosomal genes (metabarcoding) for the detection of single-celled parasites in human faecal samples. Parasite Epidemiol Control 17:e00242.

Popovic A, Parkinson J. 2018. Characterization of eukaryotic microbiome using 18S Amplicon sequencing. Methods Mol Biol 1849:29–48.

Kang D, Choi JH, Kim M, Yun S, Oh S, Yi M, Yong T-S, Lee YA, Shin MH, Kim JY. 2024. Optimization of 18 S rRNA metabarcoding for the simultaneous diagnosis of intestinal parasites. Sci Rep 14:25049.

DeMone C, Trenton McClure J, Greenwood SJ, Fung R, Hwang M-H, Feng Z, Shapiro K. 2021. A metabarcoding approach for detecting protozoan pathogens in wild oysters from Prince Edward Island, Canada. Int J Food Microbiol 360:109315.

DeMone C, Hwang M-H, Feng Z, McClure JT, Greenwood SJ, Fung R, Kim M, Weese JS, Shapiro K. 2020. Application of next generation sequencing for detection of protozoan pathogens in shellfish. Food Waterborne Parasitol 21:e00096.

Lorenzi H, Khan A, Behnke MS, Namasivayam S, Swapna LS, Hadjithomas M, Karamycheva S, Pinney D, Brunk BP, Ajioka JW, et al. 2016. Local admixture of amplified and diversified secreted pathogenesis determinants shapes mosaic Toxoplasma gondii genomes. Nat Commun 7:10147.

Rotovnik R, Lathrop TS, Skov J, Jokelainen P, Kapel CMO, Stensvold CR. 2024. Detection of zoonotic Cryptosporidium spp. in small wild rodents using amplicon-based next-generation sequencing. Parasite Epidemiol Control 24:e00332.

Hijjawi N, Zahedi A, Al-Falah M, Ryan U. 2022. A review of the molecular epidemiology of Cryptosporidium spp. and Giardia duodenalis in the Middle East and North Africa (MENA) region. Infect Genet Evol 98:105212.

Mohammad SM, Ali MS, Abdel-Rahman SA, Moustafa RA, Sarhan MH. 2021. Genotyping of Cryptosporidium species in children suffering from diarrhea in Sharkyia Governorate, Egypt. J Infect Dev Ctries 15:1539–1546.

El-Badry AA, Al-Antably ASA, Hassan MA, Hanafy NA, Abu-Sarea EY. 2015. Molecular seasonal, age and gender distributions of Cryptosporidium in diarrhoeic Egyptians: distinct endemicity. Eur J Clin Microbiol Infect Dis 34:2447–2453.

Naguib D, El-Gohary AH, Roellig D, Mohamed AA, Arafat N, Wang Y, Feng Y, Xiao L. 2018. Molecular characterization of Cryptosporidium spp. and Giardia duodenalis in children in Egypt. Parasit Vectors 11:403.

Clarridge JE. 2004. Impact of 16S rRNA gene sequence analysis for identification of bacteria on clinical microbiology and infectious diseases. Clin Microbiol Rev 17:840–862.

Langille MGI, Zaneveld J, Caporaso JG, McDonald D, Knights D, Reyes JA, Clemente JC, Burkepile DE, Vega Thurber RL, Knight R, Beiko RG, Huttenhower C. 2013. Predictive functional profiling of microbial communities using 16S rRNA marker gene sequences. Nat Biotechnol 31:814–821.

Bayona-Vásquez NJ, Sullivan AH, Beaudry MS, Khan A, Baptista RP, Petersen KN, Bhuiyan M, Brunelle B, Robinson G, Chalmers RM, Alves-Ferreira E, Grigg ME, Kissinger JC, Glenn TC. 2024. Whole genome targeted enrichment and sequencing of human-infecting Cryptosporidium spp. Res Sq:rs.3.rs-4294842.

Khan A, Alves-Ferreira EVC, Vogel H, Botchie S, Ayi I, Pawlowic MC, Robinson G, Chalmers RM, Lorenzi H, Grigg ME. 2024. Phylogenomic reconstruction of Cryptosporidium spp. captured directly from clinical samples reveals extensive genetic diversity. bioRxiv.

Troell K, Hallström B, Divne A-M, Alsmark C, Arrighi R, Huss M, Beser J, Bertilsson S. 2016. Cryptosporidium as a testbed for single cell genome characterization of unicellular eukaryotes. BMC Genomics 17:471.

© 2025. This work is published under https://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Cryptosporidium is a globally endemic parasite genus with over 40 recognized species. While C. hominis and C. parvum are responsible for most human infections, human cases involving other species have also been reported. Furthermore, there is increasing evidence of simultaneous infections with multiple species. Therefore, we devised a new means to identify various species of Cryptosporidium in mixed infections by sequencing a 431 bp amplicon of the 18S rRNA gene encompassing two variable regions. Using the DADA2 pipeline, amplicons were first identified to a genus using the SILVA 132 reference database; then Cryptosporidium amplicons to a species using a custom database. This approach demonstrated sensitivity, successfully detecting and accurately identifying as little as 0.001 ng of C. parvum DNA in a complex stool background. Notably, we differentiated mixed infections and demonstrated the ability to identify potentially novel species of Cryptosporidium both in situ and in vitro. Using this method, we identified Cryptosporidium parvum in Egyptian rabbits with three samples showing minor mixed infections. By contrast, no mixed infections were detected in Egyptian children, who were primarily infected with C. hominis. Thus, this pipeline provides a sensitive tool for Cryptosporidium species-level identification, allowing for the detection and accurate identification of minor variants and mixed infections.

IMPORTANCE

Cryptosporidium is a eukaryotic parasite and a leading global cause of waterborne diarrhea, with over 40 recognized species infecting livestock, wildlife, and people. While we have effective tools for detecting Cryptosporidium in clinical and agricultural water samples, there is still a need for a method that can efficiently identify known species as well as infections with multiple Cryptosporidium species, which are increasingly being reported. In this study, we utilized sequencing of a specific region to develop a sensitive and accurate identification workflow for Cryptosporidium species based on high-throughput sequencing. This method can distinguish between all 40 recognized species and accurately detect mixed infections. Our approach provides a sensitive and reliable means to identify Cryptosporidium species in complex clinical and agricultural samples. This has important implications for clinical diagnostics, biosurveillance, and understanding disease transmission, ultimately benefiting clinicians and produce growers.

Details

Title

Amplicon sequencing detects, identifies, and quantifies minority variants in mixed-species infections of Cryptosporidium parasites

Author

Turner, Randi¹; Naguib, Doaa²; Pierce, Elora³; Li, Alison⁴; Valente, Matthew⁵; Glenn, Travis C⁶; Rosenthal, Benjamin M⁵; Kissinger, Jessica C⁶; Khan, Asis⁵

¹ Agricultural Research Service, United States Department of Agriculture (USDA), Beltsville, Maryland, USA, University of Georgia, Athens, Georgia, USA
² Agricultural Research Service, United States Department of Agriculture (USDA), Beltsville, Maryland, USA, Department of Hygiene and Zoonoses, Faculty of Veterinary Medicine, Mansoura University, Mansoura, Egypt
³ Mississippi State University College of Veterinary Medicine, Mississippi State, Mississippi, USA
⁴ Boston University, Boston, Massachusetts, USA
⁵ Agricultural Research Service, United States Department of Agriculture (USDA), Beltsville, Maryland, USA
⁶ University of Georgia, Athens, Georgia, USA

Section

Research Article

Publication year

2025

Publication date

Oct 2025

Publisher

American Society for Microbiology

ISSN

21612129

e-ISSN

21507511

Source type

Scholarly Journal

Language of publication

English

DOI

https://doi.org/10.1128/mbio.01109-25

ProQuest document ID

3260774153