Introduction
Malaria cases in Tanzania comprise 3% of globally reported cases, but transmission is heterogeneous, with the coastal mainland witnessing declining but substantial transmission of
Parasite genomics has the potential to help us better understand malaria epidemiology by uncovering population structure and gene flow, providing insight into changes in the parasite population, including how parasites move between regions (Neafsey et al., 2021). Genomics has previously been used to study importation and transmission chains in other low-transmission settings in Africa and elsewhere (Chang et al., 2019; Fola et al., 2023; Morgan et al., 2020; Patel et al., 2014; Roh et al., 2019; Sane et al., 2019). We previously investigated the importation of malaria into Zanzibar from the mainland using whole-genome sequencing, showing highly similar populations within the mainland and within the archipelago, but also identifying highly related parasite pairs between locations, suggesting a role for importation (Morgan et al., 2020). However, that work lacked sufficient samples to assess transmission of parasites within Zanzibar. The larger and spatially rich sample set analyzed in this article offers an opportunity for more refined analyses of transmission across Zanzibar and of how parasites are related to those from the coastal mainland.
A panel of molecular inversion probes (MIPs), a highly multiplexed genotyping assay, was designed in a previous study to target single-nucleotide polymorphisms (SNPs) throughout the
Methods
Samples from coastal Tanzania (178) and Zanzibar (213) were previously sequenced through multiple studies (Table 1, Figure 1—figure supplement 1). These samples include 213 dried blood spots (DBS) collected in Zanzibar between February 2016 and September 2017, coming from cross-sectional surveys of asymptomatic individuals (n = 70) and an in vivo efficacy study of artesunate-amodiaquine (ASAQ) with single low-dose primaquine (SLDP) in pediatric uncomplicated malaria patients in the western and central districts of Unguja island and Micheweni district on Pemba island (n = 143) (Msellem et al., 2020). These samples were geolocalized to
Table 1.
Blood samples from Zanzibar and coastal Tanzania.
Description | Location (district) | Dates | Clinical status* | Sample size | Age range (yr) | # in genome-wide analysis | # in drug resistance analysis |
---|---|---|---|---|---|---|---|
Community cross-sectional surveys | Zanzibar (multiple) | 2016 | A | 70 | 2–70 | 21 | 52 |
In vivo efficacy study of artesunate-amodiaquine (ASAQ) with single low-dose primaquine (SLDP) in pediatric uncomplicated malaria patients | Zanzibar (multiple) | 2017 | S | 143 | 2–60 | 117 | 134 |
Study of transmission of | Mainland Tanzania (Bagamoyo) | 2018 | A | 40 | 7–16 | 34 | 0 |
Parasite clearance study of artemether-lumefantrine (AL) | Mainland Tanzania (Bagamoyo) | 2018 | S | 138 | 2–11 | 110 | 123 |
* Asymptomatic (A) or symptomatic (S).
In order to place coastal Tanzanian and Zanzibari samples in the context of African
MIP sequencing
Sequence data for the coastal Tanzanian and Zanzibari samples were generated in a similar fashion across studies. Chelex-extracted DNA from DBS and QIAGEN Miniprep (QIAGEN, Germantown, MD)-extracted DNA from leukodepleted blood were used in MIP captures, which were then sequenced as previously described (Aydemir et al., 2018; Verity et al., 2020). Control mixtures of four strains of genomic DNA from
MIP variant calling and filtering
MIP sequencing data was processed using
For the drug resistance panel, variant calling was performed as above, with additional
Analysis of population relatedness and structure
To investigate genetic relatedness of parasites across regions, IBD estimates were assessed using the within-sample major alleles (coercing samples to monoclonal by calling the dominant allele at each locus) and estimated utilizing a maximum likelihood approach using the
To incorporate geographic information when examining genetic variation, discriminant analysis of principal components (DAPC) was used (Jombart, 2008). Pseudohaplotypes were created by pruning the genotype calls at all loci for each sample into a single haplotype, and redundant haplotypes were removed (282 reduced to 272 with unique pseudohaplotypes). DAPC was conducted at the district level, and only samples from districts with at least five samples were retained (272 reduced to 270 samples in six districts) (Figure 1—figure supplement 4B). For the main DAPC (Figure 1B), highly related isolates were pruned to a single representative infection (272 reduced to 232), and only districts with at least five samples were then included (232 reduced to 228 samples in five districts). The DAPC was performed using the
Figure 1.
Parasites between Zanzibar and coastal mainland Tanzania are highly related but microstructure within Zanzibar is apparent.
(A) Principal component analysis (PCA) comparing parasites from symptomatic vs. asymptomatic patients from coastal Tanzania and Zanzibar. Clusters with an identity by descent (IBD) value of > 0.90 were limited to a single representative infection to prevent local structure of highly related isolates within
Figure 1—figure supplement 1.
Sampling locations in Zanzibar (
The centroids of the sampling locations are shown as blue rectangles. The ferry terminal in Zanzibar town is shown as a red rectangle. In Zanzibar, samples were collected throughout Unguja and in northern Pemba. In mainland Tanzania, samples were collected from Bagamoyo district.
Figure 1—figure supplement 2.
Principal component analysis (PCA) utilizing samples across Africa shows clustering based on geographic location.
Samples from Ahero, Kenya (n = 147), a random 20% of samples from five regions across Africa (Verity et al., 2020) (n = 275) and from this study (n = 282) were subsetted to 756 common loci. Within-sample allele frequency (WSAF) was calculated, with an imputation step to replace missing values with the median WSAF, to perform PCA.
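The WSAF calculation and median imputation described in this legend can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the function names, the read-count inputs, and the choice to take the median per locus (rather than across the whole matrix) are assumptions made here.

```python
from statistics import median

def wsaf_matrix(alt_counts, total_counts):
    """Per-sample, per-locus WSAF = alternate reads / total reads.

    Entries with zero depth are treated as missing (None).
    Rows are samples, columns are loci (hypothetical layout).
    """
    return [
        [(a / t) if t > 0 else None for a, t in zip(alt_row, tot_row)]
        for alt_row, tot_row in zip(alt_counts, total_counts)
    ]

def impute_median(wsaf):
    """Replace missing values with the median observed WSAF at the same locus.

    Assumption: the median is taken per locus; a locus with no observed
    values is filled with 0.0.
    """
    n_loci = len(wsaf[0])
    medians = []
    for j in range(n_loci):
        observed = [row[j] for row in wsaf if row[j] is not None]
        medians.append(median(observed) if observed else 0.0)
    return [
        [medians[j] if v is None else v for j, v in enumerate(row)]
        for row in wsaf
    ]

# Toy example: three samples, two loci; sample 2 has no reads at locus 2.
alt = [[0, 5], [10, 0], [5, 0]]
tot = [[10, 10], [10, 0], [10, 10]]
complete = impute_median(wsaf_matrix(alt, tot))
```

The completed matrix can then be passed to any standard PCA routine.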
Figure 1—figure supplement 3.
Molecular inversion probe (MIP) performance shows coverage of loci.
Panel (A) shows the log-transformed read depth for genome-wide single-nucleotide polymorphisms (SNPs) for samples (columns) and loci (rows). The log-transformed unique molecular identifier (UMI) count ranges from 0 to 9.87. Panel (B) shows the mean UMI coverage for the analyzed drug resistance mutations with a nonparametric bootstrap 95% CI.
Figure 1—figure supplement 4.
Principal component analysis (PCA) with highly related samples shows population stratification radiating from coastal mainland to Zanzibar.
PCA of 282 total samples was performed using within-sample allele frequency (A) and discriminant analysis of principal components (DAPC) was performed after retaining samples with unique pseudohaplotypes in districts that had five or more samples present (B). As opposed to Figure 1, all isolates were used in this analysis and isolates with unique pseudohaplotypes were not pruned to a single representative infection.
To investigate how genetic relatedness varies as the distance between pairs increases, isolation by distance was assessed across all of Zanzibar and within the islands of Unguja and Pemba. The great-circle (GC) distances between each
COI, or the number of parasite clones in a given sample, was determined using
Results
Zanzibari falciparum parasites were closely related to coastal mainland parasites but showed higher within- than between-population IBD and evidence of microstructure on the archipelago
To examine geographic relatedness, we first used PCA. Zanzibari parasites are highly related to other parasites from East Africa and more distantly related to Central and West African isolates (Figure 1—figure supplement 2). PCA of 232 coastal Tanzanian and Zanzibari isolates, after pruning 51 samples with an IBD of >0.9 to one representative sample, demonstrates little population differentiation (Figure 1A).
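One way to prune highly related isolates to a single representative is to treat every pair above the IBD threshold as an edge and keep one sample per connected component. The sketch below does this with a union-find structure. It is illustrative only: the source does not state how clusters were formed, so the single-linkage rule, the function name, and the choice of the first listed sample as representative are all assumptions.

```python
def prune_related(samples, pair_ibd, threshold=0.9):
    """Keep one representative per cluster of highly related samples.

    samples   : list of sample IDs (order decides which one is kept)
    pair_ibd  : dict mapping (sample_a, sample_b) -> pairwise IBD estimate
    threshold : pairs with IBD strictly above this are linked
    """
    parent = {s: s for s in samples}

    def find(s):
        # Follow parent pointers to the cluster root, compressing the path.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for (a, b), ibd in pair_ibd.items():
        if ibd > threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    seen_roots = set()
    representatives = []
    for s in samples:
        root = find(s)
        if root not in seen_roots:
            seen_roots.add(root)
            representatives.append(s)
    return representatives

# Toy example: A, B and C are chained by high IBD; D is unrelated.
ibd = {("A", "B"): 0.95, ("B", "C"): 0.92, ("C", "D"): 0.10}
kept = prune_related(["A", "B", "C", "D"], ibd)
```

Note that single linkage groups A and C together even though they were never compared directly; a stricter rule would need every pair in a cluster to exceed the threshold.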
However, after performing K-means clustering of
Figure 2.
Coastal Tanzanian and Zanzibari parasites have more highly related pairs within their given region than between regions.
K-means clustering of
Figure 2—figure supplement 1.
Diagnostic plot showing total within-cluster sum of squares versus number of clusters for the determination of optimal K.
Mainland samples were considered an independent cluster. We selected a K of 4 for determining clusters in Zanzibar based on the inflection point above.
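The diagnostic described here plots the total within-cluster sum of squares against K, and K is chosen where the curve bends. A minimal sketch of computing that curve with Lloyd's algorithm is shown below. The input points are hypothetical two-dimensional coordinates (the variable actually clustered is not recoverable from the text here), and the restart count, iteration cap, and seed are arbitrary choices.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans_wss(points, k, n_init=10, n_iter=100, seed=0):
    """Lowest total within-cluster sum of squares over n_init random starts."""
    rng = random.Random(seed)
    best = float("inf")
    for _ in range(n_init):
        centers = rng.sample(points, k)  # initial centers drawn from the data
        for _ in range(n_iter):
            clusters = [[] for _ in range(k)]
            for p in points:
                nearest = min(range(k), key=lambda c: sq_dist(p, centers[c]))
                clusters[nearest].append(p)
            # Recompute each center as the mean of its members;
            # an empty cluster keeps its previous center.
            new_centers = [
                tuple(sum(col) / len(members) for col in zip(*members))
                if members else centers[c]
                for c, members in enumerate(clusters)
            ]
            if new_centers == centers:
                break
            centers = new_centers
        wss = sum(min(sq_dist(p, c) for c in centers) for p in points)
        best = min(best, wss)
    return best

def elbow_curve(points, max_k):
    """Total within-cluster sum of squares for K = 1..max_k."""
    return {k: kmeans_wss(points, k) for k in range(1, max_k + 1)}

# Toy example: two tight groups of points that are far apart.
pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
curve = elbow_curve(pts, 4)
```

In the toy data the curve drops from K = 1 to K = 2 and reaches zero once every point is its own cluster.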
To further assess the differentiation within the parasite population in Zanzibar, we conducted DAPC according to the districts of origin for each isolate. Parasites differentiated geographically, with less variation near the port of Zanzibar town and more differentiation in isolates collected in districts further from the port (Figure 1B). This underlying microstructure is also supported by classic isolation by distance analysis (Figure 3, Figure 3—figure supplement 1). Isolation by distance analysis across all of Zanzibar and within Unguja showed rapid decay of relatedness over very short geographic distances (Figure 3A and B). Interestingly in Pemba, mean IBD remained at a similar relatively high level even at longer distances (Figure 3C).
Figure 3.
Isolation by distance is shown between all Zanzibari parasites (A), only Unguja parasites (B), and only Pemba parasites (C).
Samples were analyzed based on geographic location: Zanzibar (N = 136) (A), Unguja (N = 105) (B), or Pemba (N = 31) (C). Great-circle (GC) distances between pairs of parasite isolates were calculated based on
Figure 3—figure supplement 1.
Isolation by distance in Zanzibar isolates (A) and only Unguja isolates (B).
Samples were filtered based on location and great-circle distances were calculated. These distances were binned at 10 km increments. The mean IBD and 95% CI are plotted for each bin.
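The binning described in this legend can be sketched as follows, using the haversine formula for great-circle distance. This is an illustration under stated assumptions: the Earth radius constant, the function names, and the normal-approximation 95% CI (mean ± 1.96 standard errors) are choices made here, and the source does not say how its CI was computed.

```python
from math import radians, sin, cos, asin, sqrt
from statistics import mean, stdev

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed constant)

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance in km between two points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

def bin_ibd_by_distance(pairs, bin_km=10.0):
    """Summarize pairwise IBD within distance bins.

    pairs : iterable of (distance_km, ibd) tuples
    Returns {bin_start_km: (mean_ibd, ci_low, ci_high, n_pairs)}.
    """
    bins = {}
    for distance, ibd in pairs:
        start = int(distance // bin_km) * bin_km
        bins.setdefault(start, []).append(ibd)
    summary = {}
    for start in sorted(bins):
        values = bins[start]
        m = mean(values)
        # Normal-approximation half-width; zero when only one pair is in the bin.
        half = 1.96 * stdev(values) / sqrt(len(values)) if len(values) > 1 else 0.0
        summary[start] = (m, m - half, m + half, len(values))
    return summary

# Toy example: two pairs in the 0-10 km bin, one pair in the 10-20 km bin.
summary = bin_ibd_by_distance([(3.0, 0.5), (7.0, 0.3), (12.0, 0.1)])
```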
Within Zanzibar, parasite clones are shared within and between
Among the sample pairs in Zanzibar that are highly related (IBD of ≥0.25), we see different patterns of genetic relatedness suggesting common local and short-distance transmission of clones and occasional long-distance transmission (Figure 4). In Unguja (Figure 4A), we see multiple identical or near-identical parasite pairs shared over longer distances, suggesting longer distance gene flow, as well as multiple
Figure 4.
Highly related pairs span long distances across Zanzibar.
Sample pairs were filtered to have identity by descent (IBD) estimates of ≥ 0.25. Within
Figure 4—figure supplement 1.
Network analysis of within
Pairwise IBD comparisons of ≥ 0.25 within different
Figure 4—figure supplement 2.
Network analysis of sample pairs with an identity by descent (IBD) of ≥0.25 for coastal mainland Tanzania.
The network of highly related (IBD ≥ 0.25) pairs within coastal mainland Tanzania is plotted above. Line width increases with the magnitude of IBD between pairs.
Figure 4—figure supplement 3.
Sample pairs with an identity by descent (IBD) of ≥0.125 between Zanzibar and mainland Tanzania.
Relatively few sample pairs showed moderate levels of IBD (between 0.125 and 0.20) between the coastal mainland and Zanzibar.
Figure 4—figure supplement 4.
Sample pairs with an identity by descent (IBD) of ≥0.125 between Unguja and Pemba.
Relatively few sample pairs showed moderate levels of IBD (between 0.125 and 0.20) between Unguja and Pemba.
Compared to symptomatic infections, asymptomatic infections demonstrate greater genetic complexity, especially in coastal Tanzania
Asymptomatic infections were compared to roughly contemporaneously collected isolates from those presenting with acute, uncomplicated malaria. Asymptomatic infections demonstrated greater COI than symptomatic infections on both the coastal mainland (mean COI 2.5 vs 1.7, p<0.05, Wilcoxon–Mann–Whitney test) and in Zanzibar (mean COI 2.2 vs 1.7, p=0.05, Wilcoxon–Mann–Whitney test) (Figure 5A). A similar pattern was seen when evaluating Fws, which measures the diversity within a sample compared to the population, with lower Fws in asymptomatic samples consistent with higher within-host complexity, with a more pronounced difference on the mainland (Figure 5B). Despite these differences, parasites from asymptomatic and symptomatic infections tended to cluster together in PCA, suggesting that their core genomes are genetically similar and do not vary based on clinical status (Figure 1A).
Figure 5.
Complexity of infection (COI) and the Fws metric show a higher COI and lower Fws in asymptomatic than symptomatic infections in both mainland Tanzania and Zanzibar isolates.
COI (A) was estimated using the REAL McCOIL’s categorical method (Chang et al., 2017). The mean COI was greater for asymptomatic than for symptomatic infections in both regions; MAIN-A: 2.5 (2.1–2.9), MAIN-S: 1.7 (1.6–1.9), p<0.05, Wilcoxon–Mann–Whitney test and ZAN-A: 2.2 (1.7–2.8), ZAN-S: 1.7 (1.5–1.9), p=0.05, Wilcoxon–Mann–Whitney test. Fws (B) was estimated utilizing the formula,
Drug resistance mutations did not vary between populations
The prevalence of drug resistance genotypes was similar in Zanzibar and coastal Tanzania (Table 2). The five mutations associated with sulfadoxine/pyrimethamine resistance (Pfdhfr: N51I, C59R, S108N; Pfdhps: A437G, K540E) were common, with prevalences at or above 0.90. Pfcrt mutations associated with chloroquine and amodiaquine resistance (M74I, N75E, K76T) were all present at approximately 0.05 prevalence in Zanzibar and were absent from mainland samples (Djimdé et al., 2001; Holmgren et al., 2006). For Pfmdr1, wild-type N86 and D1246, which are associated with reduced susceptibility to lumefantrine, were dominant at approximately 0.99 prevalence (Sisowath et al., 2005). No World Health Organization-validated or candidate polymorphism in Pfk13 associated with artemisinin resistance was found.
Table 2.
Drug resistance polymorphism prevalence in Zanzibar and coastal mainland Tanzania.
Mutation | Zanzibar: mutant allele prevalence* | Zanzibar: CI† | Zanzibar: # genotyped samples‡ | Mainland Tanzania: mutant allele prevalence* | Mainland Tanzania: CI† | Mainland Tanzania: # genotyped samples‡ |
---|---|---|---|---|---|---|
Pfcrt-M74I | 0.054 | 0.026–0.098 | 184 | 0.000 | 0–0.034 | 106 |
Pfcrt-N75E | 0.054 | 0.026–0.098 | 184 | 0.000 | 0–0.034 | 106 |
Pfcrt-K76T | 0.054 | 0.026–0.098 | 184 | 0.000 | 0–0.034 | 106 |
Pfdhfr-A16V | 0.000 | 0–0.021 | 173 | 0.000 | 0–0.032 | 112 |
Pfdhfr-N51I | 0.977 | 0.943–0.994 | 177 | 0.964 | 0.911–0.99 | 112 |
Pfdhfr-C59R | 0.971 | 0.934–0.991 | 174 | 0.945 | 0.884–0.98 | 109 |
Pfdhfr-S108N | 1.000 | 0.98–1 | 179 | 1.000 | 0.965–1 | 104 |
Pfdhfr-S108T | 0.000 | 0–0.02 | 179 | 0.000 | 0–0.035 | 104 |
Pfdhfr-I164L | 0.000 | 0–0.02 | 184 | 0.000 | 0–0.037 | 98 |
Pfdhps-A437G | 1.000 | 0.98–1 | 182 | 1.000 | 0.968–1 | 115 |
Pfdhps-K540E | 0.955 | 0.913–0.98 | 178 | 0.964 | 0.91–0.99 | 111 |
Pfdhps-A581G | 0.044 | 0.019–0.085 | 181 | 0.107 | 0.058–0.175 | 122 |
Pfk13-K189N | 0.023 | 0.006–0.058 | 174 | 0.000 | 0–0.04 | 90 |
Pfk13-K189T | 0.078 | 0.042–0.13 | 166 | 0.095 | 0.042–0.179 | 84 |
Pfmdr1-N86Y | 0.011 | 0.001–0.04 | 180 | 0.008 | 0–0.044 | 124 |
Pfmdr1-Y184F | 0.644 | 0.57–0.714 | 180 | 0.530 | 0.435–0.624 | 115 |
Pfmdr1-D1246Y | 0.011 | 0.001–0.039 | 184 | 0.019 | 0.002–0.067 | 105 |
Pfmdr2-I492V | 0.430 | 0.357–0.506 | 179 | 0.407 | 0.302–0.518 | 86 |
* Prevalence was calculated as described in the ‘Methods’.
† 95% CIs for these polymorphisms were calculated using the Clopper–Pearson method.
‡ The number of genotyped samples per locus is shown for each polymorphism.
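For readers who want to check the intervals in Table 2, the exact binomial (Clopper–Pearson) interval can be sketched as below. This is a standard-library illustration rather than the authors' code: the function names are hypothetical, bisection on the binomial distribution is used instead of a beta quantile function, and it assumes prevalence is a simple proportion of mutant calls among genotyped samples, which may not match the calculation in the Methods.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))

def _bisect_root(f, lo=0.0, hi=1.0, iters=100):
    """Root of a decreasing function f on [lo, hi] with f(lo) > 0 > f(hi)."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided (1 - alpha) confidence interval for x successes in n trials."""
    # Lower bound: the p at which P(X >= x) equals alpha / 2 (0 when x == 0).
    lower = 0.0 if x == 0 else _bisect_root(
        lambda p: alpha / 2 - (1 - binom_cdf(x - 1, n, p))
    )
    # Upper bound: the p at which P(X <= x) equals alpha / 2 (1 when x == n).
    upper = 1.0 if x == n else _bisect_root(
        lambda p: binom_cdf(x, n, p) - alpha / 2
    )
    return lower, upper

# Zero mutant calls among 106 genotyped samples (mainland Pfcrt rows of Table 2).
low, high = clopper_pearson(0, 106)
```

For zero mutants among 106 samples the upper bound has the closed form 1 − 0.025^(1/106) ≈ 0.034, which agrees with the mainland Pfcrt rows.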
Discussion
In this study, we leverage high-throughput targeted sequencing using MIPs to characterize the populations and the relationships of
Despite the overall genetic similarity between archipelago populations, we did not find parasite pairs with high levels of IBD between the coastal mainland and Zanzibar, with the highest being 0.20. While this level still represents a significant amount of genetic sharing, similar to a cousin, the lack of higher levels does not allow us to identify specific importation events. This is largely due to the study design, which is based on convenience sampling, the relatively low numbers of samples, and lack of sampling from all mainland travel hubs (Bisanzio et al., 2023). Sampling was also denser in Unguja compared to Pemba. On the other hand, we see clear transmission of highly related parasites within each population (IBD > 0.99). In Zanzibar, highly related parasites mainly occur in the range of 20–30 km. These results are similar to our previous work using whole-genome sequencing of isolates from Zanzibar and mainland Tanzania, showing increased within-population IBD compared to between-population IBD (Morgan et al., 2020). The network of highly related
Asymptomatic parasitemia is common in falciparum malaria around the globe and is of increasing importance in Zanzibar (Lindblade et al., 2013; Morris et al., 2015). What underlies the biology and prevalence of asymptomatic parasitemia in very low transmission settings, where antiparasite immunity is not expected to be prevalent, remains unclear (Björkman and Morris, 2020). Similar to a few previous studies, we found that asymptomatic infections had a higher COI than symptomatic infections across both the coastal mainland and Zanzibar parasite populations (Collins et al., 2022; Kimenyi et al., 2022; Sarah-Matio et al., 2022). Other studies have found lower COI in severe vs. mild malaria cases (Robert et al., 1996) or no significant difference in COI based on clinical status (Conway et al., 1991; Earland et al., 2019; Kun et al., 1998; Lagnika et al., 2022; Tanabe et al., 2015). In Zambia, one study suggested that parasites that cause asymptomatic infection may be genetically different from those that cause symptomatic infection (Searle et al., 2017). However, that study included samples collected over different time periods and relied on a low-density genotyping assay that only investigated the diversity of 24 SNPs across the genome. Here, based on SNPs throughout the core genome, we did not see differential clustering of asymptomatic or symptomatic infections in Zanzibar or the mainland (Figure 1A), suggesting that these parasite populations remain similar when comparing clinical status. However, this genotyping approach does not address potential variation in the many hypervariable gene families that encode genes known to be associated with pathogenesis (e.g.,
While mutations for partial artemisinin resistance were not observed in K13, other antimalarial resistance mutations of concern were observed. Validated drug resistance mutations linked to sulfadoxine/pyrimethamine resistance (Pfdhfr-N51I, Pfdhfr-C59R, Pfdhfr-S108N, Pfdhps-A437G, Pfdhps-K540E) were found at high prevalence (Table 2). The prevalence of polymorphisms associated with amodiaquine resistance (Pfcrt-K76T, Pfmdr1-N86Y, Pfmdr1-Y184F, Pfmdr1-D1246Y) was similar to previous reports (Msellem et al., 2020). The wild-type Pfmdr1-N86 was dominant in both mainland and archipelago populations, which is concerning given its association with reduced lumefantrine susceptibility. Although polymorphisms associated with artemisinin resistance did not appear in this population, continued surveillance is warranted given the emergence of these mutations in East Africa and reports of rare resistance mutations on the coast consistent with the spread of emerging Pfk13 mutations (Moser et al., 2021).
Overall, parasites between Zanzibar and coastal mainland Tanzania remain highly related, but population microstructure on the island reflects ongoing low-level transmission in Zanzibar, partially driven by asymptomatic infections that potentially constitute a long-term reservoir. This is likely the result of the continued pressure on the population through the implementation of effective control measures. In this study, parasite genomics allows us to parse differences in parasite populations and reveals substructure in an area of low-transmission intensity. A recent study identified ‘hotspot’
© 2023, Connelly et al. This work is published under https://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
Background:
The Zanzibar archipelago of Tanzania has become a low-transmission area for
Methods:
To shed light on these sources of transmission, we applied highly multiplexed genotyping utilizing molecular inversion probes to characterize the genetic relatedness of 282
Results:
Overall, parasite populations on the coastal mainland and Zanzibar archipelago remain highly related. However, parasite isolates from Zanzibar exhibit population microstructure due to the rapid decay of parasite relatedness over very short distances. This, along with highly related pairs within
Conclusions:
Our data support importation as a main source of genetic diversity and contribution to the parasite population in Zanzibar, but they also show local outbreak clusters where targeted interventions are essential to block local transmission. These results highlight the need for preventive measures against imported malaria and enhanced control measures in areas that remain receptive to malaria reemergence due to susceptible hosts and competent vectors.
Funding:
This research was funded by the National Institutes of Health, grants R01AI121558, R01AI137395, R01AI155730, F30AI143172, and K24AI134990. Additional funding was provided by the Swedish Research Council, Erling-Persson Family Foundation, and the Yang Fund. RV acknowledges funding from the MRC Centre for Global Infectious Disease Analysis (reference MR/R015600/1), jointly funded by the UK Medical Research Council (MRC) and the UK Foreign, Commonwealth & Development Office (FCDO), under the MRC/FCDO Concordat agreement and is also part of the EDCTP2 program supported by the European Union. RV also acknowledges funding by Community Jameel.