Citation: Thébaud G, Michalakis Y (2016) Comment on "Large Bottleneck Size in Cauliflower Mosaic Virus Populations during Host Plant Colonization" by Monsion et al. (2008). PLoS Pathog 12(4): e1005512. doi:10.1371/journal.ppat.1005512
Editor: Raul Andino, University of California San Francisco, UNITED STATES
Received: November 12, 2015; Accepted: February 29, 2016; Published: April 14, 2016
Copyright: © 2016 Thébaud, Michalakis. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: The authors received no specific funding for this work.
Competing interests: The authors have declared that no competing interests exist.
Our research groups have previously published a study estimating the bottleneck size of Cauliflower mosaic virus (CaMV) during host plant colonization [1]. Two methods were used in that study, both based on the temporal evolution of neutral marker frequencies; one uses estimates of FST, and the other tracks changes in marker frequency variance over time. Here we report that (i) the variance-based method was actually published by Felsenstein [2], a reference we were not aware of when publishing the original study; (ii) equation (4) in [1] is actually an approximation; and (iii) these methods rely on the assumption that the bottleneck size is constant, which can be relaxed by assuming it is a Poisson-distributed random variable.
Equation (4) in [1] reads Nv = p(1-p)/[Var(p')-Var(p)], where Nv is the estimated bottleneck size, and p and p' represent the frequency of a neutral marker before and after the bottleneck, respectively. The approximation in this equation concerns the numerator. Its exact formulation should be E(p(1 - p)) and not E(p)(1 - E(p)), which is the expression implied in [1] and used in the numerical application on CaMV (E denotes the expected value). Based on the definition of the variance, it can be shown that E(p(1 - p)) = E(p)(1 - E(p)) - Var(p), which is equivalent to the numerator of equation (14) in [2]. Thus, the approximation will give relatively accurate results whenever Var(p) is negligible relative to E(p)(1 - E(p)). This was indeed the case in the study by Monsion et al. [1], as can be seen in Table 1, which compares the Nv values published in [1] to those obtained when applying the exact expression. The results of that paper are thus not qualitatively affected and, consequently, its conclusions remain unchanged. This method was also used in another study [3], the results of which are also marginally affected quantitatively, but not at all qualitatively (Table 2). To illustrate the application of these methods in a situation in which the estimated bottleneck size is lower, we used data from Tromas et al. (Table 3) [4].
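The difference between the approximate and the exact numerator can be illustrated with a minimal R sketch. The frequency vectors below are hypothetical values chosen for illustration only (they are not data from [1], [3], or [4]), and `pop_var` is a helper introduced here so that the identity E(p(1 - p)) = E(p)(1 - E(p)) - Var(p) holds exactly on a finite sample.

```r
# Hypothetical marker frequencies (illustrative values, not published data)
p       <- c(0.48, 0.52, 0.50, 0.47, 0.53, 0.51)  # before the bottleneck
p_prime <- c(0.40, 0.61, 0.55, 0.38, 0.59, 0.46)  # after the bottleneck

# Variance with divisor n (helper assumed here, not part of the original code)
pop_var <- function(x) mean((x - mean(x))^2)

# Identity: E(p(1 - p)) = E(p)(1 - E(p)) - Var(p)
lhs <- mean(p * (1 - p))
rhs <- mean(p) * (1 - mean(p)) - pop_var(p)

# Approximate numerator (as in equation (4) of [1]) versus exact numerator
denom     <- pop_var(p_prime) - pop_var(p)
Nv_approx <- mean(p) * (1 - mean(p)) / denom
Nv_exact  <- rhs / denom

# Relative overestimation by the approximation: Var(p) / (E(p)(1 - E(p)))
rel_diff <- (Nv_approx - Nv_exact) / Nv_approx
c(Nv_approx = Nv_approx, Nv_exact = Nv_exact, rel_diff = rel_diff)
```

Under these assumptions the approximate value always exceeds the exact one, by the relative amount Var(p)/[E(p)(1 - E(p))], which is small when the pre-bottleneck frequencies vary little among plants.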
Table 1. Comparison of estimates of Nv using the approximate formula published in [1] (values in Table 2 of that publication), those using the exact expression, and those assuming that bottleneck size is a zero-truncated-Poisson-distributed random variable.
http://dx.doi.org/10.1371/journal.ppat.1005512.t001
Table 2. Comparison of estimates of Nv published in [3] using the approximate formula (values in table S1 of [3]), those using the exact expression, and those assuming that bottleneck size is a zero-truncated-Poisson-distributed random variable.
Note that, owing to copying errors, the values reported in table S1 of [3] for leaves 5 and 21 were slightly wrong; we report the corrected values here.
http://dx.doi.org/10.1371/journal.ppat.1005512.t002
Table 3. Estimates of Nv using data published in [4] for leaf level 5.
The authors of that paper had estimated a bottleneck size of 6 (see their figure 5B) using the FST method. To obtain the reported estimates, Leaf 3 and Leaf 5 were considered as the levels before and after the bottleneck, respectively. From the histogram in figure 4A of [4], we calculated a mean frequency of 0.622 for the marker in Leaf 3, and we used the variances in marker frequency reported for Leaf 3 and Leaf 5.
http://dx.doi.org/10.1371/journal.ppat.1005512.t003
Both Felsenstein's formula [2] and its approximation assume that the bottleneck size N is constant (i.e., identical in all the sampled plants). However, we can make the more realistic assumption that N follows a zero-truncated Poisson (ZTP) distribution (i.e., N can vary among experimental replicates according to a Poisson distribution, but p' cannot be measured when N = 0 and the plant is discarded, hence the zero-truncation):
Among the N genomes that go through the bottleneck, the number X bearing the neutral marker is drawn according to its pre-bottleneck frequency p from the binomial distribution: X~B(N,p). Thus, the marker frequency after the bottleneck is p' = X/N. The variance of this marker frequency can be expressed as: Var(p') = Var[E(p'|N,p)]+E[Var(p'|N,p)].
Because E(p'|N,p) = E(X|N,p)/N = p and Var(p'|N,p) = Var(X|N,p)/N^2 = p(1-p)/N, and because N and p are independent, we then get: Var(p') = Var(p)+E[p(1-p)]×E(1/N).
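The variance decomposition above can be checked by simulation. The following R sketch is illustrative only: the Poisson parameter `lambda`, the Beta distribution used for p, and the number of replicates are arbitrary assumptions, not values taken from any of the cited studies.

```r
set.seed(1)        # fixed seed so that the sketch is reproducible
n_rep  <- 200000   # number of simulated plants (arbitrary)
lambda <- 5        # hypothetical Poisson parameter before truncation

# Pre-bottleneck marker frequencies, drawn from an arbitrary Beta distribution
p <- rbeta(n_rep, 20, 20)

# Zero-truncated Poisson bottleneck sizes: draw Poisson values, discard zeros
N <- rpois(3 * n_rep, lambda)
N <- N[N > 0][seq_len(n_rep)]

# Number of marker-bearing genomes passing the bottleneck: X ~ B(N, p)
X <- rbinom(n_rep, size = N, prob = p)
p_prime <- X / N

# Compare the simulated variance with Var(p) + E[p(1-p)] * E(1/N)
lhs <- var(p_prime)
rhs <- var(p) + mean(p * (1 - p)) * mean(1 / N)
c(simulated = lhs, predicted = rhs)
```

With N and p drawn independently, as assumed in the derivation, the two quantities should agree up to sampling noise.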
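Using the properties

\[ E(1/N) = \frac{1}{e^{n_{vP}} - 1} \sum_{k=1}^{\infty} \frac{n_{vP}^{k}}{k \cdot k!} \quad \text{and} \quad \sum_{k=1}^{\infty} \frac{n_{vP}^{k}}{k \cdot k!} = \mathrm{Ei}(n_{vP}) - \ln(n_{vP}) - \gamma, \]

where n_{vP} is the parameter of the ZTP distribution, Ei is the exponential integral function and \gamma the Euler-Mascheroni constant, one can estimate the bottleneck size n_{vP} by numerically solving the following equation:

\[ \frac{E(p)\,(1 - E(p)) - \mathrm{Var}(p)}{\mathrm{Var}(p') - \mathrm{Var}(p)} = \frac{e^{n_{vP}} - 1}{\mathrm{Ei}(n_{vP}) - \ln(n_{vP}) - \gamma} \]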
We recommend using this estimation procedure when Nv values are below 20, i.e., when the relative estimation error is higher than 5%. The following lines of R code may be used to estimate nvP:
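library(gsl)  # provides expint_Ei(), the exponential integral function Ei
EulerMaschCst <- -digamma(1)  # Euler-Mascheroni constant
p <- c(...,.,...)  # vector of initial frequencies
p_prime <- c(...,.,...)  # vector of final frequencies
# Difference between the two sides of the estimating equation; its root is n_vP
funestim <- function(n_vP) {
  (mean(p) * (1 - mean(p)) - var(p)) / (var(p_prime) - var(p)) -
    (exp(n_vP) - 1) / (expint_Ei(n_vP) - log(n_vP) - EulerMaschCst)
}
uniroot(funestim, interval = c(1e-6, 600))  # the $root element is the estimate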
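As a usage illustration, the sketch below applies the same estimating equation to hypothetical frequency vectors (illustrative values only, not published data). It is a base-R variant, not the authors' code: instead of calling `expint_Ei` from the gsl package, it evaluates 1/E(1/N) for a zero-truncated Poisson variable by summing the series directly with `dpois`. The helper name `ztp_harmonic_mean`, the truncation point `kmax`, and the tighter `tol` passed to `uniroot` are assumptions made here.

```r
# Hypothetical marker frequencies (illustrative values, not published data)
p       <- c(0.48, 0.52, 0.50, 0.47, 0.53, 0.51)  # before the bottleneck
p_prime <- c(0.20, 0.75, 0.60, 0.30, 0.85, 0.35)  # after the bottleneck

# 1 / E(1/N) for N ~ zero-truncated Poisson with parameter lambda,
# computed from the series sum_k P(N = k) / k truncated at kmax terms
ztp_harmonic_mean <- function(lambda, kmax = 2000) {
  k <- seq_len(kmax)
  -expm1(-lambda) / sum(dpois(k, lambda) / k)  # -expm1(-x) equals 1 - exp(-x)
}

# Left-hand side of the estimating equation (exact numerator)
Nv_exact <- (mean(p) * (1 - mean(p)) - var(p)) / (var(p_prime) - var(p))

# Root of the difference between both sides gives the ZTP-based estimate
funestim <- function(n_vP) Nv_exact - ztp_harmonic_mean(n_vP)
est <- uniroot(funestim, interval = c(1e-6, 600), tol = 1e-9)
c(Nv_exact = Nv_exact, n_vP = est$root)
```

The direct series avoids the dependency on gsl, at the cost of a truncation that must remain large relative to the upper bound of the search interval.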
1. Monsion B, Froissart R, Michalakis Y, Blanc S. Large bottleneck size in Cauliflower mosaic virus populations during host plant colonization. PLoS Pathog. 2008;4: e1000174. doi: 10.1371/journal.ppat.1000174. pmid:18846207
2. Felsenstein J. Inbreeding and variance effective numbers in populations with overlapping generations. Genetics. 1971;68: 581-597. pmid:5166069
3. Gutiérrez S, Yvon M, Pirolles E, Garzo E, Fereres A, Michalakis Y, et al. Circulating virus load determines the size of bottlenecks in viral populations progressing within a host. PLoS Pathog. 2012;8: e1003009. doi: 10.1371/journal.ppat.1003009. pmid:23133389
4. Tromas N, Zwart MP, Lafforgue G, Elena SF. Within-host spatiotemporal dynamics of plant virus infection at the cellular level. PLoS Genet. 2014;10: e1004186. doi: 10.1371/journal.pgen.1004186. pmid:24586207