© 2020 Holland et al. This is an open access article distributed under the terms of the Creative Commons Attribution License: http://creativecommons.org/licenses/by/4.0/ (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Of particular interest are estimating the proportion of common SNPs from a reference panel (polygenicity) involved in any particular phenotype; their effective strength of association (discoverability, or causal effect size variance); the proportion of variation in susceptibility, or phenotypic variation, captured additively by all common causal SNPs (approximately, the narrow sense heritability); and the fraction of that captured by genome-wide significant SNPs, all of which are active areas of research [2–9]. Here, in a unified approach explicitly taking into account LD, we present a model relying on genome-wide association studies (GWAS) summary statistics (z-scores for SNP associations with a phenotype [16]) to estimate polygenicity ($\pi_1$, the proportion of causal variants in the underlying reference panel of approximately 11 million SNPs from a sample size of 503) and discoverability ($\sigma_\beta^2$, the causal effect size variance), as well as elevation of z-scores due to any residual inflation arising from variance distortion ($\sigma_0^2$, which for example can be induced by cryptic relatedness), which remains a concern in large-scale studies [10]. The problem then is finding the three model parameters that give a maximum likelihood best fit of the model's predicted distribution of z-scores to the actual distribution of z-scores. Because we are fitting three parameters typically using $\gtrsim 10^6$ data points, it is appropriate to incorporate some data reduction to facilitate the computations. Specifically, we posit a normal distribution for $\beta$ with variance given by a constant, $\sigma_\beta^2$:

$$\beta \sim \mathcal{N}(0, \sigma_\beta^2) \qquad (3)$$

This is also how the $\beta$ are distributed across the set of causal SNPs. [...] taking into account all SNPs (the remaining ones are all null by definition), this is equivalent to the two-component Gaussian mixture model we originally proposed [20]

$$\beta \sim \pi_1 \mathcal{N}(0, \sigma_\beta^2) + (1 - \pi_1)\,\mathcal{N}(0, 0) \qquad (4)$$

where $\mathcal{N}(0, 0)$ is the Dirac delta function, so that considering [...]
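To make the two-component mixture concrete, the following is a minimal Python sketch, not the authors' implementation. It draws per-SNP effect sizes so that a fraction π1 of SNPs are causal with normally distributed effects and the rest are exactly zero, then compares the empirical variance over all SNPs with π1 times the causal effect size variance, which is what a zero-mean mixture of this form implies. The function name `sample_effect_sizes`, the SNP count, and the parameter values are illustrative assumptions and are not taken from the article.

```python
import random


def sample_effect_sizes(n_snps, pi1, sigma_beta2, seed=0):
    """Draw one effect size per SNP from the two-component mixture.

    With probability pi1 a SNP is causal and its effect is drawn from a
    normal distribution with mean 0 and variance sigma_beta2; otherwise
    the SNP is null and its effect is exactly 0 (the point mass at zero).
    """
    rng = random.Random(seed)
    sigma_beta = sigma_beta2 ** 0.5
    betas = []
    for _ in range(n_snps):
        if rng.random() < pi1:
            betas.append(rng.gauss(0.0, sigma_beta))
        else:
            betas.append(0.0)
    return betas


def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


# Illustrative (assumed) parameter values, chosen only for the demo.
PI1 = 0.05
SIGMA_BETA2 = 1e-4
N_SNPS = 200_000

betas = sample_effect_sizes(N_SNPS, PI1, SIGMA_BETA2)

# Fraction of SNPs with a nonzero effect should be close to PI1.
frac_causal = sum(1 for b in betas if b != 0.0) / len(betas)

# Variance over all SNPs should be close to PI1 * SIGMA_BETA2.
empirical_var = variance(betas)
expected_var = PI1 * SIGMA_BETA2

print(f"fraction causal: {frac_causal:.4f} (target {PI1})")
print(f"variance over all SNPs: {empirical_var:.3e} (mixture predicts {expected_var:.3e})")
```

The sketch covers only the effect-size prior. It does not model LD, the conversion of effect sizes to z-scores, or the residual inflation parameter, all of which the model described above takes into account.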

Details

Title
Beyond SNP heritability: Polygenicity and discoverability of phenotypes estimated with a univariate Gaussian mixture model
Author
Holland, Dominic; Frei, Oleksandr; Rahul Desikan (deceased); Chun-Chieh Fan; Shadrin, Alexey A; Smeland, Olav B; Sundar, V S; Thompson, Paul; Andreassen, Ole A; Dale, Anders M
First page
e1008612
Section
Research Article
Publication year
2020
Publication date
May 2020
Publisher
Public Library of Science
ISSN
15537390
e-ISSN
15537404
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2479457848