Introduction
Impaired fasting glucose, also referred to as prediabetes, is a risk factor for cardiovascular disease and type 2 diabetes (T2D) [1, 2]. Investigating the genetic architecture of fasting glucose will lead to a better understanding of the mechanisms involved in glucose homeostasis and subsequently the pathophysiology of T2D [3]. Genetic analysis of fasting glucose as a quantitative trait complements genetic analysis of T2D as a dichotomous trait.
Genome-wide association studies (GWAS) have been widely used in investigating the genetic architecture of fasting glucose levels. Genetic associations with fasting glucose have been reported in 17 loci in individuals of European ancestry [3–5]. There are more than 240 published loci associated with T2D [6, 7]. Only nine T2D loci (GCKR, GCK, SLC30A8, PROX1, ADCY5, DGKB, GLIS3, TCF7L2, and MTNR1B) overlap with fasting glucose loci, which appear to mediate impairment of the glucose-sensing machinery in pancreatic β islet cells [3]. One trivial explanation is low power. Alternatively, loci affecting physiological levels of fasting glucose among normoglycemic individuals need not be the same as loci that affect pathophysiological levels of fasting glucose when hyperglycemic individuals are also considered. As the genetic architectures of fasting glucose and T2D are incompletely known, we caution against overinterpreting this interim result.
The Atherosclerosis Risk in Communities (ARIC) study is a prospective study of atherosclerosis in middle-aged adults [8]. Previously, a GWAS for the average of four fasting glucose measurements taken over nine years was conducted in individuals without prevalent diabetes, and three known loci near MTNR1B (rs10830963), GCK (rs2971669), and G6PC2 (rs853787) were replicated [4]. Here, we defined the outcome as the first fasting glucose measurement from all untreated individuals, i.e., non-diabetic individuals as well as untreated diabetic individuals. We then performed a GWAS using a linear mixed model with a high-density imputation reference panel and identified three associations in loci previously reported to influence fasting glucose (GCKR, G6PC2, and SLC30A8). Associations at two missense variants in GCKR (rs1260326) and SLC30A8 (rs13266634) were identified in individuals with European ancestry and all three associations replicated in trans-ethnic meta-analysis. These three associations also affect risk of T2D, indicating not just physiological relevance to fasting glucose levels but also pathophysiological relevance to T2D.
Materials and methods
The Atherosclerosis Risk in Communities study is a prospective study of clinical atherosclerotic diseases [8]. Individual-level genotype and phenotype data were obtained by authorized access to dbGaP (https://www.ncbi.nlm.nih.gov/gap/). T2D case status was defined as fasting glucose ≥7.0 mmol/L, self-report of a diagnosis by a physician, or current diabetic treatment. For fasting glucose analysis, individuals without T2D (8,902) and with untreated T2D (330) were used; individuals without diabetic treatment were included because their fasting glucose values were unaffected by treatment. The inclusion of untreated cases makes our analysis more powerful than previous analysis of normoglycemic individuals. Selected variables included age, sex, body mass index (BMI), fasting glucose, and T2D status. Among individuals with a reported race of White, a total of 9,232 individuals without T2D or with untreated T2D were included and used for analysis of fasting glucose. Similarly, a total of 9,731 individuals were used for analysis of T2D.
Fasting serum samples were assayed for glucose and were measured on the Roche Hitachi 911 analyzer using the hexokinase method (Roche Diagnostics). Age, sex, race, and ethnicity were self-reported. BMI was calculated as body weight (in kilograms) divided by height (in meters) squared. Medication history over a period of two weeks prior to the visit was verified by review of medication containers that participants brought to the visit.
Genotyping and imputation
Genotyping was performed on the Affymetrix Genome-wide Human SNP Array 6.0. After quality control for minor allele frequency (MAF) ≥0.01, genotype call rate ≥0.95, per-individual missingness rate ≤0.05, and a Hardy-Weinberg equilibrium test p-value >10−6, we retained 800,099 autosomal SNPs. Imputation was performed using the Sanger Imputation Service (https://imputation.sanger.ac.uk/) with the IMPUTE2 software [9] and the 1000 Genomes Project Phase 3 reference panel [10]. The resulting imputed SNPs were filtered for MAF ≥0.01 and info score ≥0.7 [11]. After filtering, 7,896,808 SNPs were retained for association analysis. Coordinates were based on the hg19 build. All alleles are reported with respect to the positive strand.
Association analysis
Fasting glucose levels from the first available measurement were included (S1 Fig). Association analyses were performed using a two-stage linear mixed model and an additive genetic model. In Stage 1, residuals were obtained from a regression of fasting glucose on age, sex, and BMI. The resulting residuals were ranked and inverse normalized. In Stage 2, SNP association was tested by regressing the values from Stage 1 on imputed dosages, adjusted for three significant principal components obtained from the R package SNPRelate (version 1.28.0) [12] as fixed effects and cryptic relatedness as a random effect using the emmax test in EPACTS (version 3.3.0) [13]. The genome-wide significance level α was declared to be 5×10−8. To test for secondary signals, the analysis in Stage 2 was repeated with the inclusion of genome-wide significant SNPs as covariates. R (version 4.0.3) was used in the analyses [14].
Replication analysis
The Multi-Ethnic Study of Atherosclerosis (MESA) [15] and the Framingham Heart Study (FHS) [16] are prospective studies designed to identify risk factors for subclinical atherosclerosis. Individual-level genotype and phenotype data were obtained by authorized access to dbGaP. The China America Diabetes Mellitus (CADM) study is a case-control study of T2D in China [17]. The Africa America Diabetes Mellitus (AADM) study is a case-control study of type 2 diabetes in Africans [18, 19]. The Howard Family University Study (HUFS) is a population-based cross-sectional study of African Americans in Washington, D.C. [20].
For fasting glucose/type 2 diabetes analysis, we aggregated data from 2,204/2,314 European Americans, 632/697 Chinese Americans, 1,080/1290 Hispanic Americans, and 1,206/1,407 African Americans from MESA; 2,211/4,378 West Africans from AADM; 1,548/1,754 African Americans from HUFS; and 2,430/2,605 African Americans from ARIC; 985/1,883 Chinese from CADM; and 2,007/2,061 from FHS, totaling 14,303/18,389 individuals, respectively. Genotype data comprised approximately one million SNPs using the Affymetrix Genome-wide Human SNP Array 6.0 (ARIC, MESA, and HUFS) or two million SNPs using the Affymetrix Axiom Genome-wide PanAFR Array (AADM). Affymetrix Axiom® Exome Genotyping arrays (~ 300,000 markers) were used in CADM. The Illumina HumanOmni5M array (~4.3M markers) was used in FHS. For in-house data sets (AADM, HUFS, and CADM) for which we collected individual-level data from study participants, we performed sex checks. For the data sets from dbGaP (ARIC, FHS, and MESA), we relied on documentation available within dbGaP. Quality control, genotype imputation, transformation of fasting glucose levels, covariates (including three significant principal components and cryptic relatedness), and association testing were the same as for the discovery analysis. We performed inverse variance-weighted fixed effects meta-analysis using METAL [21]. Coordinates were based on the hg19 build.
Variant annotation
We used SnpEff to annotate variant effects [22]. SnpEff integrates with other tools in sequencing data analysis pipelines and contains two steps, variant annotation and effect prediction. Variant annotation datasets were built using a reference genome (hg19). Two methods, SNAP [23] and I-Mutant3 [24], were used to assess discriminative power, a raw numerical score reflecting direction and reliability of the prediction, for each SNP. Discriminative power is the distance of the actual prediction to the decision boundary (score = 0), which reflects the reliability of the prediction and the severity of the predicted effects [25].
PolyPhen-2 is a tool that predicts the possible impact of an amino acid substitution on the structure and function of a human protein [26]. SIFT is a tool that predicts amino acid changes that affect protein function, distinguishing between functionally neutral and deleterious amino acid changes [27]. Combined Annotation Dependent Depletion (CADD) is a tool for scoring the deleteriousness of variants in the human genome [28, 29]. CADD integrates multiple annotations to generate scores that strongly correlate with allelic diversity, pathogenicity of both coding and non-coding variants, and experimentally measured regulatory effects and that highly rank causal variants. Polyphen-2, SIFT, and CADD scores were all retrieved from Ensembl 104 [30].
Fine mapping
Region fine mapping was performed using the R package CAVIARBF (version 0.2.1), an approximate Bayesian method that can incorporate functional annotation [31]. Minimal data requirements are marginal statistical test results and linkage disequilibrium between SNPs. SNPs with MAF ≥ 0.05 within the gene region ±150kb were selected. SNP annotations were coded for the absence (0) or presence (1) of promoter histone marks, enhancer histone marks, DNAse I hypersensitive sites, or bound proteins as provided by HaploReg v4.1 [32]. Bayes factors were calculated conditional on a maximum number of causal SNPs. The estimated Bayes factors and prior probabilities were then used to estimate the posterior inclusion probabilities.
Additive association evaluation
Linear regression and logistic regression were used to determine the joint additive effect across associated independent loci for fasting glucose levels and T2D status, respectively. Whereas rank-based transformations cannot be back transformed, we log-transformed fasting glucose levels in order to be able to obtain effect sizes in original units. We regressed traits on the number of effect alleles, with adjustment for age, BMI, sex, significant principal components (PCs) by study. The analysis was performed using SAS 9.4 (Cary, NC, USA). The R package meta (version 5.1) [33] was used for meta-analysis with an inverse variance-weighted fixed effects method.
Trait loci annotation
An expression QTL (eQTL) is a genomic locus that affects expression levels of mRNA. A splicing QTL (sQTL) is a genomic locus that affects the expression of RNA isoforms generated by alternative splicing events. We retrieved data on eQTL and sQTL annotations from the Genotype-Tissue Expression (GTEx) Portal (https://gtexportal.org).
Protein structure and function predictions
Based on the protein sequence-to-structure-to-function paradigm, we uploaded translated sequences to the I-TASSER online server (https://zhanggroup.org//I-TASSER/) [34–36]. I-TASSER uses template-based fragment assembly simulations of amino acid sequences to predict three-dimensional protein structures, which are then used to find matches in a protein function database to predict protein functions. The predicted protein structures were viewed and analyzed using PyMol [37].
Ethics statement
Ethical approval for the AADM study was obtained from the National Institutes of Health, the Howard University Institutional Review Board, and from ethics committees in Ghana (University of Ghana Medical School Research Ethics Committee and Kwame Nkrumah University of Science and Technology Committee on Human Research Publication and Ethics), Kenya (Moi Teaching & Referral Hospital/Moi University College of Health Sciences Institutional Research and Ethics Committee), and Nigeria (National Health Research Ethics Committee of Nigeria). Ethical approval for HUFS was obtained from the Howard University Institutional Review Board. Ethical approval for CADM was obtained from the institutional review boards of Howard University, National Institutes of Health, and Suizhou Central Hospital (Suizhou, China). Written informed consent was obtained from each participant. All clinical investigation was conducted according to the principles expressed on the Declaration of Helsinki.
Results
Phenotyping, genotyping, and imputation summaries for all discovery and replication studies are presented in S1 Table and S1 and S2 Figs. Within individuals of European ancestry, males had higher BMI than females. In contrast, BMI was higher in females than males among Africans, African Americans, and Hispanic Americans. Males had higher fasting glucose levels than females in all groups.
The discovery and replication analyses included totals of 9,232 and 14,303 individuals, respectively. Three loci reached genome-wide significance (Fig 1 and S2 Table). The genomic control variance inflation factor indicated no inflation due to population stratification (l = 1.01; S3 Fig). Two of the lead SNPs were missense mutations and the third lead SNP was intronic (Table 1). Regional association and Bayesian fine mapping indicated that rs1260326 (GCKR), rs560887 (G6PC2), and rs13266634 (SLC30A8) had the highest marginal posterior inclusion probabilities (PIP) in their respective loci (S4 Fig). Conditional on the lead SNPs rs1260326 (GCKR), rs560887 (G6PC2), and rs13266634 (SLC30A8), no signal remained (S5 Fig). The effect alleles at all three lead SNPs were associated with lower fasting glucose (Table 1). The associations at all three lead SNPs were replicated in overall meta-analysis (S3 Table). Of the two suggestive loci (Fig 1, 5×10−7 ≤ p < 5×10−8), only the association at the locus on chromosome 11 was replicated (S4 Table).
[Figure omitted. See PDF.]
The x-axis represents chromosomal positions, and the y-axis represents -log10(p-value). The two dotted lines represent -log10 (5×10−8) and -log10 (5×10−7), respectively.
[Figure omitted. See PDF.]
The variant rs1260326 (GCKR) is a missense mutation, resulting in a substitution from leucine to proline at position 446. The coding effect of rs1260326 was estimated by SnpEff as moderately important (Table 1) and annotated as tolerated by SIFT and benign by Polyphen-2 [30]. Position 446 in GCKR is located at the interface with GCK (Fig 2). L446 is closer to the middle of the interface whereas L446P is closer to GCK (S6 Fig). The variant rs560887 (G6PC2) is intronic and estimated to have low impact (Table 1). The variant rs13266634 (SLC30A8) is a missense mutation, resulting in a substitution from arginine to tryptophan at position 325, annotated as a moderate change by SnpEff (Table 1) and as tolerated by SIFT and benign by PolyPhen-2. I-TASSER predicted four possible protein structures based on an amino sequence with R325W. The four predicted protein structures were similar to each other, but all were different from wild type (Fig 3) and consistent with a moderate change in protein structure.
[Figure omitted. See PDF.]
The position of interaction in GCKR is L446 (rs1260326, red). Green dotted line presents the proximity of the interface between GCK and GCKR.
[Figure omitted. See PDF.]
The SLC30A8 protein structures for Wild Type (Wt, top) and R325W (rs1326634, bottom). Each amino acid sequence yielded four predicted protein structures called models 1 to 4 for Wt and mutant, respectively. Wt-Model 1 (top left) is the 1st 3D structure predicted by comparative molecular modeling through I-TASSER. Wt-Model 1–4 shows the overlap of the four predicted 3D structures for SLC30A8 wild type (top right). The mutant structures (bottom) are labeled correspondingly.
To determine a best fit model jointly across loci, the three loci and all possible interactions were specified in a full model. Regression with backward selection (SLSTAY = 0.10) was used to eliminate variables (S5 Table). The final model included the three lead SNPs without any interactions. After excluding possible interactions, we found that the effect alleles influence fasting glucose in an additive manner (Fig 4). For each copy of a T allele at any of the three SNPs, an additive effect of -0.012 mmol/L on fasting glucose was identified in the discovery sample (p = 3.0×10−28) and replicated in trans-ethnic meta-analysis (n = 14,303, β = -0.0088, SE = 0.0011, p = 2.15×10−16, S7 Fig). We also estimated the joint additive effect of the three SNPs on the risk of T2D in a total of 28,120 individuals with (n = 4,585) or without (n = 23,535) T2D. The three SNPs were associated with significantly reduced T2D risk, with an odds ratio of 0.93 (95% confidence interval [0.88, 0.98], p = 0.0062, S8 Fig). Notably, none of the individuals with 6 T alleles had T2D, compared to 27% of those with 0 T alleles (Fig 5).
[Figure omitted. See PDF.]
The reference group is homozygous for the reference allele at all three SNPs. At each SNP, the T allele is the allele associated with lower fasting glucose.
[Figure omitted. See PDF.]
The label above each bar provides the number of individuals (% prevalence of T2D). At each SNP, the T allele is the allele associated with protection against T2D.
Discussion
Based on a genome-wide analysis of fasting glucose, we identified three loci (GCKR, G6PC2, and SLC30A8) that are involved in glucose regulation as previously reported [4, 38]. Here, we showed that the joint effect of these loci was associated with lower fasting glucose levels as well as lower risk of T2D. The missense SNP rs1260326 in GCKR is significantly associated with fasting glucose in non-diabetic and untreated diabetic individuals with European ancestry. This association was replicated in trans-ethnic meta-analysis of European Americans, Chinese, Chinese Americans, Hispanic Americans, African Americans, and Africans. The SNP rs1260326 has been associated with fatty liver, triglycerides, and very low-density lipoprotein cholesterol in obese children and adolescents [39]. The position in GCKR changed by rs1260326 interacts with GCK; the mutation leads to reduced capability to response to fructose-6-phosphate, increased GCK activity in the liver, and reduced glucose levels [40–42].
G6PC2 (rs560887) has been reported to be associated with fasting glucose [40, 43] and with the 30 min. incremental insulin response in the oral glucose tolerance test [44]. The encoded protein allows the release of glucose into the bloodstream. rs560887 is an expression QTL (eQTL) for G6PC2 in several tissues but most strongly in subcutaneous adipose tissue, with the alternate allele associated with lower gene expression (S6 Table). It is also a splicing QTL (sQTL) for NOSTRIN in several tissues (S7 Table). NOSTRIN binds the enzyme responsible for production of nitric oxide, which is involved in neurotransmission, inflammatory responses, and vascular homeostasis [45]. An effect on NOSTRIN could explain the association of rs560887 with pulse pressure and other phenotypes [46]. The SNP rs560887 is in strong LD with rs573225 (r2 = 0.90) in EUR but weaker LD in AFR (r2 = 0.60) [32]. rs573225 is 207 bp upstream of G6PC2. Like rs560887, rs573225 was associated with lower fasting glucose (β = -0.010, SE = 0.002, p = 6.57×10−8) in our discovery study and was replicated (β = -0.005, SE = 0.0018, p = 0.0031). Also like rs560887, rs573225 is an eQTL for G6PC2 (S6 Table) and an sQTL for NOSTRIN (S7 Table). However, rs573225 has a phred-scaled CADD score of 15.97, compared to 0.210 for rs560887, indicating that rs573225 is more strongly deleterious than rs560887 [30]. rs573225 maps to the highly conserved 2nd position of a predicted regulatory motif for HNF4, with the alternate allele associated with weaker binding of HNF4 [32] and lower expression of G6PC2 (S6 Table). Thus, annotations not included in the fine mapping analysis (specifically, CADD scores and predicted regulatory motifs) provide evidence that rs573225 might be a better candidate causal variant and that rs560887 might simply be tagging rs573225.
We found that the missense variant rs13266634 in SLC30A8 was associated with fasting glucose levels and was previously reported to be associated with T2D risk as well as glucose and proinsulin levels [3, 47]. The T allele at rs13266634 is associated with enhanced insulin secretion from pancreatic β cells and inhibited hepatic insulin clearance, leading to increased peripheral insulin levels and decreased peripheral glucose levels [48]. SLC30A8 is a transmembrane transporter, with the ligand zinc binding to a histidine-rich region from positions 197 to 205 [49]. The position in SLC30A8 changed by rs13255534, position 325, is located on the surface of the protein and maps to the cytoplasmic tail at a point where the protein bends back on itself [49]. Therefore, rs13266634 might not affect binding affinity but might affect either protein stability or interaction with other cytoplasmic components of the transport process.
Functional studies that follow up on findings of genetic associations are critical. One way to assess function is based on analysis of predicted amino acid sequences [50]. Two of the three genetic variants identified in our study were missense. Wild type and mutant amino acid sequences were uploaded onto the I-TASSER server and predicted protein structures were imported into PyMOL for predicted protein function. Moderate protein structure differences were predicted at both rs1260326 (GCKR) and rs13266634 (SLC30A8), leading to predicted changes in protein function. The protein structures modeled by I-TASSER suggest that both rs1260326 and rs13266634 have the potential to change the corresponding protein structures and functions, which might result in altered glucose levels. The predicted structure of GCKR revealed that position 446 is located at the proximity of the interface between GCK and GCKR; therefore, L446P could affect the relative positioning of GCK and GCRK at the interface. This alteration could potentially impact the interaction efficiency of the two proteins, which can be assessed in vitro through either immunoprecipitation or fluorescence resonance energy transfer. Mutations in mice can be created using CRISPR editing technology so that the functional impacts of both GCKR-L446P and SLC30A8-R325W mutations could be tested in vivo. Structural information can also facilitate the rational design and development of targeted drugs and antibodies.
An intergenic locus on chromosome 11 33.4 kb upstream of MTNR1B reached suggestive levels of significance in the discovery study and was replicated. There are two variants with r2≥0.8 in Europeans for the lead SNP rs6483204: rs3847554 and rs6483205 [32]. The variant rs3847554 has been previously reported as associated with fasting plasma glucose [51], but the association at rs3847554 did not replicate in our study due to heterogeneous effect sizes. The variants rs6843204 and rs3847554 are eQTLs for SLC36A4 in esophagus mucosa. SLC36A4 is a non-proton-coupled amino acid transporter. There is no evidence based on histone marks, proteins bound, or binding motifs that rs6483204 could be causal [32]. For rs3847554, the only evidence is a change in a binding motif for CDCL5 [32].
Conclusions
We analyzed GWAS data with 23,535 individuals, either nondiabetics or untreated diabetics, and identified and replicated three independent SNPs in GCKR, G6PC2, and SLC30A8 associated with fasting glucose levels. Each copy of the alternate allele at any of these three SNPs was associated with a reduction of 0.012 mmol/L in fasting glucose. The alternate allele at rs1260326 (GCKR) is associated with increased glycolysis, the alternate allele at rs560887 (G6PC2) is associated with decreased gluconeogenesis, and the alternate allele at rs13266634 (SLC30A8) is associated with increased insulin secretion. Each copy of the alternate allele at any of the three SNPs was associated with a 7% reduced risk of T2D, indicating that the associations are not just physiologically relevant but also pathophysiologically relevant.
Supporting information
S1 Fig. Density plots for fasting glucose (mmol/L) in the discovery study.
Untransformed (left) and log-transformed (right).
https://doi.org/10.1371/journal.pone.0269378.s001
S2 Fig. Density plots for log transformed fasting glucose (mmol/L) in the replication studies.
WHT, CHI, HIS, and AA refer to European Americans, Chinese, Hispanic Americans, and African Americans, respectively.
https://doi.org/10.1371/journal.pone.0269378.s002
S3 Fig. Quantile-quantile plot of p-values for fasting glucose levels in individuals with European ancestry in ARIC.
The x-axis represents expected p-values, and the y-axis represents observed p-values. All p-values are transformed as–log10(p-value).
https://doi.org/10.1371/journal.pone.0269378.s003
S4 Fig. Three panels represent GCKR, G6PC2, and SLC30A8, respectively.
(Top) Region association plot: The x-axis represents position in Mb. The y-axis represents -log10 p-values. Sky-blue lines represent recombination rates (cM/Mb) from the 1000 Genomes Project. (Bottom) Posterior inclusion probabilities (PIP) based on fine mapping. The x-axis represents position in Mb. The y-axis represents PIP values.
https://doi.org/10.1371/journal.pone.0269378.s004
S5 Fig. Genome-wide conditional analysis of fasting glucose in individuals with European ancestry.
Row 1: Conditioning on rs1260326 (GCKR) abolished the peak at GCKR. Row2: Conditioning on rs1260326 (GCKR) and rs13266634 (SLC30A8) abolished the peaks at GCKR and SLC30A8. Row 3: Conditioning on rs1260326 (GCKR), rs560887 (G6PC2), and rs13266634 (SLC30A8) eliminated all genome-wide significant signals.
https://doi.org/10.1371/journal.pone.0269378.s005
S6 Fig. Model structure of GCK with GCKR wild type (green) and L446P mutant (red) complexes.
https://doi.org/10.1371/journal.pone.0269378.s006
S7 Fig. Forest plot from meta-analysis of fasting glucose levels in nine replication studies (n = 14,303).
https://doi.org/10.1371/journal.pone.0269378.s007
S8 Fig. Forest plot from meta-analysis of risk of T2D in discovery and replication studies (n = 28,120).
https://doi.org/10.1371/journal.pone.0269378.s008
S1 Table. Study characteristics for discovery and replication studies.
https://doi.org/10.1371/journal.pone.0269378.s009
(XLSX)
S2 Table. Genome-wide significant association results in the discovery analysis.
https://doi.org/10.1371/journal.pone.0269378.s010
(XLSX)
S3 Table. Trans-ethnic replication analysis results.
https://doi.org/10.1371/journal.pone.0269378.s011
(XLSX)
S4 Table. Genome-wide suggestive association results.
https://doi.org/10.1371/journal.pone.0269378.s012
(XLSX)
S5 Table. Parameter estimation in backward regression.
https://doi.org/10.1371/journal.pone.0269378.s013
(XLSX)
S6 Table. GTEx expression QTL data.
https://doi.org/10.1371/journal.pone.0269378.s014
(XLSX)
S7 Table. GTEx splicing QTL data.
https://doi.org/10.1371/journal.pone.0269378.s015
(XLSX)
Acknowledgments
The Atherosclerosis Risk in Communities study has been funded in whole or in part with federal funds from the National Heart, Lung, and Blood Institute, National Institute of Health, Department of Health and Human Services, under contract numbers HHSN268201700001I, HHSN268201700002I, HHSN268201700003I, HHSN268201700004I, and HHSN268201700005I. The authors thank the staff and participants of the ARIC study for their important contributions. Funding for ARIC Gene Environment Association Studies (GENEVA) was provided by National Human Genome Research Institute grant U01HG004402 (E. Boerwinkle). The datasets used for the analyses in this manuscript were obtained from dbGaP through dbGaP accession study number phs000280.v5.p1. MESA and the MESA SHARe project are conducted and supported by the National Heart, Lung, and Blood Institute (NHLBI) in collaboration with MESA investigators. Support for MESA is provided by contracts N01-HC95159, N01-HC-95160, N01-HC-95161, N01-HC-95162, N01-HC-95163, N01-HC-95164, N01-HC-95165, N01-HC95166, N01-HC-95167, N01-HC-95168, N01-HC-95169, UL1-RR-025005, and UL1-TR-000040. Funding for SHARe genotyping was provided by NHLBI Contract N02-HL-64278. Genotyping was performed at Affymetrix (Santa Clara, California, USA) and the Broad Institute of Harvard and MIT (Boston, Massachusetts, USA) using the Affymetrix Genome-Wide Human SNP Array 6.0. This manuscript was not prepared in collaboration with MESA investigators and does not necessarily reflect the opinions or views of MESA, or the NHLBI. The datasets used for the analyses in this manuscript were obtained from dbGaP through dbGaP accession study number phs000209.v13.p3. The Framingham Heart Study is conducted and supported by the National Heart, Lung, and Blood Institute (NHLBI) in collaboration with Boston University (Contract No. N01-HC-25195, HHSN268201500001I and 75N92019D00031). This manuscript was not prepared in collaboration with investigators of the Framingham Heart Study and does not necessarily reflect the opinions or views of the Framingham Heart Study, Boston University, or NHLBI. The datasets used for the analyses in this manuscript were obtained from dbGaP through dbGaP accession study number phs000007.v32.p13. The Genotype-Tissue Expression (GTEx) Project was supported by the Common Fund of the Office of the Director of the National Institutes of Health, and by NCI, NHGRI, NHLBI, NIDA, NIMH, and NINDS. The data used for the analyses described in this manuscript were obtained from the GTEx Portal (V8) on 04/10/21. The Howard University Family Study (HUFS), the China America Diabetes Mellitus (CADM) study, and the Africa America Diabetes Mellitus (AADM) study are in-house studies and supported by the National Human Genome Research Institute, National Institutes of Health. This work utilized the computational resources of the NIH HPC Biowulf cluster (https://hpc.nih.gov). The contents of this publication are solely the responsibility of the authors and do not necessarily represent the official view of the National Institutes of Health. This research was supported by the Intramural Research Program of the Center for Research on Genomics and Global Health (CRGGH). The CRGGH is supported by the National Human Genome Research Institute, the National Institute of Diabetes and Digestive and Kidney Diseases, the Center for Information Technology, and the Office of the Director at the National Institutes of Health (1ZIAHG200362).
Citation: Chen G, Shriner D, Zhang J, Zhou J, Adikaram P, Doumatey AP, et al. (2022) Additive genetic effect of GCKR, G6PC2, and SLC30A8 variants on fasting glucose levels and risk of type 2 diabetes. PLoS ONE 17(6): e0269378. https://doi.org/10.1371/journal.pone.0269378
About the Authors:
Guanjie Chen
Contributed equally to this work with: Guanjie Chen, Daniel Shriner
Roles: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Resources, Software, Validation, Visualization, Writing – original draft, Writing – review & editing
Affiliation: Center for Research on Genomics and Global Health, National Human Genome Research Institute, Bethesda, Maryland, United States of America
https://orcid.org/0000-0002-5745-5618
Daniel Shriner
Contributed equally to this work with: Guanjie Chen, Daniel Shriner
Roles: Methodology, Resources, Software, Writing – review & editing
Affiliation: Center for Research on Genomics and Global Health, National Human Genome Research Institute, Bethesda, Maryland, United States of America
https://orcid.org/0000-0003-1928-5520
Jianhua Zhang
Roles: Software, Visualization, Writing – original draft, Writing – review & editing
Affiliation: Metabolic Diseases Branch, National Institute of Diabetes and Digestive and Kidney Diseases, Bethesda, Maryland, United States of America
Jie Zhou
Roles: Methodology, Resources, Software, Validation, Visualization
Affiliation: Center for Research on Genomics and Global Health, National Human Genome Research Institute, Bethesda, Maryland, United States of America
Poorni Adikaram
Roles: Resources, Software
Affiliation: Advanced BioScience Laboratories, Rockville, Maryland, United States of America
Ayo P. Doumatey
Roles: Investigation, Validation, Visualization
Affiliation: Center for Research on Genomics and Global Health, National Human Genome Research Institute, Bethesda, Maryland, United States of America
Amy R. Bentley
Roles: Resources, Writing – original draft, Writing – review & editing
Affiliation: Center for Research on Genomics and Global Health, National Human Genome Research Institute, Bethesda, Maryland, United States of America
Adebowale Adeyemo
Roles: Software, Supervision, Validation, Visualization, Writing – review & editing
Affiliation: Center for Research on Genomics and Global Health, National Human Genome Research Institute, Bethesda, Maryland, United States of America
Charles N. Rotimi
Roles: Investigation, Supervision, Writing – original draft, Writing – review & editing
E-mail: [email protected]
Affiliation: Center for Research on Genomics and Global Health, National Human Genome Research Institute, Bethesda, Maryland, United States of America
1. DeFronzo RA, Abdul-Ghani M. Assessment and treatment of cardiovascular risk in prediabetes: impaired glucose tolerance and impaired fasting glucose. Am J Cardiol. 2011;108:3B–24B. pmid:21802577
2. Göke B. Implications of blood glucose, insulin resistance and beta-cell function in impaired glucose tolerance. Diabetes Res Clin Pract. 1998;40:S15–S20. pmid:9740497
3. Dupuis J, Langenberg C, Prokopenko I, Saxena R, Soranzo N, Jackson AU, et al. New genetic loci implicated in fasting glucose homeostasis and their impact on type 2 diabetes risk. Nat Genet. 2010;42:105–116. pmid:20081858
4. Rasmussen-Torvik L, Alonso A, Li M, Kao W, Kottgen A, Yan Y, et al. Genome-Wide Association Study of Repeated Fasting Glucose Measures; the ARIC Study. Diabetes. 2009;58:A307.
5. Rasmussen-Torvik LJ, Guo X, Bowden DW, Bertoni AG, Sale MM, Yao J, et al. Fasting glucose GWAS candidate region analysis across ethnic groups in the Multiethnic Study of Atherosclerosis (MESA). Genet Epidemiol. 2012;36:384–391. pmid:22508271
6. Scott RA, Scott LJ, Maegi R, Marullo L, Gaulton KJ, Kaakinen M, et al. An Expanded Genome-Wide Association Study of Type 2 Diabetes in Europeans. Diabetes. 2017;66:2888–2902. pmid:28566273
7. Mahajan A, Taliun D, Thurner M, Robertson NR, Torres JM, Rayner NW, et al. Fine-mapping type 2 diabetes loci to single-variant resolution using high-density imputation and islet-specific epigenome maps. Nat Genet. 2018;50:1505–1513. pmid:30297969
8. The ARIC Investigators. The Atherosclerosis Risk in Communities (ARIC) Study: design and objectives. Am J Epidemiol. 1989;129:687–702. pmid:2646917
9. Howie BN, Donnelly P, Marchini J. A flexible and accurate genotype imputation method for the next generation of genome-wide association studies. PLOS Genet. 2009;5:e1000529. pmid:19543373
10. The 1000 Genomes Project Consortium. A global reference for human genetic variation. Nature. 2015;526:68–74. pmid:26432245
11. Verma SS, de Andrade M, Tromp G, Kuivaniemi H, Pugh E, Namjou-Khales B, et al. Imputation and quality control steps for combining multiple genome-wide datasets. Front Genet. 2014;5:370. pmid:25566314
12. Zheng X, Levine D, Shen J, Gogarten SM, Laurie C, Weir BS. A high-performance computing toolset for relatedness and principal component analysis of SNP data. Bioinformatics. 2012;28:3326–3328. pmid:23060615
13. Kang HM, Sul JH, Service SK, Zaitlen NA, Kong SY, Freimer NB, et al. Variance component model to account for sample structure in genome-wide association studies. Nat Genet. 2010;42:348–354. pmid:20208533
14. R Core Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2020.
15. Bild DE, Bluemke DA, Burke GL, Detrano R, Roux AVD, Folsom AR, et al. Multi-Ethnic Study of Atherosclerosis: Objectives and Design. Am J Epidemiol. 2002;156:871–881. pmid:12397006
16. Andersson C, Johnson AD, Benjamin EJ, Levy D, Vasan RS. 70-year legacy of the Framingham Heart Study. Nat Rev Cardiol. 2019;16:687–698. pmid:31065045
17. Chen G, Zhang Z, Adebamowo SN, Liu G, Adeyemo A, Zhou Y, et al. Common and rare exonic MUC5B variants associated with type 2 diabetes in Han Chinese. PLOS ONE. 2017;12:e0173784. pmid:28346466
18. Rotimi CN, Dunston GM, Berg K, Akinsete O, Amoah A, Owusu S, et al. In search of susceptibility genes for type 2 diabetes in West Africa: The design and results of the first phase of the AADM study. Ann Epidemiol. 2001;11:51–58. pmid:11164120
19. Adeyemo AA, Zaghloul NA, Chen GJ, Doumatey AP, Leitch CC, Hostelley TL, et al. ZRANB3 is an African-specific type 2 diabetes locus associated with beta-cell mass and insulin response. Nat Commun. 2019;10:3195. pmid:31324766
20. Adeyemo A, Gerry N, Chen GJ, Herbert A, Doumatey A, Huang HX, et al. A Genome-Wide Association Study of Hypertension and Blood Pressure in African Americans. PLOS Genet. 2009;5:e1000564. pmid:19609347
21. Willer CJ, Li Y, Abecasis GR. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics. 2010;26:2190–2191. pmid:20616382
22. Cingolani P, Platts A, Wang le L, Coon M, Nguyen T, Wang L, et al. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly (Austin). 2012;6:80–92. pmid:22728672
23. Bromberg Y, Rost B. SNAP: predict effect of non-synonymous polymorphisms on function. Nucleic Acids Res. 2007;35:3823–3835. pmid:17526529
24. Capriotti E, Fariselli P, Rossi I, Casadio R. A three-state prediction of single point mutations on protein stability changes. BMC Bioinformatics. 2008;9:S6. pmid:18387208
25. Schaefer C, Rost B. Predict impact of single amino acid change upon protein structure. BMC Genomics. 2012;13:S4. pmid:22759652
26. Adzhubei IA, Schmidt S, Peshkin L, Ramensky VE, Gerasimova A, Bork P, et al. A method and server for predicting damaging missense mutations. Nat Methods. 2010;7:248–249. pmid:20354512
27. Ng PC, Henikoff S. SIFT: Predicting amino acid changes that affect protein function. Nucleic Acids Res. 2003;31:3812–3814. pmid:12824425
28. Kircher M, Witten DM, Jain P, O’Roak BJ, Cooper GM, Shendure J. A general framework for estimating the relative pathogenicity of human genetic variants. Nat Genet. 2014;46:310–315. pmid:24487276
29. Rentzsch P, Witten D, Cooper GM, Shendure J, Kircher M. CADD: predicting the deleteriousness of variants throughout the human genome. Nucleic Acids Res. 2019;47:D886–D894. pmid:30371827
30. Howe KL, Achuthan P, Allen J, Allen J, Alvarez-Jarreta J, Amode MR, et al. Ensembl 2021. Nucleic Acids Res. 2021;49:D884–D891. pmid:33137190
31. Chen W, Larrabee BR, Ovsyannikova IG, Kennedy RB, Haralambieva IH, Poland GA, et al. Fine Mapping Causal Variants with an Approximate Bayesian Method Using Marginal Test Statistics. Genetics. 2015;200:719–736. pmid:25948564
32. Ward LD, Kellis M. HaploReg: a resource for exploring chromatin states, conservation, and regulatory motif alterations within sets of genetically linked variants. Nucleic Acids Res. 2012;40:D930–D934. pmid:22064851
33. Balduzzi S, Rücker G, Schwarzer G. How to perform a meta-analysis with R: a practical tutorial. Evid-Based Ment Heal. 2019;22:153–160. pmid:31563865
34. Roy A, Kucukural A, Zhang Y. I-TASSER: a unified platform for automated protein structure and function prediction. Nat Protoc. 2010;5:725–738. pmid:20360767
35. Yang J, Zhang Y. Protein Structure and Function Prediction Using I-TASSER. Curr Protoc Bioinformatics. 2015;52:5.8.1–5.8.15. pmid:26678386
36. Yang J, Yan R, Roy A, Xu D, Poisson J, Zhang Y. The I-TASSER Suite: protein structure and function prediction. Nat Methods. 2015;12:7–8. pmid:25549265
37. Alexander N, Woetzel N, Meiler J. bcl:: Cluster: A method for clustering biological molecules coupled with visualization in the Pymol Molecular Graphics System. IEEE Int Conf Comput Adv Bio Med Sci. 2011;2011:13–18. pmid:27818847
38. Rasmussen-Torvik LJ, Alonso A, Li M, Kao W, Köttgen A, Yan Y, et al. Impact of Repeated Measures and Sample Selection on Genome-Wide Association Studies of Fasting Glucose. Genet Epidemiol. 2010;34:665–673. pmid:20839289
39. Santoro N, Zhang CK, Zhao HY, Pakstis AJ, Kim G, Kursawe R, et al. Variant in the glucokinase regulatory protein (GCKR) gene is associated with fatty liver in obese children and adolescents. Hepatology. 2012;55:781–789. pmid:22105854
40. Bouatia-Naji N, Rocheleau G, Van Lommel L, Lemaire K, Schuit F, Cavalcanti-Proenca C, et al. A polymorphism within the G6PC2 gene is associated with fasting plasma glucose levels. Science. 2008;320:1085–1088. pmid:18451265
41. Beer NL, Tribble ND, McCulloch LJ, Roos C, Johnson PRV, Orho-Melander M, et al. The P446L variant in GCKR associated with fasting plasma glucose and triglyceride levels exerts its effect through increased glucokinase activity in liver. Hum Mol Genet. 2009;18:4081–4088. pmid:19643913
42. Rees MG, Wincovitch S, Schultz J, Waterstradt R, Beer NL, Baltrusch S, et al. Cellular characterisation of the GCKR P446L variant associated with type 2 diabetes risk. Diabetologia. 2012;55:114–122. pmid:22038520
43. Chen WM, Erdos MR, Jackson AU, Saxena R, Sanna S, Silver KD, et al. Variations in the G6PC2/ABCB11 genomic region are associated with fasting glucose levels. J Clin Invest. 2008;118:2620–2628. pmid:18521185
44. Li X, Shu YH, Xiang AH, Trigo E, Kuusisto J, Hartiala J, et al. Additive Effects of Genetic Variation in GCK and G6PC2 on Insulin Secretion and Fasting Glucose. Diabetes. 2009;58:2946–2953. pmid:19741163
45. O’Leary NA, Wright MW, Brister JR, Ciufo S, McVeigh DHR, Rajput B, et al. Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation. Nucleic Acids Res. 2016;44:D733–D745. pmid:26553804
46. Buniello A, MacArthur JAL, Cerezo M, Harris LW, Hayhurst J, Malangone C, et al. The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Res. 2019;47:D1005–D1012. pmid:30445434
47. Strawbridge RJ, Dupuis J, Prokopenko I, Barker A, Ahlqvist E, Rybin D, et al. Genome-Wide Association Identifies Nine Common Variants Associated With Fasting Proinsulin Levels and Provides New Insights Into the Pathophysiology of Type 2 Diabetes. Diabetes. 2011;60:2624–2634. pmid:21873549
48. Fukunaka A, Fujitani Y. Role of Zinc Homeostasis in the Pathogenesis of Diabetes and Obesity. Int J Mol Sci. 2018;19:476. pmid:29415457
49. Consortium UniProt. UniProt: the universal protein knowledgebase in 2021. Nucleic Acids Res. 2021;49:D480–D489. pmid:33237286
50. Gallagher MD, Chen-Plotkin AS. The Post-GWAS Era: From Association to Function. Am J Hum Genet. 2018;102:717–730. pmid:29727686
51. Hwang JY, Sim X, Wu Y, Liang J, Tabara Y, Hu C, et al. Genome-wide association meta-analysis identifies novel variants associated with fasting plasma glucose in East Asians. Diabetes. 2015;64:291–298. pmid:25187374
You have requested "on-the-fly" machine translation of selected content from our databases. This functionality is provided solely for your convenience and is in no way intended to replace human translation. Show full disclaimer
Neither ProQuest nor its licensors make any representations or warranties with respect to the translations. The translations are automatically generated "AS IS" and "AS AVAILABLE" and are not retained in our systems. PROQUEST AND ITS LICENSORS SPECIFICALLY DISCLAIM ANY AND ALL EXPRESS OR IMPLIED WARRANTIES, INCLUDING WITHOUT LIMITATION, ANY WARRANTIES FOR AVAILABILITY, ACCURACY, TIMELINESS, COMPLETENESS, NON-INFRINGMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Your use of the translations is subject to all use restrictions contained in your Electronic Products License Agreement and by using the translation functionality you agree to forgo any and all claims against ProQuest or its licensors for your use of the translation functionality and any output derived there from. Hide full disclaimer
This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication: https://creativecommons.org/publicdomain/zero/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
Impaired glucose tolerance is a major risk factor for type 2 diabetes (T2D) and several cardiometabolic disorders. To identify genetic loci underlying fasting glucose levels, we conducted an analysis of 9,232 individuals of European ancestry who at enrollment were either nondiabetic or had untreated type 2 diabetes. Multivariable linear mixed models were used to test for associations between fasting glucose and 7.9 million SNPs, with adjustment for age, body mass index (BMI), sex, significant principal components of the genotypes, and cryptic relatedness. Three previously discovered loci were genome-wide significant, with the lead SNPs being rs1260326, a missense variant in GCKR (p = 1.06×10−8); rs560887, an intronic variant in G6PC2 (p = 3.39×10−11); and rs13266634, a missense variant in SLC30A8 (p = 4.28×10−10). Fine mapping, genome-wide conditional analysis, and functional annotation indicated that the three loci were independently associated with fasting glucose. Each copy of an alternate allele at any of these three SNPs was associated with a reduction of 0.012 mmol/L in fasting glucose levels (p = 8.0×10−28), and this association was replicated in trans-ethnic analysis of 14,303 individuals (p = 2.2×10−16). The three SNPs were jointly associated with significantly reduced T2D risk, with an odds ratio (95% CI) of 0.93 (0.88, 0.98) per protective allele. Our findings implicate additive effects across pathophysiological pathways involved in type 2 diabetes, including glycolysis, gluconeogenesis, and insulin secretion. Since none of the individuals homozygous for the alternate alleles at all three loci has T2D, it might be possible to use a genetic predictor of fasting glucose levels to identify individuals at low vs. high risk of developing type 2 diabetes.
You have requested "on-the-fly" machine translation of selected content from our databases. This functionality is provided solely for your convenience and is in no way intended to replace human translation. Show full disclaimer
Neither ProQuest nor its licensors make any representations or warranties with respect to the translations. The translations are automatically generated "AS IS" and "AS AVAILABLE" and are not retained in our systems. PROQUEST AND ITS LICENSORS SPECIFICALLY DISCLAIM ANY AND ALL EXPRESS OR IMPLIED WARRANTIES, INCLUDING WITHOUT LIMITATION, ANY WARRANTIES FOR AVAILABILITY, ACCURACY, TIMELINESS, COMPLETENESS, NON-INFRINGMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Your use of the translations is subject to all use restrictions contained in your Electronic Products License Agreement and by using the translation functionality you agree to forgo any and all claims against ProQuest or its licensors for your use of the translation functionality and any output derived there from. Hide full disclaimer