About the Authors:
Marina Serper
Roles Conceptualization, Data curation, Methodology, Writing – original draft, Writing – review & editing
¶‡ MS, MV and DEK share first authorship on this work. DS, BFV and KMC are joint senior authors on this work.
Affiliations Corporal Michael J. Crescenz VA Medical Center, Philadelphia, Pennsylvania, United States of America, Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America, Leonard Davis Institute of Health Economics, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
ORCID: http://orcid.org/0000-0003-4899-2160
Marijana Vujkovic
Roles Conceptualization, Data curation, Formal analysis, Writing – original draft, Writing – review & editing
Affiliation: Corporal Michael J. Crescenz VA Medical Center, Philadelphia, Pennsylvania, United States of America
ORCID: http://orcid.org/0000-0003-4924-5714
David E. Kaplan
Roles Conceptualization, Methodology, Writing – review & editing
Affiliations Corporal Michael J. Crescenz VA Medical Center, Philadelphia, Pennsylvania, United States of America, Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
ORCID: http://orcid.org/0000-0002-3839-336X
Rotonya M. Carr
Roles Conceptualization, Methodology, Writing – original draft, Writing – review & editing
Affiliations Corporal Michael J. Crescenz VA Medical Center, Philadelphia, Pennsylvania, United States of America, Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
Kyung Min Lee
Roles Data curation
Affiliations Center for Healthcare Organization and Implementation Research, Edith Nourse Rogers Memorial Veterans Hospital, Bedford, Massachusetts, United States of America, Department of Health Law, Policy and Management, Boston University School of Public Health, Boston, Massachusetts, United States of America, VA Informatics and Computing Infrastructure, VA Salt Lake City Health Care System, Salt Lake City, Utah, United States of America
Qing Shao
Roles Data curation
Affiliation: Center for Healthcare Organization and Implementation Research, Edith Nourse Rogers Memorial Veterans Hospital, Bedford, Massachusetts, United States of America
Donald R. Miller
Roles Writing – review & editing
Affiliation: Center for Healthcare Organization and Implementation Research, Edith Nourse Rogers Memorial Veterans Hospital, Bedford, Massachusetts, United States of America
Peter D. Reaven
Roles Writing – review & editing
Affiliation: Phoenix VA Health Care System, Phoenix, Arizona, United States of America
Lawrence S. Phillips
Roles Writing – review & editing
Affiliations Department of Veterans Affairs, Atlanta Health Care System, Decatur, Georgia, United States of America, Division of Endocrinology and Metabolism, Department of Medicine, Emory University School of Medicine, Atlanta, Georgia, United States of America
Christopher J. O’Donnell
Roles Resources, Writing – review & editing
Affiliations Massachusetts Veterans Epidemiology Research and Information Center, VA Boston Healthcare System, Boston, Massachusetts, United States of America, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
James B. Meigs
Roles Writing – review & editing
Affiliation: Massachusetts General Hospital, Harvard Medical School and the Broad Institute, Boston, Massachusetts, United States of America
ORCID: http://orcid.org/0000-0002-2439-2657
Peter W. F. Wilson
Roles Writing – review & editing
Affiliations Department of Veterans Affairs, Atlanta Health Care System, Decatur, Georgia, United States of America, Division of Endocrinology and Metabolism, Department of Medicine, Emory University School of Medicine, Atlanta, Georgia, United States of America
Rachel Vickers-Smith
Roles Methodology, Writing – review & editing
Affiliation: University of Louisville, Louisville, Kentucky, United States of America
ORCID: http://orcid.org/0000-0002-7224-8916
Henry R. Kranzler
Roles Methodology, Writing – review & editing
Affiliations University of Louisville, Louisville, Kentucky, United States of America, Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
Amy C. Justice
Roles Writing – review & editing
Affiliations Yale School of Medicine, New Haven, Connecticut, United States of America, Veterans Affairs Connecticut Healthcare System, West Haven, Connecticut, United States of America, Yale School of Public Health, New Haven, Connecticut, United States of America
John M. Gaziano
Roles Resources, Writing – review & editing
Affiliations Massachusetts Veterans Epidemiology Research and Information Center, VA Boston Healthcare System, Boston, Massachusetts, United States of America, Boston University School of Public Health, Boston, Massachusetts, United States of America
Sumitra Muralidhar
Roles Resources, Writing – review & editing
Affiliation: Office of Research and Development, Veterans Health Administration, Washington, DC, United States of America
Saiju Pyarajan
Roles Data curation, Writing – review & editing
Affiliations Massachusetts Veterans Epidemiology Research and Information Center, VA Boston Healthcare System, Boston, Massachusetts, United States of America, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
Scott L. DuVall
Roles Data curation, Writing – review & editing
Affiliations VA Informatics and Computing Infrastructure, VA Salt Lake City Health Care System, Salt Lake City, Utah, United States of America, Department of Internal Medicine Division of Epidemiology, University of Utah School of Medicine, Salt Lake City, Utah, United States of America
Themistocles L. Assimes
Roles Writing – review & editing
Affiliations Department of Medicine, Stanford University School of Medicine, Stanford, California, United States of America, VA Palo Alto Health Care System, Palo Alto, California, United States of America
ORCID: http://orcid.org/0000-0003-2349-0009
Jennifer S. Lee
Roles Writing – review & editing
Affiliations Department of Medicine, Stanford University School of Medicine, Stanford, California, United States of America, VA Palo Alto Health Care System, Palo Alto, California, United States of America
Philip S. Tsao
Roles Funding acquisition, Writing – review & editing
Affiliations Department of Medicine, Stanford University School of Medicine, Stanford, California, United States of America, VA Palo Alto Health Care System, Palo Alto, California, United States of America
Daniel J. Rader
Roles Conceptualization, Writing – review & editing
Affiliations Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America, Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America, Department of Pediatrics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America, Cardiovascular Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
Scott M. Damrauer
Roles Conceptualization, Methodology, Writing – review & editing
Affiliations Corporal Michael J. Crescenz VA Medical Center, Philadelphia, Pennsylvania, United States of America, Department of Surgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
Julie A. Lynch
Roles Conceptualization, Data curation, Writing – review & editing
Affiliations VA Informatics and Computing Infrastructure, VA Salt Lake City Health Care System, Salt Lake City, Utah, United States of America, College of Nursing and Health Sciences, University of Massachusetts, Boston, Massachusetts, United States of America
Danish Saleheen
Roles Conceptualization, Methodology, Resources, Writing – review & editing
Affiliations Corporal Michael J. Crescenz VA Medical Center, Philadelphia, Pennsylvania, United States of America, Department of Biostatistics and Epidemiology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
Benjamin F. Voight
Roles Conceptualization, Data curation, Formal analysis, Methodology, Writing – original draft, Writing – review & editing
Affiliations Corporal Michael J. Crescenz VA Medical Center, Philadelphia, Pennsylvania, United States of America, Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America, Department of Systems Pharmacology and Translational Therapeutics and Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
ORCID: http://orcid.org/0000-0002-6205-9994
Kyong-Mi Chang
Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Methodology, Writing – original draft, Writing – review & editing
* E-mail: [email protected], [email protected]
Affiliations Corporal Michael J. Crescenz VA Medical Center, Philadelphia, Pennsylvania, United States of America, Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
on behalf of the VA Million Veteran Program
¶Membership of The VA Million Veteran Program is provided in the Acknowledgments.
Introduction
Non-alcoholic fatty liver disease (NAFLD) is a heritable, clinically heterogeneous disorder encompassing simple steatosis and non-alcoholic steatohepatitis (NASH) with concomitant cardio-metabolic risk factors [1, 2]. To date, genome-wide association studies (GWAS) for NAFLD and related traits such as serum alanine aminotransferase (ALT) concentration have identified 8 independent genetic loci derived primarily from hepatic lipid and glucose homeostatic genes (LYPLAL1, GCKR, HSD17B13, TRIB1, PPP1R3B, CPN1-ERLIN1-CHUK, TM6SF2, PNPLA3) (S1 Table in S1 File) [3–18]. In particular, the I148M variant of the patatin-like phospholipase domain-containing protein-3 (PNPLA3) gene has been strongly associated with NAFLD, ALT concentration, and alcoholic liver disease. PNPLA3 encodes the calcium-independent phospholipase A2 epsilon (also called adiponutrin) which is enriched in hepatocytes and hepatic stellate cells and has a role in lipid droplet regulation [19]. Additionally, polymorphisms in MBOAT7 and IFNL3/4 have been shown to be associated with hepatic steatosis and necroinflammation [20–22].
Despite our advanced understanding of NAFLD pathogenesis, population-based identification of NAFLD remains a challenge in clinical practice and research [23, 24]. Although liver biopsy is generally considered the gold standard in NAFLD diagnosis [25], it is infrequently performed in routine clinical care due to its invasive nature with poor patient acceptance and sample variability [26]. Conventional ultrasound, though frequently used, has limited sensitivity and specificity, whereas the role of transient elastography continues to emerge [1]. While magnetic resonance imaging (MRI) modalities such as MRI proton-density fat fraction (MRI-PDFF) or magnetic resonance spectroscopy (MRS) can accurately diagnose hepatic steatosis, these technologies are not widely available in routine clinical practice [26, 27]. Current electronic health record (EHR)-based algorithms using diagnosis codes, clinical encounters, and laboratory values have limited sensitivity, underestimate population prevalence, and still require clinician adjudication and labor-intensive medical record review [24, 28, 29]. Additional approaches to NAFLD phenotyping such as natural language processing and machine learning remain areas of active investigation [30].
Given these challenges in NAFLD diagnosis, we sought to validate a phenotype of NAFLD using measures that can be readily applied in clinical practice and in population-based investigations. To this end, we leveraged robust clinical and genomic data from the Million Veteran Program (MVP), a multi-ethnic cohort with over 300,000 genotyped Veterans enrolled at 63 Veterans Affairs (VA) medical centers across the United States (US) [31]. Specifically, we used 16 genetic variants from 8 previously reported independent loci associated with NAFLD risk (diagnosed using imaging, liver biopsy and related traits) and EHR review to validate a clinical NAFLD phenotype. The replication of known genetic variant associations was performed in MVP to increase confidence in the non-invasive ALT-based NAFLD phenotype and to facilitate future genetic association studies.
Materials and methods
MVP cohort description
This was a cross-sectional analysis at the time of MVP enrollment using previously collected EHR data. We performed replication analyses using DNA samples and clinical data from the MVP cohort, which has been described previously in detail [31, 32]. All participants provided written informed consent to participate in the study. Consented participants provided a blood sample, answered self-reported baseline and lifestyle questionnaires, and were consented for future contact. Recruitment is ongoing at 63 VA Medical Centers across the US. The cohort is predominantly male and enriched with Veterans of African (AA) and Hispanic/Latino (LA) ancestry as compared to the US population [31]. Prospectively collected questionnaire data were linked with clinical information from the VA EHR via the VA’s central database, the Corporate Data Warehouse (CDW). The MVP core study protocol was approved by the VA Central Institutional Review Board (CIRB) and the Research and Development (R&D) Committees at all 63 participating VA medical centers. Further approval for this specific analysis was obtained from the VA CIRB and from the R&D committees at Bedford, Philadelphia, Palo Alto, Salt Lake City, and Phoenix VA medical centers.
For the current analysis, clinical and genetic data were available from 234,683 European (EU), 64,961 AA, and 22,615 LA participants (S2 Table in S1 File) enrolled in MVP from 2011 until 2016 and categorized into mutually exclusive ancestral groups based on CDW data, self-identified race/ethnicity, and genetically inferred ancestry [33]. Asian American participants were excluded due to small sample size. As shown in S2 Table in S1 File, we further excluded 71,012 participants with International Classification of Diseases, Clinical Modification (ICD-9-CM/10-CM) codes for alcoholic liver disease and/or alcohol use disorder (n = 51,549), chronic viral liver disease (n = 7,995), or other metabolic or cholestatic liver diseases and liver metastases (n = 11,468). For the main analyses, we further excluded 58,631 participants with intermediate ALT values (30–40 U/L for men and 20–30 U/L for women) that did not meet threshold ALT cutoffs for the NAFLD case or control phenotype, resulting in a final analytic cohort of 192,616 (S2 Table in S1 File, Row C).
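The cohort flow described above can be checked with simple arithmetic. The following is a minimal sketch using only the counts quoted in the text; the variable and function names are illustrative and are not taken from the MVP analysis code.

```python
# Counts copied from the text; all names here are illustrative only.
genotyped = {"EU": 234_683, "AA": 64_961, "LA": 22_615}
excluded_by_diagnosis = {
    "alcoholic liver disease / alcohol use disorder": 51_549,
    "chronic viral": 7_995,
    "metabolic, cholestatic, liver metastases": 11_468,
}
excluded_intermediate_alt = 58_631


def analytic_cohort_size(genotyped, by_diagnosis, intermediate_alt):
    """Return the number of participants left after both exclusion steps."""
    return sum(genotyped.values()) - sum(by_diagnosis.values()) - intermediate_alt


print(sum(excluded_by_diagnosis.values()))
print(analytic_cohort_size(genotyped, excluded_by_diagnosis, excluded_intermediate_alt))
```

The two printed values correspond to the 71,012 diagnosis-based exclusions and the final analytic cohort of 192,616 reported above.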
NAFLD phenotype definitions
MVP NAFLD phenotype definitions were developed by combining a previously published VA CDW ALT-based approach [24] with non-invasive clinical parameters available to practicing clinicians at the point of care. The primary NAFLD phenotype (“ALT-threshold”) was defined by: (i) elevated ALT >40 U/L for men and >30 U/L for women during at least two time points at least 6 months apart within a two-year window period at any point prior to enrollment and (ii) exclusion of other causes of liver disease (e.g. viral, cholestatic, and hereditary in addition to alcohol-related hepatitis and cirrhosis) and/or alcohol use disorder by ICD-9-CM/10-CM. Another ALT-based phenotype, ABALT, defined as ALT >30 U/L for men, >20 U/L for women was evaluated using EHR validation (EHR validation section).
A secondary NAFLD phenotype (“ALT-metabolic”) combined “ALT-threshold” criteria and at least one metabolic risk factor including obesity with body mass index (BMI) ≥ 30 kg/m2, dyslipidemia (DL), type 2 diabetes mellitus (T2D) or pre-diabetes as defined in the Metabolic Risk Factor section below. The control group was defined by: normal ALT (≤30 U/L for men, ≤20 U/L for women) and no apparent causes of liver disease. There was a 97% overlap between NAFLD cohorts defined by ALT-threshold and ALT-metabolic phenotypes. Given this high overlap, we chose “ALT-threshold” as the main NAFLD phenotype for our analyses given its simplicity and applicability in diverse study settings where clinical data may not be as detailed as in the VA CDW.
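As an illustration of the ALT-threshold case and control rules, the sketch below classifies one participant from a list of dated ALT values. It is not the authors' implementation. It assumes that "6 months" and "two years" can be approximated as 183 and 730 days, that a control requires every recorded ALT value to be at or below the control cutoff, and that exclusions for other liver disease have already been applied upstream. The function name and labels are hypothetical.

```python
from datetime import date
from itertools import combinations

# Sex-specific ALT cutoffs in U/L, taken from the text.
CASE_ALT = {"M": 40, "F": 30}      # case: ALT strictly above this value
CONTROL_ALT = {"M": 30, "F": 20}   # control: ALT at or below this value
MIN_GAP_DAYS = 183                 # assumption: "at least 6 months apart"
MAX_GAP_DAYS = 730                 # assumption: "within a two-year window"


def classify_alt_threshold(sex, measurements):
    """Return 'case', 'control', 'intermediate', or 'unclassified'.

    measurements is a list of (date, alt) tuples recorded before enrollment.
    'intermediate' here means anything that is neither case nor control.
    """
    if not measurements:
        return "unclassified"
    elevated = sorted(d for d, alt in measurements if alt > CASE_ALT[sex])
    # A case needs two elevated values spaced 6 to 24 months apart.
    for first, second in combinations(elevated, 2):
        if MIN_GAP_DAYS <= (second - first).days <= MAX_GAP_DAYS:
            return "case"
    if all(alt <= CONTROL_ALT[sex] for _, alt in measurements):
        return "control"
    return "intermediate"


print(classify_alt_threshold("M", [(date(2012, 1, 10), 55), (date(2012, 9, 15), 60)]))
print(classify_alt_threshold("F", [(date(2013, 3, 1), 18), (date(2014, 3, 1), 20)]))
```

Under these assumptions, a man with two elevated values only two months apart would fall into neither the case nor the control group.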
We examined the associations between known ALT-associated variants and maximum ALT within 2 years prior to enrollment as a continuous variable (labeled “ALT-max”). Sensitivity analyses were conducted with six additional NAFLD phenotypes as defined in S4 Table in S1 File in which we altered ALT thresholds, individual metabolic risk factors, and inclusion of intermediate ALT values in the control group (S5, S6 Tables in S1 File).
Metabolic risk factor definitions
All baseline variables were created using the most recent observation prior to MVP enrollment. BMI was obtained from vital signs taken during clinical appointments. DL was defined as any of the following: (i) triglyceride (TG) ≥ 150 mg/dL taken before 9 AM, (ii) high density lipoprotein (HDL) cholesterol < 40 mg/dL for men and < 50 mg/dL for women with at least 2 ICD-9-CM/10-CM codes (272.x/E78.0-E78.5), or (iii) at least one prescription for fenofibrate or gemfibrozil. The DL definition was based on the criteria established by Third Adult Treatment Panel (NCEP ATP III) for diagnosis of metabolic syndrome (MetS) [34]. Patients prescribed HMG-CoA reductase inhibitors who did not meet any other criteria were not classified as having DL as they could have been prescribed statins for primary coronary artery disease prevention unrelated to dyslipidemia [35]. Hypertension (HTN) was defined by ICD-9-CM/10-CM codes (401.x-405.x/I10-I16).
T2D was based on any of the following criteria: (i) ICD-9/ICD-10 codes shown in S3 Table in S1 File, excluding codes for type 1 diabetes mellitus (T1D), other diabetes, medical conditions that may cause diabetes, or a diabetes pattern consistent with T1D (which included insulin in the absence of oral agents, age of onset <40 years, BMI <25, or history of diabetic ketoacidosis), (ii) hemoglobin A1c (HbA1c) ≥6.5% or outpatient blood glucose of ≥200 mg/dL, or (iii) at least two prescriptions for diabetic medications. Pre-diabetes was defined with ICD-9/ICD-10-CM codes (790.2, 790.2x except 790.29, R73, R73.xx except R73.03) or HbA1c between 5.7% and 6.49% at any time before the enrollment date in the absence of diabetes.
Assessment of alcohol use
Alcohol consumption was assessed with the mean age-adjusted scores from the Alcohol Use Disorders Identification Test-Consumption (AUDIT-C), a validated 3-item questionnaire administered annually by VA primary care practitioners and used previously in MVP [36–38]. The rationale for including and adjusting for AUDIT-C was: i) diagnostic codes used to exclude patients for alcohol-use disorder may be insensitive for mild to moderate alcohol consumption, ii) one third of the sample met criteria for possible alcohol misuse by AUDIT-C resulting in loss of power if applying AUDIT-C as an exclusion criterion.
Genetic data
DNA extracted from whole blood was genotyped in MVP using a customized Affymetrix Axiom biobank array, the MVP 1.0 Genotyping Array, as previously described [31, 32]. Quality control procedures included the following, as previously reported: 1) ancestry classification using a composite of self-reported race/ethnicity followed by ADMIXTURE v1.3 analyses; 2) exclusion of low-quality samples (individual missingness >2.5%); 3) exclusion of related samples (using KING software); and 4) exclusion of low-quality variants (<95% call rate) [32]. Subsequently, genome-wide genotype pre-phasing (EAGLE v2) and imputation (Minimac3) were performed using the 1000 Genomes phase 3, version 5 reference population; variants with posterior call probability <0.9, imputation quality score <0.3, call rate <97.5%, and/or ancestry-specific Hardy-Weinberg equilibrium P < 1 × 10^-20 were excluded. Variants were also excluded if they deviated >10% from their expected allele frequency from the 1000 Genomes Project. Ethnicity-specific principal component analysis was performed using EIGENSOFT software.
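The post-imputation variant filters listed above can be expressed as a single predicate. This is a sketch only: the dictionary keys are hypothetical field names, and the ">10%" allele frequency deviation is interpreted here as an absolute difference in frequency, which the text does not specify.

```python
# Thresholds copied from the text; the field names are hypothetical.
MIN_CALL_PROB = 0.9
MIN_IMPUTATION_QUALITY = 0.3
MIN_CALL_RATE = 0.975
MIN_HWE_P = 1e-20
MAX_AF_DEVIATION = 0.10  # assumption: absolute difference from the reference frequency


def passes_variant_qc(variant):
    """Return True if an imputed variant survives all exclusion filters."""
    if variant["call_prob"] < MIN_CALL_PROB:
        return False
    if variant["info"] < MIN_IMPUTATION_QUALITY:
        return False
    if variant["call_rate"] < MIN_CALL_RATE:
        return False
    if variant["hwe_p"] < MIN_HWE_P:
        return False
    if abs(variant["af"] - variant["ref_af"]) > MAX_AF_DEVIATION:
        return False
    return True


# Made-up example variant, not real data.
example = {"call_prob": 0.99, "info": 0.8, "call_rate": 0.99,
           "hwe_p": 0.3, "af": 0.25, "ref_af": 0.27}
print(passes_variant_qc(example))
```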
Genetic variants selected for analyses
As shown in S1 Table in S1 File, we initially tested 15 genetic variants representing 8 independent genomic regions from the imputed genetic dataset that were previously identified in genome-wide association studies [3–9], including those associated with ALT concentration [3, 7, 8] and/or NAFLD diagnosed by MR spectroscopy [9, 39], computed tomography (CT) [6], and histology [3, 40]. After this initial analysis (and lack of association at LYPLAL1), regional association plots were generated for all 8 previously reported NAFLD-associated loci using LocusZoom software [41]; these are shown in S1 Fig.
Electronic health record review
A medical record review in the VA EHR was independently performed by two hepatologists on a sample of national data of 457 MVP enrollees, that included 241 with liver biopsies and 216 that had at least one abdominal ultrasound, CT scan, or MRI to assess the diagnostic performance of the two ALT-based NAFLD phenotype definitions against biopsy-proven and/or radiologically confirmed NAFLD: (i) ABALT and (ii) ALT-threshold both defined above (NAFLD Phenotype Definitions). In addition to liver biopsy and imaging data, the adjudicators reviewed laboratory parameters, diagnoses, medication lists and inpatient and outpatient clinical notes to rule in or out NAFLD; the algorithm followed a previously published schema in the Veteran population [24]. The inter-rater reliability was measured by Cohen’s kappa (κ) statistic. Performance characteristics of two NAFLD phenotypes, ALT-threshold and ABALT, against EHR-adjudicated NAFLD as the gold standard were assessed by calculating positive predictive values (PPVs) using Stata 15 (StataCorp LP, College Station, TX).
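For readers unfamiliar with the two validation statistics, the sketch below computes Cohen's kappa for two raters and the positive predictive value of a phenotype against an adjudicated reference. The authors report using Stata 15; this plain-Python version is only an illustration, and the input lists are made-up toy data, not study results.

```python
def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    labels = set(rater_a) | set(rater_b)
    # Agreement expected by chance from each rater's marginal label frequencies.
    expected = sum((rater_a.count(lab) / n) * (rater_b.count(lab) / n) for lab in labels)
    if expected == 1:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


def positive_predictive_value(predicted, truth):
    """Share of predicted positives that are true positives."""
    true_pos = sum(1 for p, t in zip(predicted, truth) if p and t)
    false_pos = sum(1 for p, t in zip(predicted, truth) if p and not t)
    return true_pos / (true_pos + false_pos)


# Toy inputs for illustration only (1 = NAFLD, 0 = not NAFLD).
hepatologist_1 = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
hepatologist_2 = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
print(round(cohen_kappa(hepatologist_1, hepatologist_2), 3))

phenotype_positive = [True, True, True, False, True]
adjudicated_nafld = [True, False, True, True, True]
print(positive_predictive_value(phenotype_positive, adjudicated_nafld))
```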
Assessment of advanced liver disease
We investigated relationships between previously established NAFLD variants and advanced liver disease using two established clinically defined scores: FIB4 = (age [years] × AST [U/L]) / (platelets [10^9/L] × sqrt(ALT [U/L])), and NAFLD fibrosis score = −1.675 + (0.037 × age) + (0.094 × BMI) + (1.13 × (diabetes or prediabetes as defined above)) + (0.99 × (AST/ALT)) − (0.013 × platelets) − (0.66 × albumin) [26, 42–45]. We defined advanced liver disease phenotypes at enrollment by: (i) FIB4 score >2.670 [44] and (ii) NAFLD fibrosis score >0.676, with cutoffs based on their optimal performance characteristics in previous NAFLD studies [43]. Average platelet count at enrollment was investigated as a continuous measure and as a surrogate for portal hypertension. We also analyzed FIB4 and NAFLD fibrosis scores as continuous measures (S7 Table in S1 File).
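The two scores above translate directly into code. The following sketch assumes platelets in 10^9/L, AST and ALT in U/L, and albumin in g/dL; the albumin unit is not stated in the text and is an assumption here. Function names are illustrative.

```python
import math

FIB4_CUTOFF = 2.670
NFS_CUTOFF = 0.676


def fib4(age_years, ast, alt, platelets):
    """FIB4 = (age x AST) / (platelets x sqrt(ALT))."""
    return (age_years * ast) / (platelets * math.sqrt(alt))


def nafld_fibrosis_score(age_years, bmi, diabetes_or_prediabetes,
                         ast, alt, platelets, albumin):
    """NAFLD fibrosis score, using the coefficients given in the text."""
    return (-1.675
            + 0.037 * age_years
            + 0.094 * bmi
            + 1.13 * (1 if diabetes_or_prediabetes else 0)
            + 0.99 * (ast / alt)
            - 0.013 * platelets
            - 0.66 * albumin)  # albumin assumed to be in g/dL


# Hypothetical participant, for illustration only.
score_fib4 = fib4(age_years=65, ast=80, alt=64, platelets=130)
score_nfs = nafld_fibrosis_score(age_years=60, bmi=30, diabetes_or_prediabetes=True,
                                 ast=40, alt=40, platelets=200, albumin=4.0)
print(score_fib4, score_fib4 > FIB4_CUTOFF)
print(round(score_nfs, 3), score_nfs > NFS_CUTOFF)
```

In this made-up example the participant would be flagged by the FIB4 cutoff but not by the NAFLD fibrosis score cutoff, which shows why the two definitions were examined separately.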
Statistical analyses
Regression models were used to delineate the presence and strength of the relationship between 8 established genetic loci and various definitions of the NAFLD phenotype (i.e. ABALT, ABALT2, ALT2DL, AL2DM, ALT2HTN, ALT2OBESE, FIB4score, and NAFLD fibrosis score, whose definitions are given in S4 Table in S1 File). A total of 16 genetic variants were chosen to represent 8 independent genetic regions. In particular, the 15 previously reported variants (described in S1 Table in S1 File) were chosen together with an additional variant in the LYPLAL1 locus (rs3001032, chr1:219727779) that captured the lead association with NAFLD in the Million Veteran Program dataset upon investigating the regional association plot (S1 Fig). Linear regression was used for continuous outcomes, such as FIB4 score and NAFLD fibrosis score, whereas logistic regression was performed for dichotomous outcomes, e.g. ABALT, ABALT2, ALT2DL, AL2DM, ALT2HTN, ALT2OBESE. The primary analysis for the three above phenotypes was a trans-ethnic meta-analysis combining participants of EU, AA, and LA ancestry; this was also conducted separately for each ancestry (S5A–S5C Table in S1 File). The meta-analyses were performed using a fixed-effects model in METAL with inverse-variance weighting of log odds ratios [46]. Between-study allelic effect size heterogeneity was assessed with Cochran’s Q statistic as implemented in METAL. Variants were considered genome-wide significant if they surpassed the standard threshold (P = 5 × 10^-8). Additional replication-level significance (P = 0.00625, representing Bonferroni correction for 8 independent loci) and experiment-wide significance (P = 1 × 10^-5, correcting for ~5,000 independent tests regionally across the 8 loci) were also considered.
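The arithmetic behind the meta-analysis and the significance thresholds can be sketched as follows. This is not METAL and makes no claim about its internals; it is a plain implementation of inverse-variance fixed-effects pooling and Cochran's Q, assuming each ancestry contributes a log odds ratio and its standard error. The input values are invented for illustration.

```python
import math


def fixed_effects_meta(betas, standard_errors):
    """Inverse-variance weighted pooled effect, its standard error, and Cochran's Q."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    # Q sums weighted squared deviations of each study from the pooled effect.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    return pooled, pooled_se, q


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


# Invented log odds ratios and standard errors for three ancestry groups.
pooled, pooled_se, q = fixed_effects_meta([0.10, 0.10, 0.10], [0.02, 0.05, 0.08])
print(round(pooled, 4), round(pooled_se, 4), round(q, 4))
print(round(math.exp(pooled), 4))  # pooled odds ratio

print(bonferroni_threshold(0.05, 8))     # replication-level threshold
print(bonferroni_threshold(0.05, 5000))  # experiment-wide threshold
```

With 8 loci and roughly 5,000 regional tests, the Bonferroni thresholds reproduce the 0.00625 and 1 × 10^-5 values quoted above.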
Three multivariable models were generated for each outcome: (i) Model 1: NAFLD phenotype modeled as a function of SNP, age, gender, and the first 10 genetic principal components (PCs) of genetic ancestry, (ii) Model 2: NAFLD phenotype modeled as a function of SNP, age, gender, the first 10 genetic principal components, and alcohol consumption at enrollment, and (iii) Model 3: NAFLD phenotype modeled as a function of SNP, age, gender, the first 10 genetic principal components, alcohol consumption, T2D, hypertension, dyslipidemia and obesity. Covariates included age, gender, AUDIT-C score, and 10 PCs for genetic similarity. Analysis was performed using R version 3.2.5.
Results and discussion
Characteristics of NAFLD analytic cohort across diverse ancestries
As shown in Table 1, the 192,616 participants in the final NAFLD analytic cohort included 148,354 (77%) Europeans (EU), 31,878 (17%) African-Americans (AA), and 12,384 (6.4%) Hispanic/Latinos (LA), with a mean age of 64.5 years (SD 13.1); 8.4% were female (similar to the proportion of females in the entire VA population). The proportion of females was higher among NAFLD cases across all ancestries.
[Figure omitted. See PDF.]
Table 1. Baseline characteristics of the MVP NAFLD analytic cohort defined by the ALT-threshold definition.
https://doi.org/10.1371/journal.pone.0237430.t001
The NAFLD analytic cohort had a substantial burden of cardiometabolic risk factors: 93% of participants had at least 1 metabolic risk factor, 50% had BMI ≥ 30 kg/m2, 71% had HTN, 26% had T2D, and 51% had DL. Approximately one third of the cohort showed evidence of alcohol misuse based on the AUDIT-C score [36] despite the exclusion of participants with alcohol use disorder diagnoses based on ICD-9-CM/10-CM. Laboratory measures consistent with advanced fibrosis were detected in 10.2% based on NAFLD fibrosis score (>0.676), 3.8% by FIB4 score (>2.670) and 9.5% based on platelet count (<150,000/μl), although fewer than 1% had diagnostic codes for cirrhosis or related complications (S2 Table in S1 File). As expected, participants with our primary NAFLD phenotype based on ALT-threshold were more likely than controls to have concomitant metabolic risk factors, with more obesity (57.3% vs 40.8%), HTN (81.6% vs 66.2%), T2D (35% vs 21.8%) and DL (67.7% vs 43%), but not more alcohol misuse (30.7% vs 30.8%).
Similar patterns persisted across EU, AA and LA ancestries. However, alcohol misuse was more frequent among NAFLD compared to control participants with AA (26.7% vs 18.5%) and LA (30.2% vs 28.5%) but not EU (31.4% vs 32.4%) ancestries. These findings provide demographic and clinical characteristics of the NAFLD cohort in our analyses.
Replication of published NAFLD-associated loci in MVP NAFLD analytic cohort
We next sought to replicate the NAFLD risk associations previously reported for 7 SNPs in 6 distinct genetic loci including LYPLAL1, GCKR, HSD17B13, PPP1R3B, TM6SF2 and PNPLA3, using our primary and secondary NAFLD phenotype definitions (ALT-threshold and ALT-metabolic) with and without further adjustment for alcohol use and/or metabolic risk factors [3, 7, 8]. As shown in Table 2, four of the six NAFLD loci (5 of the seven tagging SNPs) were robustly associated in the trans-ethnic meta-analysis of the MVP cohort across all phenotype definitions and models (all P < 1 × 10^-6, S1 Fig). We observed negligible differences in effect estimates between the two NAFLD case definitions at these loci (Table 2), consistent with the high overlap (97%) between ALT-threshold and ALT-metabolic described in Methods. Additional adjustment for alcohol use based on AUDIT-C in Model 2 did not affect the estimated odds ratios.
[Figure omitted. See PDF.]
Table 2. Previously published NAFLD risk variants with genome-wide significant association with clinical NAFLD phenotypes across all ancestries in the Million Veteran Program NAFLD analytic cohort.
https://doi.org/10.1371/journal.pone.0237430.t002
We further investigated in more detail the two regions with little to no statistical association in our cohort. First, while the previously reported lead variant near the LYPLAL1 gene (rs12137855, chr1:219,448,378) was not associated in our cohort (all P > 0.80, Table 2), a regional association plot (S1 Fig) indicated a robust association with a nearby SNP (rs3001032, chr1:219,727,779) (all OR = 1.04, all P < 1 × 10^-5). Previous studies have shown modest associations of rs3001032 with insulin resistance (HOMA-IR, P = 1.1 × 10^-4), beta-cell function (HOMA-B, P = 6.6 × 10^-4), BMI (P = 1.4 × 10^-5), T2D (P = 3.8 × 10^-14), HDL cholesterol (P = 8.1 × 10^-3), and TG (P = 0.02), in contrast to rs12137855, which was not associated with these traits (P > 0.05 for all) [47–50]. Given the burden of metabolic associations, these data suggest that rs3001032 is likely to tag a true NAFLD association in this region. Second, the previously associated variant at GCKR (rs780094) was only nominally associated with our NAFLD phenotypes (P < 0.05 in the base model), and the association weakened further after metabolic risk factor adjustments (Table 2). We investigated whether the association between GCKR (rs780094) and secondary NAFLD phenotypes was sensitive to the NAFLD subtype definition, depending on the respective metabolic risk factor that served as an inclusion criterion. When the NAFLD phenotype was defined by ALT-threshold + dyslipidemia (ALT2DL, S6c Table in S1 File), the association was highly significant in participants of EU ancestry (OR 1.05, P = 6.5 × 10^-8) as well as in the trans-ethnic meta-analysis (OR 1.05, P = 7 x 6.5x10-9). These associations persisted when the models accounted for alcohol consumption (Model 2), but were markedly attenuated and no longer significant when the NAFLD phenotype specifically excluded dyslipidemia and only included T2D (S6d Table in S1 File), HTN (S6e Table in S1 File), or obesity (S6f Table in S1 File) in its definition.
Comparison of established NAFLD loci across EU, AA and LA cohorts
We further explored the associations of the foregoing NAFLD risk variants among MVP participants stratified by EU, AA and LA ancestries (S5a and S5b Table in S1 File) [5–7, 9]. Similar to the trans-ethnic meta-analyses, 6 of the 8 NAFLD risk variants (including the revised LYPLAL1 variant rs3001032) were replicated at the pre-specified threshold of significance (i.e., P<0.006) among EU participants with NAFLD defined by ALT-threshold (S5a Table in S1 File) or ALT-metabolic (S5b Table in S1 File) phenotype, but not GCKR (rs780094). Among AA participants, only the genetic variants in PPP1R3B (rs4240624) and PNPLA3 (rs738409) were replicated for both NAFLD phenotypes. Although the LA sample was relatively modest, odds ratios in that population showed 100% directional concordance with the risk alleles seen in EU participants, and the TM6SF2 (rs58542926) and PNPLA3 (rs738409) loci were significantly associated with both NAFLD phenotypes.
Replication of genetic loci associated with elevated ALT in MVP NAFLD cohort
Having replicated NAFLD risk-associated variants with ALT-based NAFLD phenotypes, we further examined 10 variants reported to be associated with ALT levels [3, 7, 8], including two associated with NAFLD (rs72613567 and rs738409), using peak ALT (ALT-max) as defined in Methods. As shown in Table 3, all 10 variants were strongly associated with peak levels of ALT in the entire cohort, with the strongest associations for PNPLA3 variants. Significant associations persisted for all variants when adjusted for alcohol use in Model 2, while additional adjustment for metabolic risk factors in Model 3 further increased both effect size and statistical significance for most variants, with the exception of the variant in TRIB1. In ancestry-stratified analyses (S5c Table in S1 File), all 10 variants were replicated among EU participants. In the AA cohort, HSD17B13 (rs72613567) was replicated in Model 3, as were one SNP in ERLIN1 (rs11597086) and two in PNPLA3 (rs2281135, rs738409). In the LA cohort, variants at each of four independent loci were also replicated.
Table 3. Previously published ALT-associated variants with genome-wide significance and association with maximal ALT at enrollment.
https://doi.org/10.1371/journal.pone.0237430.t003
Further sensitivity analyses were performed using previously published NAFLD risk and ALT-associated genetic loci with six alternative NAFLD phenotype definitions to determine whether further optimization could be achieved (S6a–S6f Table in S1 File). Altering the ALT cutoff to >30 U/L for men and >20 U/L for women, changing ALT cutoff for the control group, specifying the additional metabolic risk factor for NAFLD inclusion (e.g. T2D versus dyslipidemia, obesity, or hypertension), and altering the number of concomitant metabolic risk factors did not appreciably alter the associations, compared to NAFLD phenotype based on ALT-threshold.
Not surprisingly, the strength of associations improved for most NAFLD risk/ALT level-associated variants with higher ALT cutoffs (S6a, S6b Table in S1 File) and with further adjustment for metabolic risk factors. The stronger associations noted between established variants and higher ALT cutoffs show the enhanced specificity (reduction in false positive cases) of the ALT-threshold phenotype without a concomitant reduction in statistical power to detect associations.
Clinical NAFLD phenotype performance and direct EHR review
We next performed an EHR review to assess the performance characteristics of our clinical ALT-based NAFLD phenotype definitions. The inter-rater reliability of the initial chart review was κ = 0.98. As shown in Table 4, the ALT-threshold phenotype yielded a PPV of 0.89 and 0.84 with biopsy and imaging as gold standards, respectively.
Table 4. Electronic health record validation of NAFLD phenotype.
https://doi.org/10.1371/journal.pone.0237430.t004
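For readers less familiar with the two validation metrics reported above, the following minimal sketch shows how a positive predictive value and Cohen's kappa are computed from chart-review labels. The function names and any counts passed to them are hypothetical illustrations; the underlying chart-review counts are not given in this section.

```python
from typing import Sequence


def positive_predictive_value(true_positives: int, false_positives: int) -> float:
    """Fraction of phenotype-positive participants confirmed by the gold standard."""
    flagged = true_positives + false_positives
    if flagged == 0:
        raise ValueError("no phenotype-positive participants were reviewed")
    return true_positives / flagged


def cohen_kappa(rater_a: Sequence[int], rater_b: Sequence[int]) -> float:
    """Cohen's kappa for two raters assigning binary labels (1 = NAFLD, 0 = not)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of charts")
    n = len(rater_a)
    # Observed agreement: proportion of charts given the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal rate of positive labels.
    p_a = sum(rater_a) / n
    p_b = sum(rater_b) / n
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)
```

As an illustration only, 89 confirmed cases among 100 reviewed phenotype-positive charts would give a PPV of 0.89.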
Associations of established NAFLD risk and ALT level-associated variants with advanced fibrosis
Most NAFLD risk/ALT level-associated variants examined in our study have been associated with hepatic fibrosis progression, including GCKR (rs780094), HSD17B13 (rs72613567), TM6SF2 (rs58542926), ERLIN1 (rs11597390, rs11597086, rs11591741) and PNPLA3 (rs738409) [3–9, 51]. Therefore, we examined our NAFLD/ALT panel for associations with advanced fibrosis in our MVP cohort, using FIB4 score (>2.670), NAFLD fibrosis score (≥0.676), and platelet count at enrollment (as a surrogate measure of portal hypertension). As shown in Table 5, variants in GCKR, HSD17B13 and PNPLA3 (but not TM6SF2 and ERLIN1) were associated with advanced fibrosis in our overall MVP cohort, but with variable levels of significance depending on the fibrosis definition. For example, significant associations were replicated for the GCKR variant (rs780094), both HSD17B13 variants (rs6834314A, rs72613567T) and three PNPLA3 variants (rs738409, rs2281135, rs2143571) using platelet count as a continuous variable. However, the use of FIB4 score replicated the associations for HSD17B13 and PNPLA3 variants but not GCKR, whereas the use of NAFLD fibrosis score replicated the associations for PNPLA3 variants but not HSD17B13 or GCKR.
Table 5. Previously published ALT level/NAFLD risk-associated variants with genome-wide significance and associations with advanced fibrosis/cirrhosis and platelet count at enrollment among patients with NAFLD (n = 60,542).
https://doi.org/10.1371/journal.pone.0237430.t005
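The two fibrosis indices used above can be computed from routine laboratory values. The sketch below uses the published formulas for FIB-4 [45] and the NAFLD fibrosis score [42], with the cutoffs quoted in this section; the function and parameter names are illustrative assumptions, not the study's code.

```python
import math

FIB4_ADVANCED_CUTOFF = 2.67   # advanced fibrosis if FIB-4 > 2.67 (as in the text)
NFS_ADVANCED_CUTOFF = 0.676   # advanced fibrosis if NFS >= 0.676 (as in the text)


def fib4(age_years: float, ast_u_l: float, alt_u_l: float,
         platelets_1e9_l: float) -> float:
    """FIB-4 = (age x AST) / (platelets x sqrt(ALT))."""
    return (age_years * ast_u_l) / (platelets_1e9_l * math.sqrt(alt_u_l))


def nafld_fibrosis_score(age_years: float, bmi_kg_m2: float,
                         ifg_or_diabetes: bool, ast_u_l: float, alt_u_l: float,
                         platelets_1e9_l: float, albumin_g_dl: float) -> float:
    """NAFLD fibrosis score (Angulo et al.) from its published coefficients."""
    return (-1.675
            + 0.037 * age_years
            + 0.094 * bmi_kg_m2
            + 1.13 * (1 if ifg_or_diabetes else 0)
            + 0.99 * (ast_u_l / alt_u_l)
            - 0.013 * platelets_1e9_l
            - 0.66 * albumin_g_dl)


def has_advanced_fibrosis_by_fib4(score: float) -> bool:
    return score > FIB4_ADVANCED_CUTOFF


def has_advanced_fibrosis_by_nfs(score: float) -> bool:
    return score >= NFS_ADVANCED_CUTOFF
```

Because both scores include platelet count, and FIB-4 and the NAFLD fibrosis score both include age and aminotransferases, the three fibrosis measures are not independent, which may partly explain why they replicate overlapping but not identical sets of variants.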
Further ancestry-stratified analyses using FIB4 (S7a Table in S1 File), NAFLD fibrosis scores (S7b Table in S1 File) and baseline platelet count (S7c Table in S1 File) showed similar results for MVP participants with EU ancestry, with significant associations for GCKR, HSD17B13 and PNPLA3 variants. Despite smaller sample sizes, analyses using baseline platelet count showed significant associations among AA participants for GCKR and two PNPLA3 variants (rs738409, rs2281135) and among LA participants for HSD17B13 variant (rs6834314) and all three PNPLA3 variants. The use of NAFLD fibrosis score resulted in a significant association for the TRIB1 variant (rs2954021) among LA participants (S7b Table in S1 File), although this association did not persist when using NAFLD fibrosis or FIB4 as continuous measures (S7d and S7e Table in S1 File). Overall, results were similar for models adjusted for alcohol use and metabolic risk factors and with fibrosis scores as continuous measures (S7d and S7e Table in S1 File). Thus, these results replicated the associations between advanced hepatic fibrosis and GCKR, HSD17B13 and PNPLA3 variants in our MVP cohort. Together, these data demonstrate the utility of the ALT-threshold phenotype in phenotyping NAFLD in a large EHR database.
Discussion
In this study, we took advantage of the robust clinical EHR and genotype data from the largest diverse NAFLD case/control cohort to date to develop a non-invasive ALT-based NAFLD phenotype that may be used in future large-scale, population-based studies. Our NAFLD phenotype is based on a few key components: chronically elevated ALT; exclusion of viral, cholestatic and other hereditary liver diseases; and exclusion of persons with alcohol-related cirrhosis.
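The components listed above can be expressed as a simple classification rule. The following is a minimal sketch only: the sex-specific ALT thresholds are those given in the abstract (>40 U/L for men, >30 U/L for women), but the requirement of at least two elevated values to represent "chronically elevated", the definition of a control as a participant with no elevated value, and all names are assumptions made for illustration; the actual criteria are specified in Methods.

```python
from dataclasses import dataclass
from typing import List, Optional

# Thresholds from the abstract (units/L); cases must exceed these.
ALT_CASE_THRESHOLD = {"M": 40.0, "F": 30.0}
# Assumption: "chronically elevated" is approximated as >= 2 elevated values.
MIN_ELEVATED_MEASUREMENTS = 2


@dataclass
class Participant:
    sex: str                          # "M" or "F"
    alt_values: List[float]           # ALT measurements in units/L
    has_excluded_liver_disease: bool  # viral, cholestatic, hereditary, or
                                      # alcohol-related cirrhosis


def classify(participant: Participant) -> Optional[str]:
    """Return "case", "control", or None (excluded or intermediate)."""
    if participant.has_excluded_liver_disease or not participant.alt_values:
        return None
    threshold = ALT_CASE_THRESHOLD[participant.sex]
    n_elevated = sum(value > threshold for value in participant.alt_values)
    if n_elevated >= MIN_ELEVATED_MEASUREMENTS:
        return "case"
    if n_elevated == 0:
        return "control"  # assumption: controls never exceed the case threshold
    return None  # a single elevated value is treated here as intermediate
```

The rule deliberately uses no metabolic risk factor, which mirrors the ALT-threshold phenotype rather than the ALT-metabolic one.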
Of the 322,259 potentially eligible MVP participants with genetic and clinical data, 19% met criteria for NAFLD as defined by the ALT-threshold phenotype. After applying exclusion criteria, of the 192,616 participants in the final NAFLD analytic cohort, 31% (n = 60,542) met criteria for NAFLD using this definition. These findings are consistent with the 18–21% NAFLD prevalence reported previously among Veterans (2003–2011) and with national estimates [23, 52, 53]. As expected, NAFLD participants were more likely to have metabolic risk factors than controls. In the course of developing our phenotype, we noted a high degree of overlap between the ALT-based NAFLD phenotype (ALT-threshold) and one that required a concomitant metabolic risk factor (ALT-metabolic). The very similar associations between known NAFLD risk genetic loci and these two definitions support our use of ALT-threshold as the primary NAFLD phenotype, which we prefer for two main reasons. First, the ALT-threshold definition is more parsimonious. Second, because it does not include a metabolic risk factor, it facilitates further genetic correlation or causal inference studies (via Mendelian randomization) of the links between individual metabolic risk factors and NAFLD; conditioning the phenotype on a metabolic risk factor would make causal inference about that risk factor and NAFLD problematic, potentially inducing collider bias [54]. In addition to investigating how our NAFLD phenotype associated with previously established genetic variants, we also assessed the performance characteristics of these phenotypes among Veterans with available liver biopsy and abdominal imaging data, which yielded high positive predictive values and high inter-rater reliability. The PPV noted in our study was 89% when compared to a biopsy-proven gold standard and 84% when using imaging and clinical notes as the gold standard. These results are comparable to those of other studies using EHR- and natural language processing-based algorithms [24, 29, 55].
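The collider-bias concern raised above can be illustrated with a toy simulation. This is not the authors' analysis: it simply assumes two independent standard-normal variables, labelled here as a metabolic risk factor and a separate liver-disease liability, and shows that selecting participants on a combination of the two induces a spurious negative correlation among those selected. All names, the cutoff, and the sample size are hypothetical.

```python
import random
from statistics import mean
from typing import List, Tuple


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n: int = 20000, cutoff: float = 1.0, seed: int = 0) -> Tuple[float, float]:
    """Return (correlation in everyone, correlation among selected)."""
    rng = random.Random(seed)
    # Independent by construction: no causal link between the two variables.
    risk_factor = [rng.gauss(0, 1) for _ in range(n)]
    liability = [rng.gauss(0, 1) for _ in range(n)]
    # Selection depends on both variables (the collider).
    selected = [(r, l) for r, l in zip(risk_factor, liability) if r + l > cutoff]
    r_all = pearson(risk_factor, liability)
    r_selected = pearson([r for r, _ in selected], [l for _, l in selected])
    return r_all, r_selected
```

In this toy setting the correlation is close to zero in the full sample but clearly negative among the selected participants, which is the kind of distortion that a phenotype conditioned on a metabolic risk factor could introduce into a Mendelian randomization analysis of that risk factor.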
The strength of our ALT-based NAFLD phenotype is that it utilizes factors routinely assessed in clinical practice and performs well even among participants with moderate alcohol consumption. Clinical models for the diagnosis of NAFLD/NASH have been validated in prospective studies; however, several require measures such as waist circumference, homeostasis model assessment of insulin resistance, fasting insulin, or fasting glucose, many of which are not readily available in real-world settings [52, 53, 56].
In the course of performing genetic association studies, we made several observations regarding genetic variants in LYPLAL1 (rs12137855) and GCKR (rs780094). While the previously reported association with the LYPLAL1 variant (rs12137855) was not replicated in our cohort, a nearby variant (rs3001032) was strongly associated with our phenotype and with a plethora of metabolic risk factors, suggesting that this variant tags the regional NAFLD signal. With regard to GCKR, our sensitivity analyses showed a highly significant association between the established GCKR variant and NAFLD only when dyslipidemia was included in the NAFLD definition. GCKR was previously found to be associated with elevated ALT; however, this was in smaller, highly selected cohorts (overweight/obese Mexican adults, obese children of Asian ancestry), which differed substantially from MVP enrollees [57, 58]. The dependence on dyslipidemia was not surprising, as GCKR was previously shown to enhance hepatic glucose uptake, resulting in reduced fatty acid oxidation and increased hepatic de novo lipogenesis [59] and augmenting both the risk of NAFLD and metabolic aberrations [6]. It has also been shown by others that the GCKR variant associates with dyslipidemia, which is not the case for many other NAFLD risk-increasing genotypes such as PNPLA3 [6, 60], and that it increases the risk of NAFLD in obese individuals [58]. In sensitivity analyses, including or excluding dyslipidemia in the NAFLD case definition might have modified the proportions of individuals carrying these risk alleles, contributing to the noted differences in the reported association tests. It is also possible that the lack of apparent associations with GCKR may have been due to our highly specific, but less sensitive, NAFLD phenotype. This would need to be confirmed in future VA studies with imaging and biopsy data.
The diversity of the MVP cohort provided an opportunity to investigate NAFLD in under-represented populations. GWAS for NAFLD and ALT levels have largely focused on persons of EU ancestry, with minority populations underrepresented [4]. For example, only cohorts with EU ancestry were included in the two largest studies examining hepatic steatosis (n = 7,176) and ALT (n = 45,596), whereas other studies included up to 3,124 AA and 849 LA participants [5–7, 9, 61]. At the same time, NAFLD prevalence has been reported to be lower among AA but higher among LA than among EU participants in population-based studies [23, 52, 62]. Notably, our MVP cohort of 60,542 NAFLD cases included 8,019 participants of AA and 5,870 of LA ancestry, thereby establishing one of the largest NAFLD cohorts with multi-ethnic representation. Among AA participants in our MVP cohort, significant associations with NAFLD and/or ALT were detected for 6 variants, including PNPLA3 (rs738409) and PPP1R3B (rs4240624), which were previously reported in 3,124 AA patients examined for hepatic steatosis by CT [61]. Among LA participants, significant associations were replicated for 9 variants, including PNPLA3 (rs738409), further supporting the robustness of our NAFLD phenotype [61].
In MVP, we confirmed associations between several NAFLD risk variants and advanced fibrosis. In our main analyses, variants in PNPLA3 (rs738409, rs2281135, rs2143571) exhibited strong positive associations with advanced fibrosis and negative associations with platelet count, and HSD17B13 variants (rs6834314, rs72613567) were also associated with these measures, confirming the results of prior studies [3, 6, 39]. We did not find significant associations between advanced fibrosis and TM6SF2 [18] or two additional loci, MBOAT7 and IFNL3/4 (results not shown), previously found to associate with hepatic steatosis and necroinflammation [20–22]. This may be secondary to our small sample of patients with advanced fibrosis or to the heterogeneity of fibrosis definitions across previous studies. The GCKR variant (rs780094) had a near-significant association with advanced fibrosis when characterized by continuous FIB4 measurement. Notably, GCKR was associated with a higher platelet count. This is not surprising, as the variant in GCKR is pleiotropic and has been associated with platelet count and other human blood cell traits [63]. Interestingly, the observed prevalence of advanced fibrosis among AA participants was comparable to that among EU participants, differing from previous reports and suggesting possible under-recognition of NAFLD among AA in previous studies [61, 62] and/or an underestimation of how ethnic differences in pathogenic traits such as visceral adiposity underlie NAFLD susceptibility [64].
There are several limitations to this study. The requirement for abnormal ALT potentially excluded a large number of individuals with NAFLD/NASH, with and without cirrhosis, who did not manifest elevated liver enzymes. The primary analyses excluded those with intermediate ALT values; however, sensitivity analyses (S6 Table in S1 File) showed that genetic associations were similar when participants with intermediate values were included. Patients of Asian ancestry were not represented and women were under-represented, potentially limiting generalizability. Although fibrosis was assessed non-invasively and in several different ways, the validity of these measures will need to be determined among Veterans. The sample of Veterans with advanced fibrosis and biopsy or transient elastography data was small, limiting our ability to evaluate associations with advanced fibrosis; these will be examined in future studies. We may have been limited in our ability to capture Veterans with the most severe forms of NAFLD who did not survive to MVP enrollment, as well as Veterans with hepatic steatosis and normal ALT values. Despite these concerns, our accurate, genetically and clinically validated phenotype should be amenable to large-scale scans to identify and replicate genetic causes of NAFLD and progression to complications.
Conclusion
We leveraged the clinical and genetic data in MVP, a multi-ethnic mega-biobank, to validate a simple, non-invasive ALT-based NAFLD phenotype in a real-world, population-based, national cohort. Our phenotype may be applied to future genetic and epidemiologic studies in population-based cohorts and may aid practicing clinicians in identifying individuals at risk for NAFLD with readily available clinical data.
Supporting information
S1 File.
https://doi.org/10.1371/journal.pone.0237430.s001
(DOCX)
S1 Fig. Regional plots of 8 independent previously published NAFLD risk loci.
https://doi.org/10.1371/journal.pone.0237430.s002
(PDF)
Acknowledgments
This research is based on data from the Million Veteran Program, Office of Research and Development, Veterans Health Administration. This publication does not represent the views of the Department of Veterans Affairs or the United States Government.
Citation: Serper M, Vujkovic M, Kaplan DE, Carr RM, Lee KM, Shao Q, et al. (2020) Validating a non-invasive, ALT-based non-alcoholic fatty liver phenotype in the million veteran program. PLoS ONE 15(8): e0237430. https://doi.org/10.1371/journal.pone.0237430
1. Chalasani N, Younossi Z, Lavine JE, Charlton M, Cusi K, Rinella M, et al. The diagnosis and management of nonalcoholic fatty liver disease: Practice guidance from the American Association for the Study of Liver Diseases. Hepatology. 2018;67(1):328–57. pmid:28714183
2. Carr RM, Oranu A, Khungar V. Nonalcoholic Fatty Liver Disease: Pathophysiology and Management. Gastroenterol Clin North Am. 2016;45(4):639–52. pmid:27837778
3. Abul-Husn NS, Cheng X, Li AH, Xin Y, Schurmann C, Stevis P, et al. A Protein-Truncating HSD17B13 Variant and Protection from Chronic Liver Disease. New Engl J Med. 2018;378(12):1096–106. pmid:29562163
4. Kahali B, Halligan B, Speliotes EK. Insights from Genome-Wide Association Analyses of Nonalcoholic Fatty Liver Disease. Semin Liver Dis. 2015;35(4):375–91. pmid:26676813
5. Kozlitina J, Smagris E, Stender S, Nordestgaard BG, Zhou HH, Tybjaerg-Hansen A, et al. Exome-wide association study identifies a TM6SF2 variant that confers susceptibility to nonalcoholic fatty liver disease. Nat Genet. 2014;46(4):352–6. pmid:24531328
6. Speliotes EK, Yerges-Armstrong LM, Wu J, Hernaez R, Kim LJ, Palmer CD, et al. Genome-wide association analysis identifies variants associated with nonalcoholic fatty liver disease that have distinct effects on metabolic traits. PLoS Genet. 2011;7(3):e1001324. pmid:21423719
7. Chambers JC, Zhang W, Sehmi J, Li X, Wass MN, Van der Harst P, et al. Genome-wide association study identifies loci influencing concentrations of liver enzymes in plasma. Nat Genet. 2011;43(11):1131–8. pmid:22001757
8. Yuan X, Waterworth D, Perry JR, Lim N, Song K, Chambers JC, et al. Population-based genome-wide association studies reveal six loci influencing plasma levels of liver enzymes. Am J Hum Genet. 2008;83(4):520–8. pmid:18940312
9. Romeo S, Kozlitina J, Xing C, Pertsemlidis A, Cox D, Pennacchio LA, et al. Genetic variation in PNPLA3 confers susceptibility to nonalcoholic fatty liver disease. Nat Genet. 2008;40(12):1461–5. pmid:18820647
10. Anstee QM, Seth D, Day CP. Genetic Factors That Affect Risk of Alcoholic and Nonalcoholic Fatty Liver Disease. Gastroenterology. 2016;150(8):1728–44 e7. pmid:26873399
11. Di Costanzo A, Belardinilli F, Bailetti D, Sponziello M, D'Erasmo L, Polimeni L, et al. Evaluation of Polygenic Determinants of Non-Alcoholic Fatty Liver Disease (NAFLD) By a Candidate Genes Resequencing Strategy. Sci Rep. 2018;8(1):3702. pmid:29487372
12. Pirola CJ, Flichman D, Dopazo H, Gianotti TF, San Martino J, Rohr C, et al. A Rare Nonsense Mutation in the Glucokinase Regulator Gene Is Associated With a Rapidly Progressive Clinical Form of Nonalcoholic Steatohepatitis. Hepatology Communications. 2018;2(9):1030–6. pmid:30202818
13. Bauer RC, Sasaki M, Cohen DM, Cui J, Smith MA, Yenilmez BO, et al. Tribbles-1 regulates hepatic lipogenesis through posttranscriptional regulation of C/EBPalpha. J Clin Invest. 2015;125(10):3809–18. pmid:26348894
14. Mehta MB, Shewale SV, Sequeira RN, Millar JS, Hand NJ, Rader DJ. Hepatic protein phosphatase 1 regulatory subunit 3B (Ppp1r3b) promotes hepatic glycogen synthesis and thereby regulates fasting energy homeostasis. J Biol Chem. 2017;292(25):10444–54. pmid:28473467
15. Smagris E, Gilyard S, BasuRay S, Cohen JC, Hobbs HH. Inactivation of Tm6sf2, a Gene Defective in Fatty Liver Disease, Impairs Lipidation but Not Secretion of Very Low Density Lipoproteins. J Biol Chem. 2016;291(20):10659–76. pmid:27013658
16. Linden D, Ahnmark A, Pingitore P, Ciociola E, Ahlstedt I, Andreasson AC, et al. Pnpla3 silencing with antisense oligonucleotides ameliorates nonalcoholic steatohepatitis and fibrosis in Pnpla3 I148M knock-in mice. Mol Metab. 2019;22:49–61. pmid:30772256
17. BasuRay S, Smagris E, Cohen JC, Hobbs HH. The PNPLA3 variant associated with fatty liver disease (I148M) accumulates on lipid droplets by evading ubiquitylation. Hepatology. 2017;66(4):1111–24. pmid:28520213
18. Liu YL, Reeves HL, Burt AD, Tiniakos D, McPherson S, Leathart JB, et al. TM6SF2 rs58542926 influences hepatic fibrosis progression in patients with non-alcoholic fatty liver disease. Nat Commun. 2014;5:4309. pmid:24978903
19. Pingitore P, Romeo S. The role of PNPLA3 in health and disease. Biochim Biophys Acta Mol Cell Biol Lipids. 2019;1864(6):900–6. pmid:29935383
20. Mancina RM, Dongiovanni P, Petta S, Pingitore P, Meroni M, Rametta R, et al. The MBOAT7-TMC4 Variant rs641738 Increases Risk of Nonalcoholic Fatty Liver Disease in Individuals of European Descent. Gastroenterology. 2016;150(5):1219–30 e6. pmid:26850495
21. Petta S, Valenti L, Tuttolomondo A, Dongiovanni P, Pipitone RM, Camma C, et al. Interferon lambda 4 rs368234815 TT>deltaG variant is associated with liver damage in patients with nonalcoholic fatty liver disease. Hepatology. 2017;66(6):1885–93. pmid:28741298
22. Luukkonen PK, Zhou Y, Hyotylainen T, Leivonen M, Arola J, Orho-Melander M, et al. The MBOAT7 variant rs641738 alters hepatic phosphatidylinositols and increases severity of non-alcoholic fatty liver disease in humans. J Hepatol. 2016;65(6):1263–5. pmid:27520876
23. Kanwal F, Kramer JR, Duan Z, Yu X, White D, El-Serag HB. Trends in the Burden of Nonalcoholic Fatty Liver Disease in a United States Cohort of Veterans. Clin Gastroenterol Hepatol. 2016;14(2):301–8 e1-2. pmid:26291667
24. Husain N, Blais P, Kramer J, Kowalkowski M, Richardson P, El-Serag HB, et al. Nonalcoholic fatty liver disease (NAFLD) in the Veterans Administration population: development and validation of an algorithm for NAFLD using automated data. Aliment Pharmacol Ther. 2014;40(8):949–54. pmid:25155259
25. Siddiqui MS, Harrison SA, Abdelmalek MF, Anstee QM, Bedossa P, Castera L, et al. Case definitions for inclusion and analysis of endpoints in clinical trials for nonalcoholic steatohepatitis through the lens of regulatory science. Hepatology. 2018;67(5):2001–12. pmid:29059456
26. Castera L, Friedrich-Rust M, Loomba R. Noninvasive Assessment of Liver Disease in Patients With Nonalcoholic Fatty Liver Disease. Gastroenterology. 2019;156(5):1264–81 e4. pmid:30660725
27. Middleton MS, Heba ER, Hooker CA, Bashir MR, Fowler KJ, Sandrasegaran K, et al. Agreement Between Magnetic Resonance Imaging Proton Density Fat Fraction Measurements and Pathologist-Assigned Steatosis Grades of Liver Biopsies From Adults With Nonalcoholic Steatohepatitis. Gastroenterology. 2017;153(3):753–61. pmid:28624576
28. Blais P, Husain N, Kramer JR, Kowalkowski M, El-Serag H, Kanwal F. Nonalcoholic fatty liver disease is underrecognized in the primary care setting. Am J Gastroenterol. 2015;110(1):10–4. pmid:24890441
29. Corey KE, Kartoun U, Zheng H, Shaw SY. Development and Validation of an Algorithm to Identify Nonalcoholic Fatty Liver Disease in the Electronic Medical Record. Dig Dis Sci. 2016;61(3):913–9. pmid:26537487
30. Fialoke S, Malarstig A, Miller MR, Dumitriu A. Application of Machine Learning Methods to Predict Non-Alcoholic Steatohepatitis (NASH) in Non-Alcoholic Fatty Liver (NAFL) Patients. AMIA Annu Symp Proc. 2018;2018:430–9. pmid:30815083
31. Gaziano JM, Concato J, Brophy M, Fiore L, Pyarajan S, Breeling J, et al. Million Veteran Program: A mega-biobank to study genetic influences on health and disease. J Clin Epidemiol. 2016;70:214–23. pmid:26441289
32. Klarin D, Damrauer SM, Cho K, Sun YV, Teslovich TM, Honerlaw J, et al. Genetics of blood lipids among ~300,000 multi-ethnic participants of the Million Veteran Program. Nat Genet. 2018;50(11):1514–23. pmid:30275531
33. Fang H, Hui Q, Lynch J, Honerlaw J, Assimes T, Huang J, Vujkovic M, Damrauer S, Pyarajan S, Gaziano M, DuVall S, O’Donnell C, Cho K, Chang KM, Wilson P, Tsao P, Sun Y, Tang H. Harmonizing genetic ancestry and self-identified race/ethnicity in genome-wide association studies. (AJHG, in press).
34. Alberti KG, Zimmet P, Shaw J. Metabolic syndrome—a new world-wide definition. A Consensus Statement from the International Diabetes Federation. Diabet Med. 2006;23(5):469–80. pmid:16681555
35. Bibbins-Domingo K, Grossman DC, Curry SJ, Davidson KW, Epling JW Jr., Garcia FA, et al. Statin Use for the Primary Prevention of Cardiovascular Disease in Adults: US Preventive Services Task Force Recommendation Statement. JAMA. 2016;316(19):1997–2007. pmid:27838723
36. Bush K, Kivlahan DR, McDonell MB, Fihn SD, Bradley KA. The AUDIT alcohol consumption questions (AUDIT-C): an effective brief screening test for problem drinking. Ambulatory Care Quality Improvement Project (ACQUIP). Alcohol Use Disorders Identification Test. Arch Intern Med. 1998;158(16):1789–95. pmid:9738608
37. Justice AC, Smith RV, Tate JP, McGinnis K, Xu K, Becker WC, et al. AUDIT-C and ICD codes as phenotypes for harmful alcohol use: association with ADH1B polymorphisms in two US populations. Addiction. 2018;113(12):2214–24. pmid:29972609
38. Bradley KA, DeBenedetti AF, Volk RJ, Williams EC, Frank D, Kivlahan DR. AUDIT-C as a brief screen for alcohol misuse in primary care. Alcohol Clin Exp Res. 2007;31(7):1208–17. pmid:17451397
39. Kozlitina J, Stender S, Hobbs HH, Cohen JC. HSD17B13 and Chronic Liver Disease in Blacks and Hispanics. New Engl J Med. 2018;379(19):1876–7. pmid:30403941
40. Chalasani N, Guo X, Loomba R, Goodarzi MO, Haritunians T, Kwon S, et al. Genome-wide association study identifies variants associated with histologic features of nonalcoholic Fatty liver disease. Gastroenterology. 2010;139(5):1567–76, 76 e1-6. pmid:20708005
41. Pruim RJ, Welch RP, Sanna S, Teslovich TM, Chines PS, Gliedt TP, et al. LocusZoom: regional visualization of genome-wide association scan results. Bioinformatics (Oxford, England). 2010;26(18):2336–7.
42. Angulo P, Hui JM, Marchesini G, Bugianesi E, George J, Farrell GC, et al. The NAFLD fibrosis score: a noninvasive system that identifies liver fibrosis in patients with NAFLD. Hepatology. 2007;45(4):846–54. pmid:17393509
43. Dowman JK, Tomlinson JW, Newsome PN. Systematic review: the diagnosis and staging of non-alcoholic fatty liver disease and non-alcoholic steatohepatitis. Alimentary Pharmacology & Therapeutics. 2011;33(5):525–40.
44. Shah AG, Lydecker A, Murray K, Tetri BN, Contos MJ, Sanyal AJ. Comparison of Noninvasive Markers of Fibrosis in Patients With Nonalcoholic Fatty Liver Disease. Clinical Gastroenterology and Hepatology. 2009;7(10):1104–12. pmid:19523535
45. Sterling RK, Lissen E, Clumeck N, Sola R, Correa MC, Montaner J, et al. Development of a simple noninvasive index to predict significant fibrosis in patients with HIV/HCV coinfection. Hepatology. 2006;43(6):1317–25. pmid:16729309
46. Willer CJ, Li Y, Abecasis GR. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics (Oxford, England). 2010;26(17):2190–1.
47. Dupuis J, Langenberg C, Prokopenko I, Saxena R, Soranzo N, Jackson AU, et al. New genetic loci implicated in fasting glucose homeostasis and their impact on type 2 diabetes risk. Nat Genet. 2010;42(2):105–16. pmid:20081858
48. Yengo L, Sidorenko J, Kemper KE, Zheng Z, Wood AR, Weedon MN, et al. Meta-analysis of genome-wide association studies for height and body mass index in approximately 700000 individuals of European ancestry. Hum Mol Genet. 2018;27(20):3641–9. pmid:30124842
49. Mahajan A, Taliun D, Thurner M, Robertson NR, Torres JM, Rayner NW, et al. Fine-mapping type 2 diabetes loci to single-variant resolution using high-density imputation and islet-specific epigenome maps. Nat Genet. 2018;50(11):1505–13. pmid:30297969
50. Willer CJ, Schmidt EM, Sengupta S, Peloso GM, Gustafsson S, Kanoni S, et al. Discovery and refinement of loci associated with lipid levels. Nat Genet. 2013;45(11):1274–83. pmid:24097068
51. Rotman Y, Koh C, Zmuda JM, Kleiner DE, Liang TJ, Nash CRN. The association of genetic variability in patatin-like phospholipase domain-containing protein 3 (PNPLA3) with histological severity of nonalcoholic fatty liver disease. Hepatology. 2010;52(3):894–903. pmid:20684021
52. Ruhl CE, Everhart JE. Fatty liver indices in the multiethnic United States National Health and Nutrition Examination Survey. Aliment Pharmacol Ther. 2015;41(1):65–76. pmid:25376360
53. Bazick J, Donithan M, Neuschwander-Tetri BA, Kleiner D, Brunt EM, Wilson L, et al. Clinical Model for NASH and Advanced Fibrosis in Adult Patients With Diabetes and NAFLD: Guidelines for Referral in NAFLD. Diabetes Care. 2015;38(7):1347–55. pmid:25887357
54. Munafo MR, Tilling K, Taylor AE, Evans DM, Davey Smith G. Collider scope: when selection bias can substantially influence observed associations. Int J Epidemiol. 2018;47(1):226–35. pmid:29040562
55. Van Vleck TT, Chan L, Coca SG, Craven CK, Do R, Ellis SB, et al. Augmented intelligence with natural language processing applied to electronic health records for identifying patients with non-alcoholic fatty liver disease at risk for disease progression. International Journal of Medical Informatics. 2019;129:334–41. pmid:31445275
56. Bedogni G, Bellentani S, Miglioli L, Masutti F, Passalacqua M, Castiglione A, et al. The Fatty Liver Index: a simple and accurate predictor of hepatic steatosis in the general population. BMC Gastroenterol. 2006;6:33. pmid:17081293
57. Flores YN, Velazquez-Cruz R, Ramirez P, Banuelos M, Zhang ZF, Yee HF Jr., et al. Association between PNPLA3 (rs738409), LYPLAL1 (rs12137855), PPP1R3B (rs4240624), GCKR (rs780094), and elevated transaminase levels in overweight/obese Mexican adults. Mol Biol Rep. 2016;43(12):1359–69. pmid:27752939
58. Lin YC, Chang PF, Chang MH, Ni YH. Genetic variants in GCKR and PNPLA3 confer susceptibility to nonalcoholic fatty liver disease in obese individuals. Am J Clin Nutr. 2014;99(4):869–74. pmid:24477042
59. Beer NL, Tribble ND, McCulloch LJ, Roos C, Johnson PR, Orho-Melander M, et al. The P446L variant in GCKR associated with fasting plasma glucose and triglyceride levels exerts its effect through increased glucokinase activity in liver. Hum Mol Genet. 2009;18(21):4081–8. pmid:19643913
60. Sliz E, Sebert S, Wurtz P, Kangas AJ, Soininen P, Lehtimaki T, et al. NAFLD risk alleles in PNPLA3, TM6SF2, GCKR and LYPLAL1 show divergent metabolic effects. Hum Mol Genet. 2018;27(12):2214–23. pmid:29648650
61. Palmer ND, Musani SK, Yerges-Armstrong LM, Feitosa MF, Bielak LF, Hernaez R, et al. Characterization of European ancestry nonalcoholic fatty liver disease-associated variants in individuals of African and Hispanic descent. Hepatology. 2013;58(3):966–75. pmid:23564467
62. Saab S, Manne V, Nieto J, Schwimmer JB, Chalasani NP. Nonalcoholic Fatty Liver Disease in Latinos. Clin Gastroenterol Hepatol. 2016;14(1):5–12; quiz e9-0. pmid:25976180
63. Astle WJ, Elding H, Jiang T, Allen D, Ruklisa D, Mann AL, et al. The Allelic Landscape of Human Blood Cell Trait Variation and Links to Common Complex Disease. Cell. 2016;167(5):1415–29 e19. pmid:27863252
64. Agbim U, Carr RM, Pickett-Blakely O, Dagogo-Jack S. Ethnic Disparities in Adiposity: Focus on Non-alcoholic Fatty Liver Disease, Visceral, and Generalized Obesity. Curr Obes Rep. 2019.
This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication: https://creativecommons.org/publicdomain/zero/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
Background & aims
Given ongoing challenges in the non-invasive diagnosis of non-alcoholic fatty liver disease (NAFLD), we sought to validate an ALT-based NAFLD phenotype using measures readily available in electronic health records (EHRs) and population-based studies by leveraging the clinical and genetic data in the Million Veteran Program (MVP), a multi-ethnic mega-biobank of US Veterans.
Methods
MVP participants with alanine aminotransferase (ALT) >40 units/L for men and >30 units/L for women, and without other causes of liver disease, were compared to controls with normal ALT. Genetic variants spanning eight NAFLD risk or ALT-associated loci (LYPLAL1, GCKR, HSD17B13, TRIB1, PPP1R3B, ERLIN1, TM6SF2, PNPLA3) were tested for association with NAFLD, with sensitivity analyses adjusting for metabolic risk factors and alcohol consumption. A manual EHR review assessed the performance characteristics of the NAFLD phenotype, with imaging and biopsy data as gold standards. Genetic associations with advanced fibrosis were explored using FIB4, NAFLD Fibrosis Score, and platelet counts.
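As a minimal illustration of the case definition and one of the fibrosis indices named above, the following Python sketch applies the sex-specific ALT thresholds to a single record and computes FIB-4. The function names, argument names, and the boolean exclusion flag are hypothetical. The abstract does not specify details such as how many ALT measurements were required, so this is not the study's actual phenotype algorithm. The FIB-4 formula shown is the standard published definition (age × AST / (platelets × √ALT)), not something stated in this abstract.

```python
import math


def meets_alt_criteria(alt_u_per_l: float, sex: str, other_liver_disease: bool) -> bool:
    """Return True if one record meets the ALT-based criteria sketched here.

    Thresholds follow the Methods: ALT > 40 U/L for men, > 30 U/L for women.
    `other_liver_disease` is a hypothetical flag standing in for the
    exclusion of other causes of liver disease.
    """
    thresholds = {"M": 40.0, "F": 30.0}
    if sex not in thresholds:
        raise ValueError("sex must be 'M' or 'F' in this sketch")
    if other_liver_disease:
        return False
    return alt_u_per_l > thresholds[sex]


def fib4(age_years: float, ast_u_per_l: float, alt_u_per_l: float,
         platelets_1e9_per_l: float) -> float:
    """Standard FIB-4 index: (age * AST) / (platelets * sqrt(ALT))."""
    if alt_u_per_l <= 0 or platelets_1e9_per_l <= 0:
        raise ValueError("ALT and platelet count must be positive")
    return (age_years * ast_u_per_l) / (platelets_1e9_per_l * math.sqrt(alt_u_per_l))
```

For example, `meets_alt_criteria(35, "F", False)` is true while `meets_alt_criteria(35, "M", False)` is false, because the threshold differs by sex.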
Results
Among 322,259 MVP participants, 19% met non-invasive criteria for NAFLD. Trans-ethnic meta-analysis replicated associations with previously reported genetic variants at all loci except LYPLAL1 and GCKR (P < 6 × 10⁻³), without attenuation when adjusted for metabolic risk factors and alcohol consumption. At the previously reported LYPLAL1 locus, the established genetic variant did not appear to be associated with NAFLD; however, the regional association plot showed a significant association with NAFLD 279 kb downstream. In the EHR validation, the ALT-based NAFLD phenotype yielded positive predictive values of 0.89 and 0.84 against liver biopsy and abdominal imaging, respectively (inter-rater reliability: Cohen’s kappa = 0.98). The HSD17B13 and PNPLA3 loci were associated with advanced fibrosis.
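The two validation statistics reported above, positive predictive value and Cohen's kappa, can be computed as in the following sketch. It uses the textbook definitions (PPV = TP / (TP + FP); kappa = (p_o − p_e) / (1 − p_e)). The function names and any inputs passed to them are illustrative assumptions and do not reproduce the study's chart-review data.

```python
from collections import Counter
from typing import Hashable, Sequence


def positive_predictive_value(true_positives: int, false_positives: int) -> float:
    """Fraction of phenotype-positive records confirmed by the gold standard."""
    flagged = true_positives + false_positives
    if flagged == 0:
        raise ValueError("no phenotype-positive records to evaluate")
    return true_positives / flagged


def cohens_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Chance-corrected agreement between two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement under independence, from each rater's label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use a single label")
    return (observed - expected) / (1.0 - expected)
```

Kappa is used rather than raw percent agreement because it discounts the agreement two reviewers would reach by chance given how often each assigns each label.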
Conclusions
We validate a simple, non-invasive ALT-based NAFLD phenotype using EHR data by leveraging previously established NAFLD risk-associated genetic polymorphisms.