Introduction
With the advent of high-throughput technologies and bioinformatics, researchers have integrated metabolomics with genome-wide association studies (GWAS), resulting in the MGWAS field [1–8]. The synthesis of data from MGWAS enables a multi-layered analysis of the genotype–phenotype relationship, revealing how single nucleotide variants throughout the genome can influence metabolic traits. Understanding the complex genetic networks that govern metabolic processes is essential to interpret how genetic predispositions can lead to fluctuations in metabolite levels that may indicate health or disease states. MGWAS represents the confluence of genetics and metabolomics, offering a powerful approach to reveal the genetic factors that shape the metabolome and their broader implications for human health and disease.
MGWAS has become the standard for exploring the relationship between genetic variants and metabolite levels in biological samples, such as blood plasma. Despite their usefulness, these studies have inherent limitations [9]. The correlations they yield are mostly statistical, meaning that they come without experimental biological validation. Consequently, these associations often raise questions about causality and can include false-positive findings, in which an association appears significant by chance rather than reflecting a genuine biological relationship. Another limitation is that small sample sizes may fail to detect rare genetic variants, producing false negatives that miss true associations. Experimentally confirming the vast array of variant-metabolite combinations identified through MGWAS is daunting, which makes interpreting and validating the results a considerable challenge.
To overcome these limitations, we proposed applying metabolic pathway model simulations to the analysis of MGWAS results [10,11]. These in silico experiments offer a comprehensive way to investigate all possible variant-metabolite combinations, probing deeper into the metabolic network than is typically feasible in MGWAS. The essential advantage of this approach is its ability to distinguish true associations from false positives by testing each variant-metabolite pair with simulated perturbations. By adjusting the enzyme reaction rates within the model to reflect specific genetic variations, the simulations can predict the resulting changes in metabolite concentrations. This thorough analysis not only supports the identification of true positives but also helps confirm true negatives, that is, cases where no actual association exists between a variant and metabolite, which MGWAS may incorrectly suggest as a potential...
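The idea of perturbing an enzyme rate and reading off the metabolite response can be illustrated with a deliberately small sketch. The pathway below (a constant influx feeding metabolite A, which enzyme E1 converts to B, which enzyme E2 removes), the mass-action kinetics, the rate constants, and the 5% response threshold are all assumptions made for illustration; they are not the model or parameters used in the cited work. The function names `simulate_steady_state` and `variant_effects` are likewise hypothetical.

```python
# Toy pathway (hypothetical, not the model from the cited studies):
#   influx -> A --E1--> B --E2--> efflux
# Assumed mass-action kinetics: v1 = k1 * A, v2 = k2 * B.


def simulate_steady_state(k1, k2, influx=1.0, dt=0.01, steps=20000):
    """Integrate the toy pathway with explicit Euler steps until it settles."""
    a, b = 0.0, 0.0
    for _ in range(steps):
        v1 = k1 * a  # flux through enzyme E1
        v2 = k2 * b  # flux through enzyme E2
        a += dt * (influx - v1)
        b += dt * (v1 - v2)
    return {"A": a, "B": b}


def variant_effects(baseline, enzyme, scale, threshold=0.05):
    """Scale one rate constant to mimic a variant and compare steady states.

    Returns, per metabolite, the relative concentration change and whether
    it exceeds the (assumed) threshold for calling a simulated association.
    """
    perturbed = dict(baseline)
    perturbed[enzyme] *= scale
    ref = simulate_steady_state(**baseline)
    alt = simulate_steady_state(**perturbed)
    effects = {}
    for metabolite in ref:
        rel = (alt[metabolite] - ref[metabolite]) / ref[metabolite]
        effects[metabolite] = (rel, abs(rel) > threshold)
    return effects


# Assumed baseline rate constants; the variant halves the activity of E1.
baseline = {"k1": 1.0, "k2": 2.0}
effects = variant_effects(baseline, "k1", 0.5)
for metabolite, (rel, associated) in effects.items():
    print(f"{metabolite}: relative change {rel:+.3f}, associated={associated}")
```

In this toy case, halving the E1 rate raises the steady-state level of A while leaving B unchanged, so the simulation would support a variant-A pair and argue against a variant-B pair. A realistic pathway model with feedback, saturation, or branching would propagate perturbations differently, which is why each variant-metabolite pair has to be simulated rather than inferred from topology alone.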