Abstract

Structural variants (SVs) and short tandem repeats (STRs) comprise a broad group of diverse DNA variants which vastly differ in their sizes and distributions across the genome. Here, we identify genomic features of SV classes and STRs that are associated with gene expression and complex traits, including their locations relative to eGenes, likelihood of being associated with multiple eGenes, associated eGene types (e.g., coding, noncoding, level of evolutionary constraint), effect sizes, linkage disequilibrium with tagging single nucleotide variants used in GWAS, and likelihood of being associated with GWAS traits. We identify a set of high-impact SVs/STRs associated with the expression of three or more eGenes via chromatin loops and show that they are highly enriched for being associated with GWAS traits. Our study provides insights into the genomic properties of structural variant classes and short tandem repeats that are associated with gene expression and human traits.

Genetic variation associated with gene expression changes has mostly been studied in the context of single nucleotide variants. Here, Jakubosky et al. report eQTL analysis of structural variants and short tandem repeats and find properties, such as length of variation, that affect the association.

Details

Title
Properties of structural variants and short tandem repeats associated with gene expression and complex traits
Author
Jakubosky, David 1 ; D’Antonio, Matteo 2 ; Bonder, Marc Jan 3 ; Smail, Craig 4 ; Donovan, Margaret K R 5 ; Young Greenwald, William W 6 ; Matsui, Hiroko 2 ; Bonder, Marc J 3 ; Cai, Na 7 ; Carcamo-Orive, Ivan 8 ; Frazer, Kelly A 9 ; Knowles, Joshua W 8 ; McCarthy, Davis J 10 ; Mirauta, Bogdan A 11 ; Montgomery, Stephen B 12 ; Quertermous, Thomas 8 ; Seaton, Daniel D 11 ; Smith, Erin N 13 ; Stegle, Oliver 14 ; D’Antonio-Chronowska, Agnieszka 2 ; DeBoever, Christopher 2

1  University of California San Diego, Biomedical Sciences Graduate Program, La Jolla, USA (GRID:grid.266100.3) (ISNI:0000 0001 2107 4242); University of California San Diego, Department of Biomedical Informatics, La Jolla, USA (GRID:grid.266100.3) (ISNI:0000 0001 2107 4242) 
2  University of California San Diego, Institute of Genomic Medicine, La Jolla, USA (GRID:grid.266100.3) (ISNI:0000 0001 2107 4242) 
3  European Bioinformatics Institute, Hinxton, European Molecular Biology Laboratory, Cambridge, UK (GRID:grid.225360.0) (ISNI:0000 0000 9709 7726); European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany (GRID:grid.4709.a) (ISNI:0000 0004 0495 846X) 
4  Stanford University School of Medicine, Department of Biomedical Data Science, Stanford, USA (GRID:grid.168010.e) (ISNI:0000000419368956); Stanford University, Department of Pathology, Stanford, USA (GRID:grid.168010.e) (ISNI:0000000419368956) 
5  University of California San Diego, Department of Biomedical Informatics, La Jolla, USA (GRID:grid.266100.3) (ISNI:0000 0001 2107 4242); University of California San Diego, Bioinformatics and Systems Biology Graduate Program, La Jolla, USA (GRID:grid.266100.3) (ISNI:0000 0001 2107 4242) 
6  University of California San Diego, Bioinformatics and Systems Biology Graduate Program, La Jolla, USA (GRID:grid.266100.3) (ISNI:0000 0001 2107 4242) 
7  European Bioinformatics Institute, Hinxton, European Molecular Biology Laboratory, Cambridge, UK (GRID:grid.225360.0) (ISNI:0000 0000 9709 7726); Wellcome Sanger Institute, Cambridge, UK (GRID:grid.10306.34) (ISNI:0000 0004 0606 5382) 
8  Stanford University School of Medicine, Division of Cardiovascular Medicine and Cardiovascular Institute, Stanford, USA (GRID:grid.168010.e) (ISNI:0000000419368956) 
9  University of California San Diego, Institute of Genomic Medicine, La Jolla, USA (GRID:grid.266100.3) (ISNI:0000 0001 2107 4242); University of California San Diego, Department of Pediatrics, La Jolla, USA (GRID:grid.266100.3) (ISNI:0000 0001 2107 4242) 
10  European Bioinformatics Institute, Hinxton, European Molecular Biology Laboratory, Cambridge, UK (GRID:grid.225360.0) (ISNI:0000 0000 9709 7726); St Vincent’s Institute of Medical Research, Fitzroy, Victoria, Australia (GRID:grid.1073.5) (ISNI:0000 0004 0626 201X) 
11  European Bioinformatics Institute, Hinxton, European Molecular Biology Laboratory, Cambridge, UK (GRID:grid.225360.0) (ISNI:0000 0000 9709 7726) 
12  Stanford University, Department of Pathology, Stanford, USA (GRID:grid.168010.e) (ISNI:0000000419368956); Stanford University, Department of Genetics, Stanford, USA (GRID:grid.168010.e) (ISNI:0000000419368956) 
13  University of California San Diego, Department of Pediatrics, La Jolla, USA (GRID:grid.266100.3) (ISNI:0000 0001 2107 4242) 
14  European Bioinformatics Institute, Hinxton, European Molecular Biology Laboratory, Cambridge, UK (GRID:grid.225360.0) (ISNI:0000 0000 9709 7726); European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany (GRID:grid.4709.a) (ISNI:0000 0004 0495 846X); German Cancer Research Center, Division of Computational Genomics and Systems Genetics, Heidelberg, Germany (GRID:grid.7497.d) (ISNI:0000 0004 0492 0584) 
Publication year
2020
Publication date
2020
Publisher
Nature Publishing Group
e-ISSN
20411723
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2412149337
Copyright
© The Author(s) 2020. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.