Abstract

The complete human genome sequence is used as a reference for next-generation sequencing analyses. However, some ethnic ancestries are under-represented in the reference genome (e.g., GRCh37) due to its bias toward European and African ancestries. Here, we perform de novo assembly of three Japanese male genomes using > 100× Pacific Biosciences long reads and Bionano Genomics optical maps per sample. We integrate the genomes using the major allele for consensus and anchor the scaffolds using genetic and radiation hybrid maps to reconstruct each chromosome. The resulting genome sequence, JG1, is contiguous, accurate, and carries the Japanese major allele at most loci. We adopt JG1 as the reference for confirmatory exome re-analyses of seven rare-disease Japanese families and find that re-analysis using JG1 reduces total candidate variant calls versus GRCh37 while retaining disease-causing variants. These results suggest that integrating multiple genomes from a single population can aid genome analyses of that population.
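The major-allele consensus step mentioned above can be illustrated with a toy sketch. It is not the authors' pipeline: it assumes the three assemblies have already been aligned so that each site yields one allele per sample, and it ignores indels and structural variants. The function names and the tie rule (keep the first sample's allele when all three differ) are hypothetical choices made for this example only.

```python
from collections import Counter


def major_allele(alleles):
    """Return the most frequent allele at one aligned site.

    Tie rule (an assumption of this sketch): if several alleles are
    equally common, keep the one that appears first in sample order.
    """
    counts = Counter(alleles)
    top = max(counts.values())
    for allele in alleles:
        if counts[allele] == top:
            return allele


def consensus(sites):
    """Join the per-site major alleles into one consensus string."""
    return "".join(major_allele(site) for site in sites)


# Four aligned sites, one allele from each of three hypothetical samples.
sites = [
    ("A", "A", "G"),  # two of three carry A
    ("C", "T", "T"),  # two of three carry T
    ("G", "G", "G"),  # unanimous
    ("A", "C", "T"),  # three-way tie, falls back to the first sample
]
print(consensus(sites))  # ATGA
```

With only three samples, a site is either unanimous, split two to one, or a three-way tie, so the tie rule matters only in the last case.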

Human reference genomes are typically constructed from few individuals, and are biased towards European and African genomes. Here, the authors assemble three Japanese genomes to create a population-specific reference genome. They then demonstrate improved variant calling from exome sequencing with this reference genome.

Details

Title
Construction and integration of three de novo Japanese human genome assemblies toward a population-specific reference
Author
Takayama, Jun 1; Tadaka Shu 2; Yano Kenji 3; Katsuoka Fumiki 4; Gocho Chinatsu 2; Funayama Takamitsu 2; Makino Satoshi 2; Okamura Yasunobu 4; Kikuchi Atsuo 5; Sugimoto Sachiyo 2; Kawashima Junko 2; Otsuki Akihito 2; Sakurai-Yageta Mika 2; Yasuda, Jun 6; Kure Shigeo 7; Kinoshita Kengo 8; Yamamoto Masayuki 4; Tamiya Gen 9

1 Tohoku University, Advanced Research Center for Innovations in Next-Generation Medicine, Sendai, Japan (GRID:grid.69566.3a) (ISNI:0000 0001 2248 6943); Tohoku University, Tohoku Medical Megabank Organization, Sendai, Japan (GRID:grid.69566.3a) (ISNI:0000 0001 2248 6943); Statistical Genetics Team, RIKEN Center for Advanced Intelligence Project, Nihonbashi 1-chome Mitsui Building 15F, Chuo-ku, Japan (GRID:grid.7597.c) (ISNI:0000000094465255)
2 Tohoku University, Tohoku Medical Megabank Organization, Sendai, Japan (GRID:grid.69566.3a) (ISNI:0000 0001 2248 6943)
3 Tohoku University, Tohoku Medical Megabank Organization, Sendai, Japan (GRID:grid.69566.3a) (ISNI:0000 0001 2248 6943); Statistical Genetics Team, RIKEN Center for Advanced Intelligence Project, Nihonbashi 1-chome Mitsui Building 15F, Chuo-ku, Japan (GRID:grid.7597.c) (ISNI:0000000094465255)
4 Tohoku University, Advanced Research Center for Innovations in Next-Generation Medicine, Sendai, Japan (GRID:grid.69566.3a) (ISNI:0000 0001 2248 6943); Tohoku University, Tohoku Medical Megabank Organization, Sendai, Japan (GRID:grid.69566.3a) (ISNI:0000 0001 2248 6943)
5 Tohoku University Graduate School of Medicine, Department of Pediatrics, Sendai, Japan (GRID:grid.69566.3a) (ISNI:0000 0001 2248 6943)
6 Tohoku University, Tohoku Medical Megabank Organization, Sendai, Japan (GRID:grid.69566.3a) (ISNI:0000 0001 2248 6943); Miyagi Cancer Center Research Institute, Division of Molecular and Cellular Oncology, Natori, Japan (GRID:grid.419939.f) (ISNI:0000 0004 5899 0430)
7 Tohoku University, Tohoku Medical Megabank Organization, Sendai, Japan (GRID:grid.69566.3a) (ISNI:0000 0001 2248 6943); Tohoku University Graduate School of Medicine, Department of Pediatrics, Sendai, Japan (GRID:grid.69566.3a) (ISNI:0000 0001 2248 6943)
8 Tohoku University, Advanced Research Center for Innovations in Next-Generation Medicine, Sendai, Japan (GRID:grid.69566.3a) (ISNI:0000 0001 2248 6943); Tohoku University, Tohoku Medical Megabank Organization, Sendai, Japan (GRID:grid.69566.3a) (ISNI:0000 0001 2248 6943); Tohoku University, Graduate School of Information Sciences, Sendai, Japan (GRID:grid.69566.3a) (ISNI:0000 0001 2248 6943)
9 Tohoku University, Advanced Research Center for Innovations in Next-Generation Medicine, Sendai, Japan (GRID:grid.69566.3a) (ISNI:0000 0001 2248 6943); Tohoku University, Tohoku Medical Megabank Organization, Sendai, Japan (GRID:grid.69566.3a) (ISNI:0000 0001 2248 6943); Statistical Genetics Team, RIKEN Center for Advanced Intelligence Project, Nihonbashi 1-chome Mitsui Building 15F, Chuo-ku, Japan (GRID:grid.7597.c) (ISNI:0000000094465255); Tohoku University Graduate School of Medicine, Sendai, Japan (GRID:grid.69566.3a) (ISNI:0000 0001 2248 6943)
Publication year
2021
Publication date
2021
Publisher
Nature Publishing Group
e-ISSN
20411723
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2476743548
Copyright
© The Author(s) 2021. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.