Abstract

Recent advances in genomic sequencing technology and computational assembly methods have allowed scientists to improve reference genome assemblies in terms of contiguity and composition. EquCab2, a reference genome for the domestic horse, was released in 2007. Although of equal or better quality compared to other first-generation Sanger assemblies, it had many of the shortcomings common to them. In 2014, the equine genomics research community began a project to improve the reference sequence for the horse, building upon the solid foundation of EquCab2 and incorporating new short-read data, long-read data, and proximity ligation data. Here, we present EquCab3. The count of non-N bases in the incorporated chromosomes is improved from 2.33 Gb in EquCab2 to 2.41 Gb in EquCab3. Contiguity has also been improved nearly 40-fold, with a contig N50 of 4.5 Mb, and scaffold contiguity has been enhanced to the point where all but one of the 32 chromosomes are composed of a single scaffold.

Theodore Kalbfleisch et al. present an improved genome assembly for the domestic horse by combining short-read, long-read, and proximity ligation data. They improve the contiguity of the assembly nearly 40-fold, with a 10-fold reduction in gaps.

Details

Title
Improved reference genome for the domestic horse increases assembly contiguity and composition
Author
Kalbfleisch, Theodore S 1 ; Rice, Edward S 2 ; DePriest, Michael S, Jr 1 ; Walenz, Brian P 3 ; Hestand, Matthew S 4 ; Vermeesch, Joris R 4 ; O’Connell, Brendan L 5 ; Fiddes, Ian T 6 ; Vershinina, Alisa O 7 ; Saremi, Nedda F 2 ; Petersen, Jessica L 8 ; Finno, Carrie J 9 ; Bellone, Rebecca R 10 ; McCue, Molly E 11 ; Brooks, Samantha A 12 ; Bailey, Ernest 13 ; Orlando, Ludovic 14 ; Green, Richard E 2 ; Miller, Donald C 15 ; Antczak, Douglas F 15 ; MacLeod, James N 13

1  University of Louisville, Department of Biochemistry and Molecular Genetics, School of Medicine, Louisville, USA (GRID:grid.266623.5) (ISNI:0000 0001 2113 1622)
2  UC Santa Cruz, Department of Biomolecular Engineering, Santa Cruz, USA (GRID:grid.205975.c) (ISNI:0000 0001 0740 6917)
3  National Institutes of Health, Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, Bethesda, USA (GRID:grid.94365.3d) (ISNI:0000 0001 2297 5165)
4  Katholieke University Leuven (KU Leuven), Center for Human Genetics, Leuven, Belgium (GRID:grid.5596.f) (ISNI:0000 0001 0668 7884)
5  UC Santa Cruz, Department of Biomolecular Engineering, Santa Cruz, USA (GRID:grid.205975.c) (ISNI:0000 0001 0740 6917); Oregon Health and Science University, Medical and Molecular Genetics, Portland, USA (GRID:grid.5288.7) (ISNI:0000 0000 9758 5690)
6  UC Santa Cruz, Department of Biomolecular Engineering, Santa Cruz, USA (GRID:grid.205975.c) (ISNI:0000 0001 0740 6917); 10x Genomics, Inc., Pleasanton, USA (GRID:grid.498512.3)
7  UC Santa Cruz, Department of Ecology and Evolutionary Biology, Santa Cruz, USA (GRID:grid.205975.c) (ISNI:0000 0001 0740 6917)
8  University of Nebraska – Lincoln, Department of Animal Science, Lincoln, USA (GRID:grid.24434.35) (ISNI:0000 0004 1937 0060)
9  University of California, Department of Population Health and Reproduction, Davis, USA (GRID:grid.27860.3b) (ISNI:0000 0004 1936 9684)
10  University of California, Department of Population Health and Reproduction, Davis, USA (GRID:grid.27860.3b) (ISNI:0000 0004 1936 9684); University of California, Veterinary Genetics Laboratory, Davis, USA (GRID:grid.27860.3b) (ISNI:0000 0004 1936 9684) 
11  University of Minnesota, Department of Veterinary Population Medicine, St. Paul, USA (GRID:grid.17635.36) (ISNI:0000 0004 1936 8657)
12  University of Florida, UF Genetics Institute, Department of Animal Sciences, Gainesville, USA (GRID:grid.15276.37) (ISNI:0000 0004 1936 8091) 
13  University of Kentucky, Gluck Equine Research Center, Department of Veterinary Science, Lexington, USA (GRID:grid.266539.d) (ISNI:0000 0004 1936 8438) 
14  Natural History Museum of Denmark, Centre for GeoGenetics, Copenhagen, Denmark (GRID:grid.5254.6) (ISNI:0000 0001 0674 042X); Université Paul Sabatier, Laboratoire d’Anthropobiologie Moléculaire et d’Imagerie de Synthèse UMR 5288, Université de Toulouse, CNRS, Toulouse, France (GRID:grid.15781.3a) (ISNI:0000 0001 0723 035X) 
15  Cornell University, Baker Institute for Animal Health, College of Veterinary Medicine, Ithaca, USA (GRID:grid.5386.8) (ISNI:0000 0004 1936 877X)
Publication year
2018
Publication date
2018
Publisher
Nature Publishing Group
e-ISSN
2399-3642
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2389702910
Copyright
© The Author(s) 2018. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.