Abstract

Characterizing the effect of mutations is key to understanding the evolution of protein sequences and to separating neutral amino-acid changes from deleterious ones. Epistatic interactions between residues can make the effect of a mutation depend on its sequence context. This context dependence constrains the amino-acid changes that can contribute to polymorphism in the short term, and the ones that can accumulate between species in the long term. We use computational approaches to accurately predict, from the analysis of distant homologues, the polymorphisms segregating in a panel of 61,157 Escherichia coli genomes. By comparing a context-aware Direct-Coupling Analysis model to a non-epistatic approach, we show that the genetic context strongly constrains the tolerable amino acids in 30% to 50% of amino-acid sites. The study of more distant species suggests that genetic context builds up gradually over long evolutionary timescales through the accumulation of small epistatic contributions.
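
The contrast between a context-aware and a non-epistatic score can be illustrated with a minimal sketch. It assumes a generic Potts-style parameterization, with site fields `h` and pairwise couplings `J`, of the kind commonly used in Direct-Coupling Analysis. The function names, the two-state alphabet, and all parameter values are hypothetical toy choices for illustration, not the authors' fitted model or code.

```python
import numpy as np

# Toy setting (assumption): 3 sites, 2 possible states per site.
L, q = 3, 2
h = np.zeros((L, q))          # site-specific fields, all zero here
J = np.zeros((L, L, q, q))    # pairwise couplings J[i, j, a, b]
# A single hypothetical coupling: state 1 at site 0 is favoured
# when site 1 also carries state 1. Stored symmetrically.
J[0, 1, 1, 1] = 1.0
J[1, 0, 1, 1] = 1.0


def independent_score(h, i, a, b):
    """Non-epistatic score of mutating a -> b at site i.

    Depends only on site i, so it is the same in every background.
    """
    return float(h[i, b] - h[i, a])


def epistatic_score(h, J, seq, i, b):
    """Context-aware score of mutating seq[i] -> b in background seq.

    Log-probability ratio under P(s) ~ exp(sum_i h_i(s_i)
    + sum_{i<j} J_ij(s_i, s_j)); higher means more tolerated.
    """
    a = seq[i]
    delta = h[i, b] - h[i, a]
    for j, c in enumerate(seq):
        if j != i:
            delta += J[i, j, b, c] - J[i, j, a, c]
    return float(delta)


# Same mutation (site 0: state 0 -> 1) in two different backgrounds.
background_a = [0, 0, 0]
background_b = [0, 1, 0]
print(independent_score(h, 0, 0, 1))
print(epistatic_score(h, J, background_a, 0, 1))
print(epistatic_score(h, J, background_b, 0, 1))
```

In this toy case the non-epistatic score is identical in both backgrounds, while the context-aware score changes once site 1 carries the coupled state, which is the kind of context dependence the abstract refers to.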

Predicting the effects of mutations in a species is a major challenge in genetics. Here, the authors investigate protein sequence landscapes using diverged E. coli sequences, to predict tolerated mutations and capture interactions between mutations.

Details

Title
Deciphering polymorphism in 61,157 Escherichia coli genomes via epistatic sequence landscapes
Author
Vigué, Lucile 1; Croce, Giancarlo 2; Petitjean, Marie 1; Ruppé, Etienne 3; Tenaillon, Olivier 1; Weigt, Martin 4

1 Université Paris Cité and Université Sorbonne Paris Nord, Inserm, IAME, Paris, France (GRID:grid.508487.6) (ISNI:0000 0004 7885 7602)
2 University of Lausanne, Department of Oncology, Ludwig Institute for Cancer Research Lausanne, Lausanne, Switzerland (GRID:grid.9851.5) (ISNI:0000 0001 2165 4204); Swiss Institute of Bioinformatics—SIB, Lausanne, Switzerland (GRID:grid.419765.8) (ISNI:0000 0001 2223 3006)
3 Université Paris Cité and Université Sorbonne Paris Nord, Inserm, IAME, Paris, France (GRID:grid.508487.6) (ISNI:0000 0004 7885 7602); Hôpital Bichat, APHP, Laboratoire de Bactériologie, Paris, France (GRID:grid.411119.d) (ISNI:0000 0000 8588 831X)
4 Computational and Quantitative Biology—LCQB, Sorbonne Université, CNRS, Institut de Biologie Paris Seine, Paris, France (GRID:grid.503320.7) (ISNI:0000 0004 0459 3739)
Publication year
2022
Publication date
2022
Publisher
Nature Publishing Group
e-ISSN
20411723
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2688286890
Copyright
© The Author(s) 2022. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.