Abstract

Energetic local frustration offers a biophysical perspective for interpreting the effects of sequence variability on protein families. Here we present a methodology to analyze local frustration patterns within protein families and superfamilies that allows us to uncover constraints related to stability and function, and to identify differential frustration patterns in families with a common ancestry. We analyze these signals in well-studied protein families such as the PDZ, SH3, α and β globin, and RAS families. Recent advances in protein structure prediction make it possible to analyze the vast majority of protein space. An automatic and unsupervised proteome-wide analysis of the SARS-CoV-2 virus demonstrates the potential of our approach to enhance our understanding of the natural phenotypic diversity of protein families beyond single protein instances. We apply our method to modify biophysical properties of natural proteins based on their family properties, as well as to perform unsupervised analyses of large datasets that shed light on the physicochemical signatures of poorly characterized proteins, such as those belonging to emergent pathogens.

Energetic local frustration in proteins may have been positively selected by evolution when related to functions such as ligand binding and allostery, among others. Here the authors present a methodology to analyze local frustration patterns within protein families and superfamilies.

Details

Title
Local energetic frustration conservation in protein families and superfamilies
Author
Freiberger, Maria I. 1; Ruiz-Serra, Victoria 2; Pontes, Camila 2; Romero-Durana, Miguel 2; Galaz-Davison, Pablo 3; Ramírez-Sarmiento, Cesar A. 3; Schuster, Claudio D. 4; Marti, Marcelo A. 4; Wolynes, Peter G. 5; Ferreiro, Diego U. 1; Parra, R. Gonzalo 2; Valencia, Alfonso 6

1 Universidad de Buenos Aires, Laboratorio de Fisiología de Proteínas, Departamento de Química Biológica – IQUIBICEN/CONICET, Facultad de Ciencias Exactas y Naturales, Buenos Aires, Argentina (GRID:grid.7345.5) (ISNI:0000 0001 0056 1981)
2 Barcelona Supercomputing Center, Computational Biology Group, Life Sciences Department, Barcelona, Spain (GRID:grid.10097.3f) (ISNI:0000 0004 0387 1602)
3 Pontificia Universidad Católica de Chile, Institute for Biological and Medical Engineering, Schools of Engineering, Medicine, and Biological Sciences, Santiago, Chile (GRID:grid.7870.8) (ISNI:0000 0001 2157 0406); ANID - Millennium Science Initiative Program – Millennium Institute for Integrative Biology (iBio), Santiago, Chile (GRID:grid.511281.e)
4 Universidad de Buenos Aires, Laboratorio de Bioinformática, Departamento de Química Biológica – IQUIBICEN/CONICET, Facultad de Ciencias Exactas y Naturales, Buenos Aires, Argentina (GRID:grid.7345.5) (ISNI:0000 0001 0056 1981)
5 Rice University, Center for Theoretical Biological Physics and Department of Chemistry, Houston, USA (GRID:grid.21940.3e) (ISNI:0000 0004 1936 8278)
6 Barcelona Supercomputing Center, Computational Biology Group, Life Sciences Department, Barcelona, Spain (GRID:grid.10097.3f) (ISNI:0000 0004 0387 1602); Catalan Institution for Research and Advanced Studies (ICREA), Barcelona, Spain (GRID:grid.425902.8) (ISNI:0000 0000 9601 989X)
Pages
8379
Publication year
2023
Publication date
2023
Publisher
Nature Publishing Group
e-ISSN
20411723
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2902580934
Copyright
© The Author(s) 2023. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.