© 2022 Singer et al. This is an open access article distributed under the terms of the Creative Commons Attribution License: http://creativecommons.org/licenses/by/4.0/ (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Engineered proteins generally must possess a stable structure in order to achieve their designed function. Stable designs, however, are astronomically rare within the space of all possible amino acid sequences. As a consequence, many designs must be tested computationally and experimentally in order to find stable ones, which is expensive in terms of time and resources. Here we use a high-throughput, low-fidelity assay to experimentally evaluate the stability of approximately 200,000 novel proteins. These include a wide range of sequence perturbations, providing a baseline for future work in the field. We build a neural network model that predicts protein stability given only sequences of amino acids, and compare its performance to the assayed values. We also report another network model that is able to generate the amino acid sequences of novel stable proteins given requested secondary sequences. Finally, we show that the predictive model—despite weaknesses including a noisy data set—can be used to substantially increase the stability of both expert-designed and model-generated proteins.

Details

Title
Large-scale design and refinement of stable proteins using sequence-only models
Author
Singer, Jedediah M; Novotney, Scott; Strickland, Devin; Haddox, Hugh K; Leiby, Nicholas; Rocklin, Gabriel J; Chow, Cameron M; Roy, Anindya; Bera, Asim K; Motta, Francis C; Cao, Longxing; Strauch, Eva-Maria; Chidyausiku, Tamuka M; Ford, Alex; Ho, Ethan; Zaitzeff, Alexander; Mackenzie, Craig O; Eramian, Hamed; DiMaio, Frank; Grigoryan, Gevorg; Vaughn, Matthew; Stewart, Lance J; Baker, David; Klavins, Eric
First page
e0265020
Section
Research Article
Publication year
2022
Publication date
Mar 2022
Publisher
Public Library of Science
e-ISSN
1932-6203
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2638936796