Abstract

Large pretrained protein language models (PLMs) have improved protein property and structure prediction from sequences via transfer learning, in which weights and representations from PLMs are repurposed for downstream tasks. Although PLMs have shown great promise, currently there is little understanding of how the features learned by pretraining relate to and are useful for downstream tasks. We perform a systematic analysis of transfer learning using PLMs, conducting 370 experiments across a comprehensive suite of factors including different downstream tasks, architectures, model sizes, model depths, and pretraining time. We observe that while almost all downstream tasks do benefit from pretrained models compared to naive sequence representations, for the majority of tasks performance does not scale with pretraining, and instead relies on low-level features learned early in pretraining. Our results point to a mismatch between current PLM pretraining paradigms and most applications of these models, indicating a need for better pretraining methods.
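The transfer-learning setup summarized above can be made concrete with a small sketch: embed each sequence with a pretrained PLM, embed it again with a naive representation, and fit the same simple downstream head on both feature sets. The specific model (ESM-2 8M via the fair-esm package), the toy sequences and labels, and the ridge-regression head below are illustrative assumptions, not the configurations evaluated in the paper.

```python
# Minimal sketch (not the authors' pipeline): compare pretrained PLM embeddings
# against a naive one-hot baseline using the same downstream head.
import numpy as np
import torch
import esm
from sklearn.linear_model import Ridge

# Hypothetical toy dataset: (name, sequence, scalar property) triples.
data = [
    ("seq1", "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ", 0.7),
    ("seq2", "MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDK", 0.2),
]
names = [d[0] for d in data]
seqs = [d[1] for d in data]
y = np.array([d[2] for d in data])

# --- Pretrained representation: mean-pooled final-layer ESM-2 embeddings ---
model, alphabet = esm.pretrained.esm2_t6_8M_UR50D()
model.eval()
batch_converter = alphabet.get_batch_converter()
_, _, tokens = batch_converter(list(zip(names, seqs)))
with torch.no_grad():
    out = model(tokens, repr_layers=[6])
reps = out["representations"][6]  # shape: (batch, tokens, embedding_dim)
# Average over real residues (positions 1..len), skipping BOS/EOS tokens.
X_plm = np.stack(
    [reps[i, 1:len(s) + 1].mean(dim=0).numpy() for i, s in enumerate(seqs)]
)

# --- Naive baseline: flattened one-hot encoding, padded to a fixed length ---
AA = "ACDEFGHIKLMNPQRSTVWY"
max_len = max(len(s) for s in seqs)

def one_hot(seq):
    m = np.zeros((max_len, len(AA)))
    for i, a in enumerate(seq):
        if a in AA:
            m[i, AA.index(a)] = 1.0
    return m.flatten()

X_naive = np.stack([one_hot(s) for s in seqs])

# Same simple downstream head on both feature sets.
for label, X in [("PLM embeddings", X_plm), ("one-hot baseline", X_naive)]:
    head = Ridge(alpha=1.0).fit(X, y)
    print(label, "train R^2:", head.score(X, y))
```

In the abstract's framing, a benefit of pretraining shows up as the PLM features outperforming the naive baseline on a downstream task, and scaling would show up as that margin growing when larger or longer-pretrained models are substituted.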

Competing Interest Statement

The authors have declared no competing interest.

Footnotes

* Revised reference to ProteinBERT

Details

Title
Feature Reuse and Scaling: Understanding Transfer Learning with Protein Language Models
Author
Li, Francesca-Zhoufan; Amini, Ava Pardis; Yue, Yisong; Yang, Kevin K; Lu, Alex X
Section
New Results
Publication year
2024
Publication date
Feb 14, 2024
Publisher
Cold Spring Harbor Laboratory Press
ISSN
2692-8205
Source type
Working Paper
Language of publication
English
ProQuest document ID
2923516589
Copyright
© 2024. This article is published under http://creativecommons.org/licenses/by/4.0/ (“the License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.