Abstract

Fusion oncoproteins, a class of chimeric proteins arising from chromosomal translocations, are major drivers of various pediatric cancers. These proteins are intrinsically disordered and lack druggable pockets, making them highly challenging therapeutic targets for both small molecule-based and structure-based approaches. Protein language models (pLMs) have recently emerged as powerful tools for capturing physicochemical and functional protein features but have yet to be trained on fusion oncoprotein sequences. We introduce FusOn-pLM, a fine-tuned pLM trained on a newly curated, comprehensive set of fusion oncoprotein sequences, FusOn-DB. Employing a unique cosine-scheduled masked language modeling strategy, FusOn-pLM dynamically adjusts masking rates (15%–40%) to optimize feature extraction and representation quality, surpassing baseline embeddings in fusion-specific tasks, including localization, puncta formation, and disorder prediction. FusOn-pLM uniquely predicts drug-resistant mutations, providing insights for therapeutic design that anticipates resistance mechanisms. In total, FusOn-pLM provides biologically relevant representations for advancing therapeutic discovery in fusion-driven cancers.
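The cosine-scheduled masking rate mentioned in the abstract can be illustrated with a minimal sketch. The abstract gives only the range (15%–40%); the direction of the schedule (rising from low to high over training), the half-cosine shape, and the names `cosine_mask_rate`, `mask_sequence`, and `<mask>` are assumptions made for illustration, not the authors' implementation.

```python
import math
import random


def cosine_mask_rate(step, total_steps, low=0.15, high=0.40):
    """Masking rate at a given step, following a half-cosine curve.

    Assumption: the rate rises smoothly from `low` to `high` as training
    progresses. The source states only the 15%-40% range.
    """
    if total_steps <= 1:
        return high
    # Clamp progress to [0, 1] so out-of-range steps stay within bounds.
    progress = min(max(step / (total_steps - 1), 0.0), 1.0)
    return low + 0.5 * (high - low) * (1.0 - math.cos(math.pi * progress))


def mask_sequence(sequence, rate, rng, mask_token="<mask>"):
    """Mask roughly `rate` of the residues in a protein sequence.

    Returns the token list and the sorted masked positions.
    `mask_token` is a hypothetical placeholder, not a specific tokenizer's symbol.
    """
    n_mask = min(len(sequence), max(1, round(rate * len(sequence))))
    positions = sorted(rng.sample(range(len(sequence)), n_mask))
    tokens = list(sequence)
    for i in positions:
        tokens[i] = mask_token
    return tokens, positions


if __name__ == "__main__":
    rng = random.Random(0)
    toy_sequence = "MKTAYIAKQRQISFVKSHFS"  # toy 20-residue sequence, not from FusOn-DB
    for step in (0, 50, 99):
        rate = cosine_mask_rate(step, total_steps=100)
        tokens, positions = mask_sequence(toy_sequence, rate, rng)
        print(f"step={step:3d} rate={rate:.3f} masked={len(positions)}")
```

In a real fine-tuning loop, the rate returned at each step would set the masking probability used when building the masked language modeling batch; how FusOn-pLM does this in practice is not specified in this record.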

Fusion oncoproteins drive paediatric cancers but are challenging to target because of their intrinsic disorder and lack of druggable pockets. Here, the authors present FusOn-pLM, a protein language model trained on FusOn-DB that uses dynamic masking to outperform baselines in fusion-specific tasks and to predict drug-resistant mutations, advancing therapeutic design.

Details

Title
FusOn-pLM: a fusion oncoprotein-specific language model via adjusted rate masking
Publication title
Volume
16
Issue
1
Pages
1436
Publication year
2025
Publication date
2025
Publisher
Nature Publishing Group
Place of publication
London
Country of publication
United States
Publication subject
e-ISSN
20411723
Source type
Scholarly Journal
Language of publication
English
Document type
Journal Article
Publication history
Online publication date
2025-02-07
Milestone dates
2025-01-30 (Registration); 2024-06-03 (Received); 2025-01-24 (Accepted)
First posting date
2025-02-07
ProQuest document ID
3164511662
Document URL
https://www.proquest.com/scholarly-journals/fuson-plm-fusion-oncoprotein-specific-language/docview/3164511662/se-2?accountid=208611
Copyright
Copyright Nature Publishing Group 2025
Last updated
2025-07-27
Database
ProQuest One Academic