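AnnoPRO: a strategy for protein function annotation based on multi-scale protein representation and a hybrid deep learning of dual-path encoding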

Abstract

Protein function annotation is a longstanding challenge in the biological sciences, and various computational methods have been developed to address it. However, existing methods suffer from a serious long-tail problem: a large number of Gene Ontology (GO) families contain only a few annotated proteins. To address this, a strategy named AnnoPRO was constructed that combines sequence-based multi-scale protein representation, dual-path protein encoding using pre-training, and function annotation by long short-term memory (LSTM)-based decoding. Case studies on different benchmarks confirmed the superior performance of AnnoPRO among available methods. Source code and models are freely available at https://github.com/idrblab/AnnoPRO and https://zenodo.org/records/10012272
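As a rough illustration of the long-tail problem mentioned in the abstract, the sketch below counts how many proteins carry each GO term and reports the share of terms with very few annotated proteins. The protein IDs, term labels, and the helper names `term_frequencies` and `tail_fraction` are hypothetical placeholders chosen for this example; they are not taken from AnnoPRO or its benchmarks.

```python
from collections import Counter

# Hypothetical toy annotations (placeholder IDs, not real data):
# protein ID -> set of GO term labels assigned to that protein.
annotations = {
    "P1": {"GO:term_A", "GO:term_B"},
    "P2": {"GO:term_A"},
    "P3": {"GO:term_A", "GO:term_C"},
    "P4": {"GO:term_A", "GO:term_D"},
    "P5": {"GO:term_A"},
}


def term_frequencies(annotations):
    """Count how many proteins are annotated with each GO term."""
    counts = Counter()
    for terms in annotations.values():
        counts.update(terms)
    return counts


def tail_fraction(counts, max_proteins):
    """Fraction of GO terms annotated to at most `max_proteins` proteins."""
    if not counts:
        return 0.0
    rare = sum(1 for n in counts.values() if n <= max_proteins)
    return rare / len(counts)


counts = term_frequencies(annotations)
print(counts.most_common(1))                 # the single most frequent term
print(tail_fraction(counts, max_proteins=1)) # share of terms seen only once
```

In this toy set, one term dominates while most terms appear on a single protein, which is the kind of imbalance the abstract refers to; a real analysis would use an actual annotation file and a threshold chosen for that dataset.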

Details

Title

AnnoPRO: a strategy for protein function annotation based on multi-scale protein representation and a hybrid deep learning of dual-path encoding

Author

Zheng, Lingyan; Shi, Shuiyang; Lu, Mingkun; Pan, Fang; Pan, Ziqi; Zhang, Hongning; Zhou, Zhimeng; Zhang, Hanyu; Mou, Minjie; Huang, Shijie; Lin, Tao; Xia, Weiqi; Li, Honglin; Zeng, Zhenyu; Zhang, Shun; Chen, Yuzong

Pages

1-22

Section

Method

Publication year

2024

Publication date

2024

Publisher

Springer Nature B.V.

ISSN

14747596

e-ISSN

1474760X

Source type

Scholarly Journal

Language of publication

English

DOI

https://doi.org/10.1186/s13059-024-03166-1

ProQuest document ID

2925640474

© 2024. This work is licensed under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.