Abstract

Angiogenesis is a key process for the proliferation and metastatic spread of cancer cells. Anti-angiogenic peptides (AAPs), which can inhibit angiogenesis, are promising candidates for cancer treatment. We propose AAPL, a sequence-based predictor that identifies AAPs using machine learning models with improved prediction accuracy. Each peptide sequence was transformed into a vector of 4335 numeric values derived from 58 different feature types, followed by a heuristic algorithm for feature selection. Next, the hyperparameters of six machine learning models were optimized with respect to the selected feature subset. We considered two datasets, one with entire peptide sequences and the other with the 15 amino acids at the peptide N-termini. AAPL achieved Matthews correlation coefficients of 0.671 and 0.756 in independent tests on the two datasets, respectively, outperforming existing predictors by 5.3% to 24.6%. Further analyses show that AAPL yields higher prediction accuracy for peptides with more hydrophobic residues and fewer hydrophilic and charged residues. The source code of AAPL is available at https://github.com/yunzheng2002/Anti-angiogenic.
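
As an illustration of two quantities named in the abstract, the following minimal Python sketch shows how a peptide might be truncated to its 15 N-terminal residues and how the Matthews correlation coefficient is computed from confusion-matrix counts. The function names `n_terminal` and `matthews_corrcoef` are hypothetical and are not taken from the AAPL source code; the zero-denominator convention is an assumption.

```python
import math


def n_terminal(sequence: str, length: int = 15) -> str:
    """Return the first `length` residues (N-terminal end) of a peptide.

    Hypothetical helper; sequences shorter than `length` are returned whole.
    """
    return sequence[:length]


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts.

    MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).
    Assumption: return 0.0 when the denominator is zero.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom
```

An MCC of 1 corresponds to perfect prediction, 0 to prediction no better than chance, and -1 to total disagreement, which is why it is often preferred over plain accuracy when positive and negative classes are imbalanced.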

Details

Title
Improved prediction of anti-angiogenic peptides based on machine learning models and comprehensive features from peptide sequences
Author
Lee, Yun-Chen 1 ; Yu, Jen-Chieh 2 ; Ni, Kuan 3 ; Lin, Yu-Chuan 2 ; Chen, Ching-Tai 4 

1 Asia University, Department of Computer Science and Information Engineering, Taichung, Taiwan (GRID:grid.252470.6) (ISNI:0000 0000 9263 9645)
2 Asia University, Department of Bioinformatics and Medical Engineering, Taichung, Taiwan (GRID:grid.252470.6) (ISNI:0000 0000 9263 9645)
3 National Chung Hsing University, Graduate Institute of Genomics and Bioinformatics, Taichung, Taiwan (GRID:grid.260542.7) (ISNI:0000 0004 0532 3749)
4 Asia University, Department of Bioinformatics and Medical Engineering, Taichung, Taiwan (GRID:grid.252470.6) (ISNI:0000 0000 9263 9645); Asia University, Center for Precision Health Research, Taichung, Taiwan (GRID:grid.252470.6) (ISNI:0000 0000 9263 9645)
Pages
14387
Publication year
2024
Publication date
2024
Publisher
Nature Publishing Group
e-ISSN
20452322
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
3071129414
Copyright
© The Author(s) 2024. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.