Ensembling methods for protein-ligand binding affinity prediction

Abstract

Protein-ligand binding affinity prediction is a key element of computer-aided drug discovery. Most existing deep learning methods for this task rely on a single model and suffer from low accuracy and limited generalization capability. In this paper, we train 13 deep learning models from combinations of 5 input features and then explore all possible ensembles of the trained models to find the best-performing ones. The models use cross-attention and self-attention layers to extract short- and long-range interactions. Our method, named Ensemble Binding Affinity (EBA), predicts the binding affinity of a protein-ligand complex by combining information from models trained on different combinations of input features, using simple 1D sequential and structural features of the protein-ligand complexes rather than 3D complex features. One of our ensembles achieves the highest Pearson correlation coefficient (R) of 0.914 and the lowest root mean square error (RMSE) of 0.957 on the well-known benchmark test set CASF2016. Our ensembles improve on the second-best predictor, CAPLA, by more than 15% in R and 19% in RMSE on both well-known CSAR-HiQ benchmark test sets. Furthermore, the ensembles outperform existing state-of-the-art protein-ligand binding affinity prediction methods across all metrics on all five benchmark test datasets, which demonstrates the effectiveness and robustness of our approach. Our approach can therefore help improve the success rate of potential drugs and accelerate the drug development process.
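The ensemble search described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how member predictions are combined, so the unweighted mean used here is an assumption, as are the function names, the `min_size` parameter, the tie-breaking rule, and the toy data. With 13 trained models there are 2^13 - 1 = 8191 non-empty subsets, 8178 of which contain at least two models, so an exhaustive search of this kind is cheap once each model's predictions are cached.

```python
from itertools import combinations
from math import sqrt


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / sqrt(var_t * var_p)


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def best_ensemble(predictions, y_true, min_size=2):
    """Score every subset of models and return the one with the lowest RMSE.

    predictions: dict mapping a (hypothetical) model name to its list of
    predicted affinities, aligned with y_true. Ties in RMSE are broken by
    the higher Pearson R.
    """
    names = sorted(predictions)
    best = None
    for size in range(min_size, len(names) + 1):
        for subset in combinations(names, size):
            # Assumption: ensemble output = unweighted mean of member predictions.
            avg = [
                sum(predictions[m][i] for m in subset) / size
                for i in range(len(y_true))
            ]
            score = (rmse(y_true, avg), -pearson_r(y_true, avg))
            if best is None or score < best[0]:
                best = (score, subset)
    (err, neg_r), subset = best
    return subset, err, -neg_r


# Toy data, made up for illustration only (not taken from the paper).
y_true = [4.0, 6.0, 8.0, 5.0]
predictions = {
    "model_a": [5.0, 7.0, 9.0, 6.0],  # consistently 1.0 too high
    "model_b": [3.0, 5.0, 7.0, 4.0],  # consistently 1.0 too low
    "model_c": [4.5, 6.5, 7.0, 5.5],
}
subset, err, r = best_ensemble(predictions, y_true)
print(subset, round(err, 3), round(r, 3))
```

In the toy data the errors of `model_a` and `model_b` cancel when averaged, which is the intuition behind ensembling: members with complementary errors can outperform each individual model. In practice the subset would be selected on a validation set and only then scored on held-out benchmarks such as CASF2016.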

Details

Title

Ensembling methods for protein-ligand binding affinity prediction

Author

Mohamed Abdul Cader, Jiffriya¹; Newton, M. A. Hakim²; Rahman, Julia³; Mohamed Abdul Cader, Akmal Jahan⁴; Sattar, Abdul⁵

¹ Griffith University, School of Information and Communication Technology, Nathan Campus, Australia (GRID:grid.1022.1) (ISNI:0000 0004 0437 5432); Sri Lanka Institute of Advanced Technological Education, Department of IT, Colombo, Sri Lanka (GRID:grid.1022.1)
² Griffith University, Institute for Integrated and Intelligent Systems (IIIS), Nathan Campus, Australia (GRID:grid.1022.1) (ISNI:0000 0004 0437 5432); The University of Newcastle, School of Information and Physical Sciences, Callaghan, Australia (GRID:grid.266842.c) (ISNI:0000 0000 8831 109X)
³ Griffith University, School of Information and Communication Technology, Nathan Campus, Australia (GRID:grid.1022.1) (ISNI:0000 0004 0437 5432); Rajshahi University of Engineering and Technology, Department of Computer Science and Engineering, Rajshahi, Bangladesh (GRID:grid.443086.d) (ISNI:0000 0004 1755 355X)
⁴ South Eastern University of Sri Lanka, Department of Computer Science, Faculty of Applied Sciences, Sammanthurai, Sri Lanka (GRID:grid.443394.d) (ISNI:0000 0004 0453 0316)
⁵ Griffith University, School of Information and Communication Technology, Nathan Campus, Australia (GRID:grid.1022.1) (ISNI:0000 0004 0437 5432); Griffith University, Institute for Integrated and Intelligent Systems (IIIS), Nathan Campus, Australia (GRID:grid.1022.1) (ISNI:0000 0004 0437 5432)

Pages

24447

Publication year

2024

Publication date

2024

Publisher

Nature Publishing Group

e-ISSN

20452322

Source type

Scholarly Journal

Language of publication

English

DOI

https://doi.org/10.1038/s41598-024-72784-3

ProQuest document ID

3118122318

© The Author(s) 2024. This work is published under http://creativecommons.org/licenses/by-nc-nd/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
