Drug–target interaction (DTI) prediction serves as an important step in the process of drug discovery1–3. Traditional in vitro measurement is reliable but costly and time-consuming, which prevents its application to large-scale data4. By contrast, identifying high-confidence DTI pairs by in silico approaches can greatly narrow down the search scope of compound candidates and provide insights into the causes of potential side effects in drug combinations. Therefore, in silico approaches have gained increasing attention and made much progress in the past few years5,6.
For in silico approaches, traditional structure-based and ligand-based virtual screening methods have been studied widely for their relatively effective performance7. However, structure-based virtual screening requires molecular docking simulation, which is not applicable if the target protein’s three-dimensional (3D) structure is unknown. Furthermore, ligand-based virtual screening predicts new active molecules based on the known actives of the same protein, but the performance is poor when the number of known actives is insufficient8.
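To make the ligand-based limitation concrete, the following is a minimal sketch of similarity-based screening. It assumes, purely for illustration, that each molecule is represented as a set of integer fingerprint bits and that candidates are ranked by their best Tanimoto similarity to any known active; the function names and toy fingerprints are hypothetical and are not taken from the cited methods.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two fingerprint bit sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def rank_candidates(known_actives, candidates):
    """Score each candidate by its highest similarity to any known active.

    With no known actives, every candidate scores 0.0, so the ranking
    carries no information.
    """
    scores = {
        name: max((tanimoto(fp, active) for active in known_actives), default=0.0)
        for name, fp in candidates.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy fingerprints (hypothetical bit indices, not real molecules).
known_actives = [{1, 2, 3, 4}, {2, 3, 5}]
candidates = {"cand_a": {1, 2, 3}, "cand_b": {7, 8}}
ranking = rank_candidates(known_actives, candidates)
```

When `known_actives` is empty or very small, the scores collapse toward zero or depend on a handful of reference molecules, which mirrors the poor performance described above.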
More recently, deep learning-based approaches have rapidly progressed for computational DTI prediction due to their successes in other areas, enabling large-scale validation in a relatively short time9. Many of them are constructed from a chemogenomics perspective3,10, which integrates the chemical space, genomic space and interaction information into a unified end-to-end framework. As the number of biological targets that have available 3D structures is limited, many deep learning-based models take linear or two-dimensional (2D) structural information of drugs and proteins as inputs. They treat DTI prediction as a binary classification task, and make predictions by feeding the inputs into different deep encoding and decoding modules such as deep neural network (DNN)11,12, graph neural network (GNN)9,13–15 or transformer architectures16,17. With the advances of deep learning techniques, such models can automatically learn data-driven representations of drugs and proteins from large-scale DTI data instead of using only pre-defined descriptors.
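The general setup described above (linear inputs, separate encoders, a binary output) can be sketched as follows. This is an untrained toy under stated assumptions, not any of the cited architectures: the vocabularies, the mean-pooled embedding "encoder", the logistic "decoder" and all names are hypothetical stand-ins for the DNN, GNN or transformer modules used in practice.

```python
import math
import random

# Assumed vocabularies: a small subset of SMILES characters and the 20 standard amino acids.
SMILES_VOCAB = "CNOSFPcnos()[]=#+-123456"
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def encode(sequence, vocab, max_len):
    """Map characters to integer indices (0 = padding or unknown), fixed to max_len."""
    index = {ch: i + 1 for i, ch in enumerate(vocab)}
    ids = [index.get(ch, 0) for ch in sequence[:max_len]]
    return ids + [0] * (max_len - len(ids))


def make_embedding(vocab_size, dim, rng):
    """Random embedding table; row 0 is the padding vector and stays zero."""
    table = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(vocab_size + 1)]
    table[0] = [0.0] * dim
    return table


def mean_pool(ids, table):
    """Average the embeddings of non-padding tokens (stand-in for a real encoder)."""
    rows = [table[i] for i in ids if i != 0]
    if not rows:
        return [0.0] * len(table[0])
    return [sum(col) / len(rows) for col in zip(*rows)]


def predict_interaction(drug_ids, protein_ids, params):
    """Concatenate both representations and apply a logistic output layer."""
    features = mean_pool(drug_ids, params["drug_emb"]) + mean_pool(protein_ids, params["prot_emb"])
    logit = sum(w * x for w, x in zip(params["w"], features)) + params["b"]
    return 1.0 / (1.0 + math.exp(-logit))


rng = random.Random(0)
dim = 4
params = {
    "drug_emb": make_embedding(len(SMILES_VOCAB), dim, rng),
    "prot_emb": make_embedding(len(AMINO_ACIDS), dim, rng),
    "w": [rng.uniform(-1.0, 1.0) for _ in range(2 * dim)],
    "b": 0.0,
}
drug_ids = encode("CC(=O)OC1=CC=CC=C1C(=O)O", SMILES_VOCAB, 32)  # aspirin SMILES
protein_ids = encode("MKTAYIAKQR", AMINO_ACIDS, 16)  # arbitrary toy sequence
prob = predict_interaction(drug_ids, protein_ids, params)  # interaction probability in (0, 1)
```

In a real model the parameters would be learned from labelled DTI pairs with a binary cross-entropy loss. Note that pooling each input into a single vector before combining them discards which substructure interacts with which residue.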
Despite these promising developments, two challenges remain for existing deep learning-based methods. The first challenge is the explicit learning of interactions between local structures of the drug and the protein. DTI is essentially determined by mutual effects between important molecular substructures in the drug compound and binding sites in the protein sequence18....