Abstract

Many machine learning applications in bioinformatics currently rely on matching gene identities when analyzing input gene signatures and fail to take advantage of preexisting knowledge about gene functions. To further enable comparative analysis of OMICs datasets, including target deconvolution and mechanism of action studies, we develop an approach that represents gene signatures by their biological functions rather than their identities, similar to how the word2vec technique represents words by their meaning in natural language processing. We implement this Functional Representation of Gene Signatures (FRoGS) approach by training a deep learning model and demonstrate that its application to the Broad Institute’s L1000 datasets results in more effective compound-target predictions than models based on gene identities alone. By integrating additional pharmacological activity data sources, FRoGS significantly increases the number of high-quality compound-target predictions relative to existing approaches, many of which are supported by in silico and/or experimental evidence. These results underscore the general utility of FRoGS in machine learning-based bioinformatics applications. Prediction networks pre-equipped with the knowledge of gene functions may help uncover new relationships among gene signatures acquired by large-scale OMICs studies on compounds, cell types, disease models, and patient cohorts.

Large-scale OMICs investigations of biological systems can be used to predict functional relationships between compounds, genes and proteins. Here, the authors develop a deep learning-based approach that significantly increases the number of high-quality compound-target predictions relative to existing methods.
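
The contrast the abstract draws between identity-based and function-based comparison of gene signatures can be illustrated with a small Python sketch. This is not the FRoGS model: the gene names, the three-dimensional vectors, and the mean-pooling step are all hypothetical stand-ins chosen for illustration, whereas the article describes a trained deep learning model. The sketch only shows why two signatures that share no genes can still look similar once each gene is mapped to a functional vector.

```python
import math

# Hand-made toy vectors (hypothetical, NOT FRoGS output): genes named
# GENE_A* are assumed to share one function, GENE_B* another.
TOY_EMBEDDING = {
    "GENE_A1": [0.9, 0.1, 0.0],
    "GENE_A2": [0.8, 0.2, 0.1],
    "GENE_A3": [0.85, 0.15, 0.05],
    "GENE_B1": [0.0, 0.1, 0.9],
    "GENE_B2": [0.1, 0.2, 0.8],
}


def jaccard(sig_a, sig_b):
    """Identity-based similarity: fraction of genes the two signatures share."""
    a, b = set(sig_a), set(sig_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def signature_vector(signature, embedding):
    """Average the vectors of the genes in a signature.

    Mean pooling is an assumption made for this sketch only.
    """
    vectors = [embedding[g] for g in signature if g in embedding]
    if not vectors:
        raise ValueError("no gene in the signature has an embedding")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def functional_similarity(sig_a, sig_b, embedding):
    """Function-based similarity: cosine of the pooled signature vectors."""
    return cosine(
        signature_vector(sig_a, embedding),
        signature_vector(sig_b, embedding),
    )


if __name__ == "__main__":
    sig_1 = ["GENE_A1", "GENE_A2"]   # function A
    sig_2 = ["GENE_A3"]              # function A, no genes shared with sig_1
    sig_3 = ["GENE_B1", "GENE_B2"]   # function B

    print("identity overlap sig_1 vs sig_2:", jaccard(sig_1, sig_2))
    print("functional sim.  sig_1 vs sig_2:",
          functional_similarity(sig_1, sig_2, TOY_EMBEDDING))
    print("functional sim.  sig_1 vs sig_3:",
          functional_similarity(sig_1, sig_3, TOY_EMBEDDING))
```

In this toy setup the identity-based score for the first two signatures is zero because they have no genes in common, while the function-based score is high because their pooled vectors point in nearly the same direction.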

Details

Title
Drug target prediction through deep learning functional representation of gene signatures
Author
Chen, Hao 1; King, Frederick J. 2; Zhou, Bin 2; Wang, Yu 2; Canedy, Carter J. 2; Hayashi, Joel 2; Zhong, Yang 2; Chang, Max W. 3; Pache, Lars 4; Wong, Julian L. 5; Jia, Yong 5; Joslin, John 5; Jiang, Tao 6; Benner, Christopher 3; Chanda, Sumit K. 7; Zhou, Yingyao 8

1 Novartis Biomedical Research, San Diego, USA; University of California, Riverside, Department of Computer Science and Engineering, Riverside, USA (GRID:grid.266097.c) (ISNI:0000 0001 2222 1582); Carnegie Mellon University, Computational Biology Department, School of Computer Science, Pittsburgh, USA (GRID:grid.147455.6) (ISNI:0000 0001 2097 0344)
2 Novartis Biomedical Research, San Diego, USA (GRID:grid.147455.6)
3 University of California, San Diego, Department of Medicine, La Jolla, USA (GRID:grid.266100.3) (ISNI:0000 0001 2107 4242)
4 Sanford Burnham Prebys Medical Discovery Institute, NCI Designated Cancer Center, La Jolla, USA (GRID:grid.479509.6) (ISNI:0000 0001 0163 8573)
5 Novartis Biomedical Research, San Diego, USA (GRID:grid.479509.6)
6 University of California, Riverside, Department of Computer Science and Engineering, Riverside, USA (GRID:grid.266097.c) (ISNI:0000 0001 2222 1582)
7 Scripps Research, Department of Immunology and Microbiology, La Jolla, USA (GRID:grid.214007.0) (ISNI:0000 0001 2219 9231)
8 Novartis Biomedical Research, San Diego, USA (GRID:grid.214007.0)
Pages
1853
Publication year
2024
Publication date
2024
Publisher
Nature Publishing Group
e-ISSN
2041-1723
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2933287041
Copyright
© The Author(s) 2024. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.