
Abstract

Phenotype-driven approaches identify disease-counteracting compounds by analyzing the phenotypic signatures that distinguish diseased from healthy states. These approaches can guide the discovery of targeted perturbations, including small-molecule drugs and genetic interventions, that shift disease phenotypes toward healthier states. Here, we introduce PDGrapher, a causally inspired graph neural network (GNN) designed to predict combinatorial perturbagens (sets of therapeutic targets) capable of reversing disease phenotypes. Unlike methods that learn how perturbations alter phenotypes, PDGrapher solves the inverse problem: it directly predicts the perturbagens needed to achieve a desired response. It embeds disease cell states into gene regulatory or protein-protein interaction networks, learns a latent representation of these states, and identifies the combinatorial perturbations that most effectively shift the diseased state toward the desired treated state within that latent space. In experiments in nine cell lines with chemical perturbations, PDGrapher identified effective perturbagens in up to 13.33% more test samples than competing methods and achieved a normalized discounted cumulative gain (nDCG) up to 0.12 higher when ranking therapeutic targets. It also demonstrated competitive performance on ten genetic perturbation datasets. A key advantage of PDGrapher is its direct prediction paradigm, in contrast to the indirect and computationally intensive models traditionally employed in phenotype-driven research. This approach trains up to 25 times faster than existing methods. PDGrapher provides a fast approach for identifying therapeutic perturbations and advancing phenotype-driven drug discovery.
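
For readers unfamiliar with the normalized discounted cumulative gain reported in the abstract, the following is a minimal sketch of the standard binary-relevance form of the metric, applied to a ranked list of candidate target genes. It is an illustration only: the function name `ndcg`, the cutoff parameter `k`, and the gene labels are hypothetical, and the exact gain definition used in the PDGrapher evaluation may differ from this textbook version.

```python
import math


def ndcg(ranked_genes, true_targets, k=None):
    """Binary-relevance nDCG of a ranked gene list against known targets.

    Assumption: a gene has gain 1 if it is a true target and 0 otherwise.
    This is the textbook definition, not necessarily the paper's variant.
    """
    true_targets = set(true_targets)
    if not true_targets:
        raise ValueError("true_targets must be non-empty")

    top = ranked_genes if k is None else ranked_genes[:k]

    # Discounted cumulative gain: each hit is discounted by log2(rank + 1).
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, gene in enumerate(top, start=1)
        if gene in true_targets
    )

    # Ideal DCG: all true targets placed at the top of the list.
    n_ideal = min(len(true_targets), len(top))
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, n_ideal + 1))

    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical example: one true target, ranked second out of three.
score = ndcg(["GENE_B", "GENE_A", "GENE_C"], {"GENE_A"})
```

A score of 1.0 means every true target sits at the top of the ranking, and the score falls toward 0 as true targets are pushed further down the list.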

Competing Interest Statement

G.G. is currently employed by Genentech, Inc. and F. Hoffmann-La Roche Ltd. I.H. is currently employed by Merck & Co., Inc.

Footnotes

* We expanded our dataset evaluations and benchmarking by incorporating additional baselines, such as the "Perturbed genes" baseline, to ensure a more realistic comparison with existing models. The ranking metric was replaced with normalized discounted cumulative gain (nDCG) to improve robustness and comparability across datasets. Sensitivity analyses were conducted using protein-protein interaction (PPI) networks, testing PDGrapher under various edge removal strategies to validate its resilience against incomplete network structures. We further assessed performance on synthetic datasets with latent confounder bias, demonstrating PDGrapher's stability across different noise levels. To enhance clarity, we expanded the methodology section, detailing cell line selection criteria, cross-validation procedures, and latent confounder simulations. Finally, we introduced two new biological case studies, evaluating predicted therapeutic targets using Open Targets, confirming PDGrapher's superior performance in identifying relevant targets compared to ablated models and baselines.

* https://zitniklab.hms.harvard.edu/projects/PDGrapher

* https://github.com/mims-harvard/PDGrapher

Details

Title
Combinatorial prediction of therapeutic perturbations using causally-inspired neural networks
Publication title
bioRxiv; Cold Spring Harbor
Publication year
2025
Publication date
Jan 28, 2025
Section
New Results
Publisher
Cold Spring Harbor Laboratory Press
Source
BioRxiv
Place of publication
Cold Spring Harbor
Country of publication
United States
University/institution
Cold Spring Harbor Laboratory Press
ISSN
2692-8205
Source type
Working Paper
Language of publication
English
Document type
Working Paper
Publication history
Milestone dates
2024-01-03 (Version 1); 2024-01-08 (Version 2); 2024-10-07 (Version 3)
ProQuest document ID
2909347408
Document URL
https://www.proquest.com/working-papers/combinatorial-prediction-therapeutic/docview/2909347408/se-2?accountid=208611
Copyright
© 2025. This article is published under http://creativecommons.org/licenses/by/4.0/ (“the License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Last updated
2025-01-29
Database
2 databases
  • Coronavirus Research Database
  • ProQuest One Academic