Abstract
RNAs constitute a vast reservoir of mostly untapped drug targets. Structure-based virtual screening (VS) methods screen large compound libraries to identify promising candidate molecules by conditioning on binding site information. The classical approach relies on molecular docking simulations. However, this strategy does not scale well with the size of small molecule databases and the number of potential RNA targets. Machine learning has emerged as a promising technology to resolve this bottleneck. Efficient data-driven VS methods have already been introduced for proteins, but these techniques have not yet been developed for RNAs due to limited dataset sizes and a lack of practical use-case evaluation. We propose a data-driven VS pipeline that deals with the unique challenges of RNA molecules through coarse-grained modeling of 3D structures and heterogeneous training regimes using synthetic data augmentation and RNA-centric self-supervision. We report strong prediction and generalizability of our framework, ranking active compounds among inactives in the top 2.8% on average on a structurally distinct drug-like test set. These predictions are sensitive, yet robust to pocket alterations, opening the door to applying the model to the outputs of binding site detection methods. Our model results in a ten-thousand-fold speedup over docking techniques while obtaining higher performance. Finally, we deploy our model on a recently published in vitro small molecule microarray experiment with 20,000 compounds and report a mean enrichment factor at 1% of 2.93 on four unseen RNA riboswitches. To our knowledge, this is the first experimental evidence of success for structure-based deep learning methods in RNA virtual screening. Our source code and data, as well as a Google Colab notebook for inference, are available on GitHub.
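For readers unfamiliar with the enrichment factor quoted above, the following is a minimal sketch of how such a metric is commonly computed: the hit rate among the top-scoring fraction of a screened library divided by the hit rate in the whole library. The function name `enrichment_factor`, its parameters, and the toy data are assumptions made for illustration; the abstract does not specify the exact definition or tie-handling used by the authors, so this should not be read as their implementation.

```python
import math


def enrichment_factor(scores, is_active, fraction=0.01):
    """Enrichment factor at a given top fraction (hypothetical helper).

    scores    : predicted scores, higher means more likely to bind
    is_active : booleans (or 0/1), True for experimentally active compounds
    fraction  : top fraction of the ranked library to inspect (0.01 = 1%)
    """
    if len(scores) != len(is_active):
        raise ValueError("scores and is_active must have the same length")
    n_total = len(scores)
    n_actives = sum(1 for a in is_active if a)
    if n_total == 0 or n_actives == 0:
        raise ValueError("need at least one compound and one active")

    # Size of the inspected top slice, never smaller than one compound.
    n_top = max(1, math.ceil(fraction * n_total))

    # Rank compounds from best to worst predicted score.
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0], reverse=True)
    hits_in_top = sum(1 for _, active in ranked[:n_top] if active)

    # Ratio of the hit rate in the top slice to the hit rate overall.
    return (hits_in_top / n_top) / (n_actives / n_total)


# Toy library (made-up data): 200 compounds, the 4 best-scored are active.
toy_scores = [200 - i for i in range(200)]
toy_labels = [i < 4 for i in range(200)]
toy_ef = enrichment_factor(toy_scores, toy_labels, fraction=0.01)
```

Under this definition a value of 1 corresponds to random ranking, and values above 1 mean that actives are concentrated at the top of the ranked list.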
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
* Corrected Fig 5 normalization error. Added dock6 to docking benchmark, included pocket perturbation experiment Fig 3c, d.
* https://github.com/cgoliver/rnamigos2