Abstract

The process of finding molecules that bind to a target protein is a challenging first step in drug discovery. Crystallographic fragment screening is a strategy based on elucidating binding modes of small polar compounds and then building potency by expanding or merging them. Recent advances in high-throughput crystallography enable screening of large fragment libraries, reading out dense ensembles of fragments spanning the binding site. However, fragments typically have low affinity thus the road to potency is often long and fraught with false starts. Here, we take advantage of high-throughput crystallography to reframe fragment-based hit discovery as a denoising problem -- identifying significant pharmacophore distributions from a fragment ensemble amid noise due to weak binders -- and employ an unsupervised machine learning method to tackle this problem. Our method screens potential molecules by evaluating whether they recapitulate those fragment-derived pharmacophore distributions. We retrospectively validated our approach on an open science campaign against SARS-CoV-2 main protease (Mpro), showing that our method can distinguish active compounds from inactive ones using only structural data of fragment-protein complexes, without any activity data. Further, we prospectively found novel hits for Mpro and the Mac1 domain of SARS-CoV-2 non-structural protein 3. More broadly, our results demonstrate how unsupervised machine learning helps interpret high throughput crystallography data to rapidly discover of potent chemical modulators of protein function.

Competing Interest Statement

The authors have declared no competing interest.

Footnotes

* https://github.com/wjm41/fresco

Details

Title
Fragment-Based Hit Discovery via Unsupervised Learning of Fragment-Protein Complexes
Author
Mccorkindale, William J; Ahel, Ivan; Barr, Haim; Correy, Galen J; Fraser, James S; London, Nir; Schuller, Marion; Shurrush, Khriesto; Lee, Alpha Albert
University/institution
Cold Spring Harbor Laboratory Press
Section
New Results
Publication year
2022
Publication date
Nov 24, 2022
Publisher
Cold Spring Harbor Laboratory Press
ISSN
2692-8205
Source type
Working Paper
Language of publication
English
ProQuest document ID
2739563509
Copyright
© 2022. This article is published under http://creativecommons.org/licenses/by/4.0/ (“the License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.