Abstract

Insect-borne diseases kill >0.5 million people annually. Currently available repellents for personal or household protection are limited in their efficacy, applicability, and safety profile. Here, we describe a machine-learning-driven high-throughput method for the discovery of novel repellent molecules. To achieve this, we digitized a large, historic dataset containing repellency data for ~14,000 molecules. We then trained a graph neural network (GNN) to map molecular structure to repellency in 2 mosquito species. We applied this model to select 329 candidate molecules to test in high-throughput behavioral assays, quantifying repellency in multiple pest species and in follow-up trials with human volunteers. The GNN approach outperformed a chemoinformatic model, and produced a hit rate that increased with training data size, suggesting that both model innovation and novel data collection were integral to predictive accuracy. We identified >10 molecules with repellency similar to or greater than the most widely used repellents. This approach enables computational screening of billions of possible molecules to identify empirically tractable numbers of candidate repellents, leading to accelerated progress towards solving a global health challenge.

Competing Interest Statement

KJD owns stock in TropIQ.

Footnotes

* Added ORCID for Jennifer Wei

Details

Title
A deep learning and digital archaeology approach for mosquito repellent discovery
Author
Wei, Jennifer N; Vlot, Marnix; Sanchez-Lengeling, Benjamin; Lee, Brian K; Berning, Luuk; Vos, Martijn W; Henderson, Rob Wm; Qian, Wesley W; Ando, D Michael; Groetsch, Kurt M; Gerkin, Richard C; Wiltschko, Alexander B; Dechering, Koen J
University/institution
Cold Spring Harbor Laboratory Press
Section
New Results
Publication year
2022
Publication date
Dec 7, 2022
Publisher
Cold Spring Harbor Laboratory Press
Source type
Working Paper
Language of publication
English
ProQuest document ID
2709396725
Copyright
© 2022. This article is published under http://creativecommons.org/licenses/by/4.0/ (“the License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.