Abstract

The identification of molecular structure is essential for understanding chemical diversity and for developing drug leads from small molecules. Nevertheless, the structure elucidation of small molecules by Nuclear Magnetic Resonance (NMR) experiments is often a long and non-trivial process that relies on years of training. To make this process more efficient, several spectral databases have been established from which reference NMR spectra can be retrieved. However, the number of reference NMR spectra available is limited, and these databases have mostly facilitated the annotation of commercially available derivatives. Here, we introduce DeepSAT, a neural network-based structure annotation and scaffold prediction system that directly extracts the chemical features associated with molecular structures from their NMR spectra. Using only the 1H-13C HSQC spectrum, DeepSAT identifies related known compounds and thus efficiently assists in the identification of molecular structures. DeepSAT is expected to accelerate chemical and biomedical research by speeding up the identification of molecular structures.

Details

Title
DeepSAT: Learning Molecular Structures from Nuclear Magnetic Resonance Data
Author
Kim, Hyun Woo 1 ; Zhang, Chen 2 ; Reher, Raphael 3 ; Wang, Mingxun 4 ; Alexander, Kelsey L. 5 ; Nothias, Louis-Félix 6 ; Han, Yoo Kyong 7 ; Shin, Hyeji 7 ; Lee, Ki Yong 8 ; Lee, Kyu Hyeong 9 ; Kim, Myeong Ji 9 ; Dorrestein, Pieter C. 10 ; Gerwick, William H. 11 ; Cottrell, Garrison W. 12 

1  University of California San Diego, Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography, La Jolla, USA (GRID:grid.266100.3) (ISNI:0000 0001 2107 4242); Dongguk University-Seoul, College of Pharmacy and Integrated Research Institute for Drug Development, Gyeonggi-Do, Republic of Korea (GRID:grid.255168.d) (ISNI:0000 0001 0671 5021)
2  University of California San Diego, Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography, La Jolla, USA (GRID:grid.266100.3) (ISNI:0000 0001 2107 4242); University of California, Department of Computer Science and Engineering, San Diego, USA (GRID:grid.266100.3) (ISNI:0000 0001 2107 4242)
3  University of California San Diego, Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography, La Jolla, USA (GRID:grid.266100.3) (ISNI:0000 0001 2107 4242); University of Marburg, Institute of Pharmaceutical Biology and Biotechnology, Marburg, Germany (GRID:grid.10253.35) (ISNI:0000 0004 1936 9756)
4  University of California San Diego, Skaggs School of Pharmacy and Pharmaceutical Sciences, La Jolla, USA (GRID:grid.266100.3) (ISNI:0000 0001 2107 4242); Ometa Labs LLC, San Diego, USA (GRID:grid.266100.3); University of California Riverside, Department of Computer Science, Riverside, USA (GRID:grid.266097.c) (ISNI:0000 0001 2222 1582)
5  University of California San Diego, Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography, La Jolla, USA (GRID:grid.266100.3) (ISNI:0000 0001 2107 4242); University of California San Diego, Department of Chemistry and Biochemistry, La Jolla, USA (GRID:grid.266100.3) (ISNI:0000 0001 2107 4242)
6  Université Côte d’Azur, CNRS, Institut de Chimie de Nice, UMR 7272, Nice, France (GRID:grid.460782.f) (ISNI:0000 0004 4910 6551)
7  Korea University, College of Pharmacy, Sejong, Republic of Korea (GRID:grid.222754.4) (ISNI:0000 0001 0840 2678)
8  University of California San Diego, Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography, La Jolla, USA (GRID:grid.266100.3) (ISNI:0000 0001 2107 4242); Korea University, College of Pharmacy, Sejong, Republic of Korea (GRID:grid.222754.4) (ISNI:0000 0001 0840 2678)
9  Dongguk University-Seoul, College of Pharmacy and Integrated Research Institute for Drug Development, Gyeonggi-Do, Republic of Korea (GRID:grid.255168.d) (ISNI:0000 0001 0671 5021)
10  University of California San Diego, Skaggs School of Pharmacy and Pharmaceutical Sciences, La Jolla, USA (GRID:grid.266100.3) (ISNI:0000 0001 2107 4242) 
11  University of California San Diego, Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography, La Jolla, USA (GRID:grid.266100.3) (ISNI:0000 0001 2107 4242); University of California San Diego, Skaggs School of Pharmacy and Pharmaceutical Sciences, La Jolla, USA (GRID:grid.266100.3) (ISNI:0000 0001 2107 4242) 
12  University of California, Department of Computer Science and Engineering, San Diego, USA (GRID:grid.266100.3) (ISNI:0000 0001 2107 4242) 
Pages
71
Publication year
2023
Publication date
Dec 2023
Publisher
Springer Nature B.V.
e-ISSN
1758-2946
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2847159843
Copyright
© The Author(s) 2023. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.