Abstract

Bacteria of the genera Photorhabdus and Xenorhabdus produce a plethora of natural products to support their similar symbiotic lifecycles. For many of these compounds, the specific bioactivities are unknown. A common challenge when prioritizing efforts in natural product research is the rediscovery of identical (or highly similar) compounds from different strains. Linking genome sequence to metabolite production can help overcome this problem. However, sequences are typically not available for entire collections of organisms. Here we perform a comprehensive metabolic screening using HPLC-MS data from a 114-strain collection (58 Photorhabdus and 56 Xenorhabdus) gathered across Thailand, and explore the metabolic variation among the strains in relation to several abiotic factors. We use machine learning to rank the importance of individual metabolites in determining each of the given metadata variables. With this approach, we were able to prioritize metabolites in the context of natural product investigations, leading to the identification of previously unknown compounds. The three highest-ranking features were associated with Xenorhabdus and attributed to the same chemical entity, cyclo(tetrahydroxybutyrate). This work addresses the need for prioritization in high-throughput metabolomic studies and demonstrates the viability of such an approach in future research.

Details

Title
Focused natural product elucidation by prioritizing high-throughput metabolomic studies with machine learning
Author
Tobias, Nicholas; Parra-Rojas, Cesar; Shi, Yan-Ni; Shi, Yi-Ming; Simonyi, Svenja; Thanwisai, Aunchalee; Vitta, Apichat; Chantratita, Narisara; Hernandez-Vargas, Esteban A; Bode, Helge B
University/institution
Cold Spring Harbor Laboratory Press
Section
New Results
Publication year
2019
Publication date
Jan 31, 2019
Publisher
Cold Spring Harbor Laboratory Press
ISSN
2692-8205
Source type
Working Paper
Language of publication
English
ProQuest document ID
2174084442
Copyright
© 2019. This article is published under http://creativecommons.org/licenses/by-nd/4.0/ (“the License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.