Abstract

Peptides play important roles in regulating biological processes and form the basis of a multiplicity of therapeutic drugs. To date, only about 300 peptides in humans have confirmed bioactivity, although tens of thousands have been reported in the literature. The majority of these are inactive degradation products of endogenous proteins and peptides, presenting a needle-in-a-haystack problem of identifying the most promising candidate peptides from large-scale peptidomics experiments to test for bioactivity. To address this challenge, we conducted a comprehensive analysis of the mammalian peptidome across seven tissues in four different mouse strains and used the data to train a machine learning model that predicts hundreds of peptide candidates based on patterns in the mass spectrometry data. We provide in silico validation examples and experimental confirmation of bioactivity for two peptides, demonstrating the utility of this resource for discovering lead peptides for further characterization and therapeutic development.

Bioactive peptides regulate many physiological functions, but progress in discovering them has been slow. Here, the authors use a machine learning framework to predict mammalian peptide candidates from the global and local structure of large-scale tissue-specific mass spectrometry data.

Details

Title
Combining mass spectrometry and machine learning to discover bioactive peptides
Author
Madsen, Christian T. 1; Refsgaard, Jan C. 2; Teufel, Felix G. 1; Kjærulff, Sonny K. 2; Wang, Zhe 3; Meng, Guangjun 4; Jessen, Carsten 1; Heljo, Petteri 1; Jiang, Qunfeng 5; Zhao, Xin 3; Wu, Bo 6; Zhou, Xueping 7; Tang, Yang 8; Jeppesen, Jacob F. 1; Kelstrup, Christian D. 1; Buckley, Stephen T. 1; Tullin, Søren 9; Nygaard-Jensen, Jan 9; Chen, Xiaoli 10; Zhang, Fang 11; Olsen, Jesper V. 12; Han, Dan 13; Grønborg, Mads 1; de Lichtenberg, Ulrik 14

1  Global Research Technologies, Novo Nordisk A/S, Maaloev, Denmark (GRID:grid.425956.9) (ISNI:0000 0004 0391 2646)
2  Global Research Technologies, Novo Nordisk A/S, Maaloev, Denmark (GRID:grid.425956.9) (ISNI:0000 0004 0391 2646); Intomics, Kongens Lyngby, Denmark (GRID:grid.425956.9)
3  Novo Nordisk Research Centre China, Beijing, China (GRID:grid.425956.9)
4  Novo Nordisk Research Centre China, Beijing, China (GRID:grid.425956.9); Pulmongene LTD. Rm 502, Beijing, China (GRID:grid.425956.9)
5  Novo Nordisk Research Centre China, Beijing, China (GRID:grid.425956.9); Innovent Biologics, Inc. DongPing Jie 168, Suzhou, China (GRID:grid.425956.9)
6  Novo Nordisk Research Centre China, Beijing, China (GRID:grid.425956.9); QL Biopharmaceutical, Rm 101, Beijing, China (GRID:grid.425956.9)
7  Novo Nordisk Research Centre China, Beijing, China (GRID:grid.425956.9); Crinetics pharmaceuticals, San Diego, USA (GRID:grid.421648.d) (ISNI:0000 0004 5997 3165)
8  Novo Nordisk Research Centre China, Beijing, China (GRID:grid.421648.d); Roche R&D Center (China) Ltd, Pudong, China (GRID:grid.421648.d)
9  Global Research Technologies, Novo Nordisk A/S, Maaloev, Denmark (GRID:grid.425956.9) (ISNI:0000 0004 0391 2646); Boehringer Ingelheim GmbH & Co. KG, Biberach, Germany (GRID:grid.420061.1) (ISNI:0000 0001 2171 7500)
10  Novo Nordisk Research Centre China, Beijing, China (GRID:grid.420061.1) 
11  Novo Nordisk Research Centre China, Beijing, China (GRID:grid.420061.1); Structure Therapeutics. 701 Gateway Blvd., South San Francisco, USA (GRID:grid.420061.1) 
12  The Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Department of Proteomics, Copenhagen, Denmark (GRID:grid.5254.6) (ISNI:0000 0001 0674 042X) 
13  Novo Nordisk Research Centre China, Beijing, China (GRID:grid.5254.6) 
14  Global Research Technologies, Novo Nordisk A/S, Maaloev, Denmark (GRID:grid.425956.9) (ISNI:0000 0004 0391 2646); The Novo Nordisk Foundation, Tuborg Havnevej 19, Hellerup, Denmark (GRID:grid.487026.f) (ISNI:0000 0000 9922 7627) 
Publication year
2022
Publication date
2022
Publisher
Nature Publishing Group
e-ISSN
20411723
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2726686106
Copyright
© The Author(s) 2022. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.