ARTICLE
Received 3 Oct 2015 | Accepted 24 Mar 2016 | Published 4 May 2016
DOI: 10.1038/ncomms11436 OPEN
Analytic framework for peptidomics applied to large-scale neuropeptide identification
Anna Secher1,2,*, Christian D. Kelstrup1,*, Kilian W. Conde-Frieboes3, Charles Pyke2, Kirsten Raun4,
Birgitte S. Wulff5 & Jesper V. Olsen1
Large-scale mass spectrometry-based peptidomics for drug discovery is relatively unexplored because of challenges in peptide degradation and identification following tissue extraction. Here we present a streamlined analytical pipeline for large-scale peptidomics. We developed an optimized sample preparation protocol to achieve fast, reproducible and effective extraction of endogenous peptides from sub-dissected organs such as the brain, while diminishing unspecific protease activity. Each peptidome sample was analysed by high-resolution tandem mass spectrometry and the resulting data set was integrated with publicly available databases. We developed and applied an algorithm that reduces the peptide complexity for identification of biologically relevant peptides. The developed pipeline was applied to rat hypothalamus and identifies thousands of neuropeptides and their post-translational modifications, which are combined in a resource format for visualization, qualitative and quantitative analyses.
1 Faculty of Health and Medical Sciences, Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Blegdamsvej 3b, DK-2200 Copenhagen, Denmark. 2 Histology and Imaging, Novo Nordisk A/S, Novo Nordisk Park, DK-2760 Maaloev, Denmark. 3 Protein & Peptide Chemistry, Novo Nordisk A/S, Novo Nordisk Park, DK-2760 Maaloev, Denmark. 4 Incretin & Obesity Pharmacology, Novo Nordisk A/S, Novo Nordisk Park, DK-2760 Maaloev, Denmark. 5 Incretin & Obesity Research, Novo Nordisk A/S, Novo Nordisk Park, DK-2760 Maaloev, Denmark. * These authors contributed equally to this work. Correspondence and requests for materials should be addressed to J.V.O. (email: [email protected]).
NATURE COMMUNICATIONS | 7:11436 | DOI: 10.1038/ncomms11436 | http://www.nature.com/naturecommunications
Analogues of bioactive peptides like glucagon-like peptide 1 (GLP-1) are emerging as prominent drugs for treatment of metabolic disorders such as diabetes and obesity.
As a consequence, the analysis of endogenous peptides from tissues holds great promise for drug discovery, including the identification of new, bioactive peptides. Neuropeptides are peptide hormones in the brain, which elicit key signalling responses that affect diverse behavioural and endocrine functions including weight homeostasis, pain and psychiatric disorders1. Neuropeptide research is challenged by difficulties in identifying new bioactive neuropeptides, but the emergence of a new generation of high-performance mass spectrometers (MS) makes large-scale identification of endogenous peptides extracted from tissue samples possible, a strategy referred to as peptidomics2,3. This enables unbiased and explorative studies and in principle allows for the identification of post-translational modifications (PTMs)4,5 as well as previously undescribed neuropeptides6,7. Analysis of peptidomes has so far been challenged by technical issues due to unspecific protease digestion during sample preparation and computational challenges in data analysis as well as difficulties in the biological interpretation. This calls for development of new sample preparation methods and bioinformatic approaches to reliably identify new potential neuropeptides8. Previous studies show that heat inactivation, performed either by focused microwave irradiation9–11, by heating the excised tissue in a conventional microwave oven12 or by specialized controlled-heating instruments13, largely prevents the production of proteolytic peptide fragments when compared with traditional protocols based on snap freezing12,13. Furthermore, several strategies have been reported for identification of neuropeptides from complex and large data sets based on cleavage analysis and de novo sequencing14,15.
Here, we describe a compilation of methods into a simple and robust analytic framework for extracting, analysing and identifying endogenous peptides in rat brain. Different heat inactivation procedures were compared and combined with protease inhibitor perfusion of animals, to further retain intact peptides in the sample. The peptidome extracts were analysed by single-shot nanoflow liquid chromatography in line with high-resolution tandem mass spectrometry. A sequential, computational framework was developed to efficiently analyse the resulting large data set in a stringent approach minimizing errors of false peptide identifications. As proof-of-concept, the methodology was applied to large-scale neuropeptide identification from rat hypothalamus, resulting in thousands of identified neuropeptides. In addition, an abundance of PTMs on these peptides were identified, and these data are combined in a resource format for visualization, qualitative and quantitative analyses.
Results
Benchmarking of optimized peptidomics workflow. To establish the optimal sample preparation method for mass spectrometry-based peptidomics, different published methods for neuropeptide extraction from the tissue12,13,16 were compared in terms of sample recovery and proteolytic breakdown products. A simple denaturing buffer consisting of 8 M urea was utilized as extraction media based on previous publications16, and neuropeptides were enriched from larger protein fragments by centrifugation through 30 kDa cutoff filters before analysis by liquid chromatography tandem mass spectrometry (LC-MS/MS). To allow further sub-dissection of complex biological tissues such as hypothalamus without rapid peptide degradation, heat stabilization by microwaving the excised tissue immediately following decapitation12 was compared with controlled conductive heating, to generate rapid, homogenous, thermal denaturation of the dissected brain using a commercially available instrument13, and finally with heat stabilization in combination with perfusion with protease inhibitors (Fig. 1a). Each of the three protocols was performed as biological quadruplicates (N = 4)
and analysed by 2 h LC-MS/MS runs using high-resolution orbitrap tandem mass spectrometry. The complete data set of 12 raw files was analysed together using the MaxQuant software suite (http://www.maxquant.org), which resulted in the identification of 11,509 modification-specific peptide variants. To assess the reproducibility, a hierarchical clustering of the 1,135 unique neuropeptides mapping to annotated pro-hormone protein precursors and identified in at least 2 of 12 samples was performed, and the samples clustered according to sample preparation method (Fig. 1b). To quantify neuropeptide recovery, each cluster was visualized in a Venn diagram (Fig. 1c). When comparing the number of peptides from the two heat inactivation methods, the commercial heat stabilization generated a higher number of peptides. To further restrict postmortem proteolytic degradation, perfusion with a protease inhibitor cocktail combined with heat inactivation of the tissue was applied. Addition of this step increased the length of retrieved peptides significantly more than heat stabilization by itself (Fig. 1d). This optimized sample preparation protocol for rapid peptide extraction is schematized and can be performed within an hour (Fig. 2a).
Large-scale peptidomics applied to hypothalamus. To apply this protocol to large-scale peptidomics, 32 rats were perfused with protease inhibitors and their brains heat inactivated. Endogenous peptides were extracted in parallel from the stabilized hypothalamic tissue by sonication and larger protein fragments removed by molecular weight cutoff spin filters. The resulting peptidomes were analysed by LC-MS/MS, resulting in identification of 16,037 unique modification-specific peptide variants covering 14,416 unique peptide sequences (Supplementary Data 1). Biological replicates had an overlap based on unique peptide sequence of 68 ± 4% (standard deviation). Including the MaxQuant software feature match between runs, an average overlap of 78 ± 5% was observed. The high data quality is evidenced by the technical details including low mass errors, high peptide scores and reproducibility (Supplementary Fig. 1).
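The replicate overlap reported above can be illustrated with a minimal sketch. The text does not state how the overlap was normalized, so the definition used here (shared sequences divided by the mean size of the two replicates) is an assumption, and the peptide sets are toy data rather than values from Supplementary Data 1.

```python
from itertools import combinations
from statistics import mean, stdev


def pairwise_overlap(replicates):
    """Return the percent overlap for every pair of replicates.

    Assumption: overlap = shared sequences / mean number of sequences
    in the two replicates. The article does not specify the normalization.
    """
    overlaps = []
    for a, b in combinations(replicates, 2):
        shared = len(a & b)
        overlaps.append(100.0 * shared / ((len(a) + len(b)) / 2))
    return overlaps


# Toy data: three hypothetical replicates, each a set of peptide sequences.
replicates = [
    {"YGGFM", "YGGFL", "SYSMEHFRWGKPV", "AGCKNFFWKTFTSC"},
    {"YGGFM", "YGGFL", "SYSMEHFRWGKPV", "DAGHGQISHK"},
    {"YGGFM", "YGGFL", "AGCKNFFWKTFTSC", "DAGHGQISHK"},
]
overlaps = pairwise_overlap(replicates)
print(f"{mean(overlaps):.0f} ± {stdev(overlaps):.0f}%")
```

With 32 replicates the same function would return 496 pairwise values, whose mean and standard deviation could then be summarized in the same way.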
A large fraction of the peptides were derived from intracellular proteins, likely originating from tissue damage, and did not constitute neuropeptide candidates. To differentiate between these, the list of peptides was organized by membership to protein families according to an existing mammalian orthologous group framework17. This enabled a general high-level aggregation of information from publicly available protein databases (Uniprot.org, Swepep18 and Neuropeptides.nl) across several species (Fig. 2b) and provided a curated database of 182 protein families containing pro-hormone precursors, ~1% of a total of 18,972 orthologous protein groups. Applied to our data, the identified peptides mapped to 786 orthologous protein groups, of which 62 belonged to the pro-hormone precursors (Supplementary Data 1).
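The grouping step described above can be sketched as a simple lookup. This is only an illustration under stated assumptions: the function name, the group identifiers and the toy mappings are hypothetical and do not come from the orthologous group framework or the curated database used in the study.

```python
from collections import defaultdict


def split_by_prohormone_group(peptide_to_protein, protein_to_group, prohormone_groups):
    """Partition peptides by whether their orthologous group holds a pro-hormone precursor."""
    candidates, others = defaultdict(list), defaultdict(list)
    for peptide, protein in peptide_to_protein.items():
        group = protein_to_group.get(protein)
        if group is None:
            continue  # protein not assigned to any orthologous group
        target = candidates if group in prohormone_groups else others
        target[group].append(peptide)
    return dict(candidates), dict(others)


# Toy mappings; group identifiers are hypothetical placeholders.
peptide_to_protein = {
    "SYSMEHFRWGKPV": "COLI_RAT",
    "YGGFM": "PENK_RAT",
    "AGFAGDDAPR": "ACTB_RAT",
}
protein_to_group = {
    "COLI_RAT": "group_POMC",
    "PENK_RAT": "group_PENK",
    "ACTB_RAT": "group_actin",
}
prohormone_groups = {"group_POMC", "group_PENK"}

candidates, others = split_by_prohormone_group(
    peptide_to_protein, protein_to_group, prohormone_groups
)
print(len(candidates), "pro-hormone precursor groups;", len(others), "other groups")
```

Working at the level of orthologous groups rather than single proteins is what lets annotations from several species be attached to the same family.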
Analysis of longest peptide variants (LPVs). To reduce the complexity of our data set further, we developed an algorithm that assembled the peptides into LPVs (Fig. 2c). As nonspecific protease activity generates ladder series of peptides with a single amino acid removed, a bioinformatic merge of overlapping sequences enabled a focus on more specific protease activity. This reduced the 14,416 unique peptides to 2,835 LPVs of which
[Figure 1: image not reproduced. Panel d x-axis: neuropeptide length (amino acid residues).]
Figure 1 | Comparison of peptidomics sample preparation methods. (a) Workflow using microwave irradiation, heat stabilization or protease inhibitor perfusion before heat stabilization. Endogenous peptides from rat hypothalamus were extracted by homogenization in a urea buffer and filtration using a 30-kDa cutoff filter. Peptides were desalted and concentrated on C18-STAGE tips and analysed by LC-MS/MS. (b) Hierarchical clustering of identified neuropeptides. (c) Venn diagram of identified neuropeptides. (d) Analysis of neuropeptide length distributions. Box-plot analysis of neuropeptide length represented as number of amino-acid residues. Illustrations were generated using images from Servier.com.
specifically 356 LPVs belonged to the pro-hormone precursor group (Supplementary Data 2). These overlapped with 251 previously annotated neuropeptides of which 45 were found to be identical full-length peptide matches including neuropeptide-Y
and galanin. Furthermore, 105 previously undescribed LPVs were derived from pro-hormone precursors, but without known biological activity. To condense information, the full peptide list was compiled into a database and visualized as presented
[Figure 2: image not reproduced. Panel a steps: perfuse with protease inhibitors; dissect brain and heat inactivate; homogenize and centrifuge; filter 30 kDa; Stagetip; LC-MS/MS. Panel c steps: redundancy removal; overlap merge; LPV assembly; knowns matching.]
Figure 2 | Analytical framework for analysis of endogenous peptides. (a) Optimized sample preparation protocol for experimental isolation of endogenous peptides from the hypothalamus before mass spectrometric analysis. (b) Computational data analysis was done in the context of orthologous protein groups enabling comparison between species and incorporation of previously annotated peptides from multiple databases and online resources to focus the analysis on a subset of peptides. (c) Generation of LPVs is schematically shown as removal of redundancy and merging of overlapping peptides. Illustrations were generated using images from Servier.com.
(Supplementary Data 3). Essentially all neuropeptides previously described to be present in the hypothalamus were identified, including agouti-related peptide, which is only expressed in a very small part of the hypothalamus termed the arcuate nucleus19. Several other well-described neuropeptides were identified including glucagon-like peptide-1, α-melanocyte-stimulating hormone (α-MSH) and somatostatin, as well as bioactive peptides derived from chromogranins, cholecystokinin, secretogranins and cocaine- and amphetamine-regulated transcript. The full peptide list can be found in Supplementary Data 2.
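The LPV assembly outlined in this section (redundancy removal followed by merging of overlapping peptides) can be sketched as an interval merge along the precursor sequence. This is a minimal illustration and not the authors' implementation: the function name, the rule that two peptides merge when they share at least one residue, and the toy precursor are all assumptions.

```python
def assemble_lpvs(protein_seq, peptides):
    """Collapse peptides from one precursor into longest peptide variants.

    Assumptions: each peptide is located at its first occurrence in the
    precursor, and peptides are merged when they share at least one residue.
    """
    intervals = []
    for pep in set(peptides):  # redundancy removal
        start = protein_seq.find(pep)
        if start == -1:
            continue  # peptide does not map to this precursor
        intervals.append((start, start + len(pep)))  # half-open [start, end)
    intervals.sort()

    merged = []
    for start, end in intervals:
        if merged and start < merged[-1][1]:  # overlaps the previous interval
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [protein_seq[s:e] for s, e in merged]


# Toy precursor and a ladder series with single residues removed from the N-terminus.
precursor = "AAKRSYSMEHFRWGKPVGKKRRPVKVYP"
observed = ["SYSMEHFRWGKPV", "YSMEHFRWGKPV", "SMEHFRWGKPV", "RPVKVYP"]
lpvs = assemble_lpvs(precursor, observed)
print(lpvs)
```

In this toy case the three ladder peptides collapse into one LPV, while the non-overlapping peptide further along the precursor is kept as a separate LPV.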
Comparison with other bioinformatic tools. To benchmark our bioinformatic workflow described above, we compared it with established approaches such as a curated database of neuropeptides (for example, NeuroPep20) and prediction tools (for example, PeptideRanker21 or NeuroPred22). A direct comparison of our peptide data to the NeuroPep database showed that of the 2,856 unique sequences mapping to pro-hormone precursors, 2,015 sequences were deemed neuropeptides by both approaches and only 34 peptides belonging to Tubulin beta-3 and Nucleobindin-1 were unique to the NeuroPep database, but were not determined as pro-hormone precursors in our workflow. Conversely, 841 neuropeptides originating from more than 30 protein families were only found by our approach (Supplementary Fig. 2a). Next, we analysed our peptidomics data set of 14,416 identified peptides with PeptideRanker, which is a peptide-level predictor for bioactivity. A minimal difference in predicted bioactivity was found between the 2,856 peptides from the pro-hormone precursor group and the rest (Supplementary Fig. 2b). However, LPVs derived from pro-hormone precursors showed an overall increase in their predicted bioactivity probability compared both with all peptides and with the other LPVs, which achieved significantly lower bioactivity probabilities. Similar results were observed when comparing to the protease prediction tool NeuroPred. In this analysis, only 3.6% (103/2,856) of the cleavage sites overlapped between predicted and observed peptides originating from pro-hormone precursors. However, the LPVs from the pro-hormone precursor group showed a significantly larger overlap of 15.7% (56/356).
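The comparison above amounts to set arithmetic on peptide sequences, and the reported percentages follow directly from the stated counts. The sketch below shows both; the helper name and the toy sequence sets are hypothetical, while the counts are those quoted in the text.

```python
def compare_sets(ours, reference):
    """Count sequences shared with, or unique to, a reference database."""
    return {
        "both": len(ours & reference),
        "only_ours": len(ours - reference),
        "only_reference": len(reference - ours),
    }


# Toy example with hypothetical sequence sets.
toy = compare_sets({"YGGFM", "YGGFL", "SYSMEHFRWGKPV"}, {"YGGFM", "YGGFL", "AGCKNFFWKTFTSC"})
print(toy)

# Counts quoted in the text: shared with NeuroPep, and found only by this workflow.
shared, only_ours = 2015, 841
total_prohormone = shared + only_ours

# Cleavage-site overlap with NeuroPred, for all peptides and for LPVs.
pct_peptides = round(100 * 103 / 2856, 1)
pct_lpvs = round(100 * 56 / 356, 1)
print(total_prohormone, pct_peptides, pct_lpvs)
```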
Gene ontology analysis of peptidomics data. To evaluate the efficiency of neuropeptide recovery and to examine the functional profile of all the identified peptide sequences, a gene ontology (GO) enrichment analysis was performed, where the cellular localization and molecular function of the identified endogenous peptidome was compared with the hypothalamus proteome derived by mass spectrometric analysis of a tryptic digest of the proteins retained in the spin filter (Fig. 3a). The most significantly enriched molecular function classifier in the peptidome was neuropeptide hormone activity, whereas the most under-represented classifier was ATP binding, typically representing intra-cellular proteins. For GO classifiers of cellular localization, the most enriched in the peptidome was extracellular space, whereas the cytoplasm was most enriched in the reference proteome. The results conformed well to the known function and cellular organization of neuropeptides and underscored that our protocol effectively separated neuropeptides from other polypeptides.
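A GO term enrichment of this kind is commonly scored with a one-sided hypergeometric test. The text does not name the test that was used, so the sketch below is an assumption about one reasonable choice, and the counts in the example are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(k, population, annotated, sample):
    """One-sided P value: probability of at least k annotated items in the sample.

    population: number of proteins in the background set
    annotated:  background proteins carrying the GO term
    sample:     proteins in the set under test (for example, the peptidome)
    k:          proteins in the sample carrying the GO term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, sample - i)
        for i in range(k, upper + 1)
    )
    return favourable / total


# Hypothetical counts: 10 proteins in the background, 3 with the term,
# 4 proteins sampled, 2 of them with the term.
p_value = hypergeom_enrichment_p(2, 10, 3, 4)
print(f"P = {p_value:.4f}")
```

Under-representation, as reported for ATP binding, would be scored with the opposite tail of the same distribution.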
Analysis of protease cleavage preferences. A linear sequence motif analysis revealed that the most prevalent amino acids preceding the N-termini of the peptides from the pro-hormone precursor group were dibasic cleavage sites KR or monobasic site R (Fig. 3b). This overrepresentation of basic amino acids preceding the N-termini was even stronger for the 356 LPVs derived from the pro-hormone precursor group (Fig. 3c).
[Figure 3: image not reproduced. Panel a compares GO terms for molecular function and cellular component between peptidome and proteome. Panels b–d are sequence logos (bits) for positions −3 to +3, split by pro-hormone precursor membership. Panel e counts, pro-hormone precursor versus other: unique peptides 2,856 and 11,560; amidated peptides 241 and 197; phosphopeptides 199 and 71.]
Figure 3 | Neuropeptide feature analysis. (a) Gene ontology enrichment comparing the identified peptidome to its corresponding hypothalamic proteome. (b) Logo plots for the C- and N-terminal regions flanking peptides split by pro-hormone precursor group membership. (c) Logo plots for the C- and N-terminal regions flanking LPVs split by pro-hormone precursor group membership. (d) Logo plots for the C- and N-terminal regions flanking amidated LPVs split by pro-hormone precursor group membership. (e) Overview of number of identifications and their modification state. In total, 14,416 unique peptide sequences were found; 20% were found in orthologous protein groups that contained a pro-hormone precursor.
The most abundant residues trailing the C-termini were KR in non-amidated neuropeptides and a C-terminal glycine in amidated neuropeptides followed by K/R, in line with previous reports14,23. C-terminal amidation of endogenous peptides such as neuropeptides is often essential for full biological activity. It is well-established that the bifunctional enzyme peptidyl-glycine alpha-amidating monooxygenase, which is abundantly present in our samples, has the ability to convert peptides that terminate in
glycine to the corresponding des-glycine peptide amide24. These motifs were significantly overrepresented compared with the remaining LPVs or all peptides (Fig. 3d) validating our grouping approach when compared with prior knowledge of pro-hormone precursors.
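The cleavage preference analysis rests on two operations: collecting the residues that flank each peptide in its precursor, and measuring how conserved each flanking position is. The sketch below illustrates both under stated assumptions: the function names and the toy precursor are hypothetical, and the information content is computed without the small-sample correction that logo software usually applies.

```python
from collections import Counter
from math import log2


def flanking_residues(precursor, peptide, width=3):
    """Return the residues preceding the N-terminus and trailing the C-terminus."""
    start = precursor.find(peptide)
    if start == -1:
        raise ValueError("peptide not found in precursor")
    end = start + len(peptide)
    return precursor[max(0, start - width):start], precursor[end:end + width]


def information_bits(column, alphabet_size=20):
    """Information content (bits) of one aligned position, uncorrected."""
    counts = Counter(column)
    n = len(column)
    entropy = -sum((c / n) * log2(c / n) for c in counts.values())
    return log2(alphabet_size) - entropy


# Toy precursor (hypothetical) with a dibasic site before the peptide.
n_flank, c_flank = flanking_residues("AAKRSYSMEHFRWGKPVGKKRRPVKVYP", "SYSMEHFRWGKPV")
print(n_flank, c_flank)

# Position -1 across four hypothetical peptides: always R, or a K/R mix.
print(information_bits("RRRR"), information_bits("KRKR"))
```

A fully conserved position carries log2(20) bits, and an even mix of K and R carries one bit less, which is why a K/R preference still stands out in a logo.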
Functional analysis of α-MSH phosphorylation. To assign potential biological function to previously undescribed peptides, we filtered the data based on C-terminal amidation, which is a well-described PTM in bioactive neuropeptides25. A full list of potential, biologically active neuropeptides was generated from the LPVs with criteria based on K/R in position −1 preceding the N-terminus and K/R in positions +1 and +2 trailing the C-terminus, or C-terminal amidation followed by G in position +1 (Table 1 and Supplementary Data 4).
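The selection criteria can be written as a small predicate over entries in the notation of Table 1 (N-terminal flank, peptide, C-terminal flank, separated by dots, with * marking an amidated C-terminus). This is a sketch, not the authors' code: the function name is hypothetical, and it assumes that the K/R requirement at position −1 applies to both the amidated and the non-amidated case.

```python
BASIC = set("KR")


def is_candidate(entry):
    """Apply the flanking-residue criteria to an entry written as 'NNN.PEPTIDE.CCC'."""
    n_flank, peptide, c_flank = entry.split(".")
    n_ok = bool(n_flank) and n_flank[-1] in BASIC  # K/R at position -1
    if peptide.endswith("*"):  # amidated C-terminus
        c_ok = c_flank[:1] == "G"  # G at position +1
    else:
        c_ok = len(c_flank) >= 2 and set(c_flank[:2]) <= BASIC  # K/R at +1 and +2
    return n_ok and c_ok


# Two entries from Table 1 and two hypothetical negative controls.
print(is_candidate("EKR.LEGEDDPDRSMKLSFRARAYGFRDPGPQL.RRG"))
print(is_candidate("LRR.AVDQDLGPEVPPENVL*.GAL"))
print(is_candidate("AAA.PEPTIDE.KRA"))
print(is_candidate("EKR.PEPTIDE.AKR"))
```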
Besides the 5-fold enrichment in amidated C-termini on neuropeptides, we found an 11-fold over-representation of phosphorylated peptides among the pro-hormone precursor groups (Fig. 3e and Supplementary Data 5). A total of 438 peptides were found to be amidated, of which 55% were found in the neuropeptide pro-hormone precursor group. For phosphorylated peptides, this fraction was 74% out of a total of 270 phosphopeptides. A search against the Uniprot database identified approximately half of the PTMs to be previously undescribed. These modified peptide sequences could potentially represent biologically active peptides or fragments thereof, making them interesting targets to explore further. One of the new PTM identifications was a phosphorylation of the second serine [SYS(ph)MEHFRWGKPV-amide] in the α-MSH peptide corresponding to Serine-126 in the full-length POMC pro-hormone precursor protein. To reveal the functional role of this phosphorylation site, we evaluated whether the phosphorylation altered the in vitro activity at the melanocortin receptors (MC1, 3, 4 and 5) targeted by α-MSH as described in ref. 26. The pKi values were calculated from IC50 values determined in radioligand displacement filtration binding assays to membranes from recombinant BHK570 cells expressing the relevant human melanocortin receptor and using 125I-NDP-α-MSH as radioligand. Phosphorylation of α-MSH lowered its binding affinity by 11-fold for MC4 and 7- to 8-fold for MC5 and MC1 receptors compared with the dephosphorylated form of α-MSH (Fig. 4a).
Concurrently, the phosphorylation lowered the MC4-activated cyclic AMP (cAMP) response by tenfold (Fig. 4b).
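The conversion from IC50 to pKi mentioned above follows the Cheng–Prusoff relation, Ki = IC50 / (1 + [L]/Kd), where [L] is the radioligand concentration and Kd its dissociation constant. The sketch below shows the arithmetic; the function names and all numerical inputs are hypothetical and are not taken from the binding assays.

```python
from math import log10


def ki_from_ic50(ic50, ligand_conc, kd):
    """Cheng-Prusoff conversion; all concentrations in mol/l."""
    return ic50 / (1.0 + ligand_conc / kd)


def pki(ki_molar):
    """Negative decadic logarithm of Ki."""
    return -log10(ki_molar)


def fold_change(pki_reference, pki_modified):
    """Fold loss in affinity of the modified peptide relative to the reference."""
    return 10 ** (pki_reference - pki_modified)


# Hypothetical inputs: IC50 = 2 nM, radioligand at 0.1 nM with Kd = 0.1 nM.
ki = ki_from_ic50(2e-9, 1e-10, 1e-10)
print(pki(ki))

# A pKi difference of one unit corresponds to a tenfold change in affinity.
print(fold_change(9.0, 8.0))
```

On this scale, an 11-fold loss in affinity corresponds to a pKi difference of about 1.04.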
Kinase-substrate motif analysis. Sequence motif analysis of the phosphorylation sites revealed a significant S-x-E motif for the neuropeptide group, including the functional site in α-MSH, whereas an S/T-P motif was found for the remaining phosphopeptides (Fig. 4c). The neuropeptide phosphorylation motif matched perfectly the substrate specificity of the recently described Fam20C protein kinase that phosphorylates secretory pathway proteins within S-x-E motifs27. Interestingly, this kinase is also responsible for phosphorylation of the majority of peptides residing within the central nervous system28–30. This validated our grouping into secretory and intra-cellular protein families.
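The two motifs can be told apart with simple pattern matching on the residues that follow the phosphorylated position. The sketch below is illustrative only: the function name is hypothetical, and a site that fits both patterns (for example S-P-E) is reported as S-x-E by this particular ordering.

```python
import re

SXE = re.compile(r"[ST].E")  # S-x-E, written [ST]xE in the caption of Fig. 4
STP = re.compile(r"[ST]P")   # proline-directed S/T-P


def classify_site(sequence, site_index):
    """Classify a phosphosite by the motif starting at the modified residue."""
    window = sequence[site_index:site_index + 3]
    if SXE.fullmatch(window):
        return "S-x-E"
    if STP.match(window):
        return "S/T-P"
    return "other"


# The second serine of the alpha-MSH sequence is at index 2 (0-based).
print(classify_site("SYSMEHFRWGKPV", 2))
# Hypothetical proline-directed site.
print(classify_site("AASPK", 2))
```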
Determination of fractional phosphorylation site stoichiometry. To further elucidate the biological function and abundance of phosphorylation in neuropeptides, the occupancy or fractional stoichiometry of all identified phosphorylation sites was estimated under the assumption that phosphorylation does not significantly change the ionization efficiency of the peptide. Phosphorylation site stoichiometry was estimated by comparing spectral counts of phosphopeptides with their dephosphorylated counterparts for each phosphorylation site that was observed at least three times
Table 1 | Extract of potential new neuropeptides.
maNOG description | Sequence (N-terminal. Peptide sequence amino acids. C-terminal)
Chromogranin-A | EKR.LEGEDDPDRSMKLSFRARAYGFRDPGPQL.RRG
Chromogranin-A | NRR.AEDQELESLSAIEAELEKVAHQLQALRR*.G
Cocaine- and amphetamine-regulated transcript protein | PRR.QLRAPGAVLQIEALQEVLKKLKS.KRI
Corticotropin-releasing hormone | AER.GAEDALGGHQ*.GAL
Galanin | EKR.GWTLNSAGYLLGPHAIDNHRSFSDKHGLTG.KRE
Galanin | GKR.ELPLEVEEGRL*.GSV
Glucagon | DKR.HSQGTFTSDYSKYLDS.RRA
Glucagon | GRR.DFPEEVAIAEEL*.GRR
Kisspeptin-1 | VQR.EKDMSAYNWNSFGLRY*.GRR
Neuropeptide S | MKR.SFRNGVGSGVKKTSF.RRA
Neurosecretory protein VGF | ATR.QAAAQEERLADLASDLLLQYLLQGGARQRDLG*.GRG
Neurosecretory protein VGF | VRR.LEGSFLGGSEAGERLLQQGLAQVEAG.RRQ
Nucleobindin-2 | EKR.KEEEAKFAEM.KRK
Pituitary adenylate cyclase-activating polypeptide | TKR.HSDGIFTDSYSRY.RKQ
Pituitary adenylate cyclase-activating polypeptide | YRK.QMAVKKYLAAVL*.GKR
Proenkephalin-A | MKK.DADEGDTLANSSDLLKELLGTGDNRAKDSHQQESTNNDEDSTSKRYGGFMRGL.KRS
Proenkephalin-A | MKR.YGGFMKKMDELYPVEPEEEANGGEILAKRYGGFM.KKD
Proenkephalin-A | QKR.YGGFMRRV*.GRP
Proenkephalin-A | QKR.YGGFMRRVGRPEWWMDYQKRYGGFL.KRF
Pro-FMRFamide-related neuropeptide FF | FGR.NAWGPWSKEQLSPQAREFWSLAAPQRF*.GKK
Pro-FMRFamide-related neuropeptide VF | SPR.ARANMEAGTMSHFPSLPQRF*.GRT
Progonadoliberin-1 | DLR.GALERLIEEEA*.GQK
Prohormone convertase 2 | HKR.QLERDPRIKMALQQEGFD.RKK
Prohormone convertase 2 | SKR.NQLHDEVHQW.RRN
Pro-opiomelanocortin | FKR.ELEGEQPDGLEHVLEPDTEKADGPYRVEHFRWGNPPKD.KRY
Pro-opiomelanocortin | GKK.RRPVKVYPNVAENESwAEAFPLEF.KRE
Pro-opiomelanocortin | GKR.SYSwMEHFRWGKPVGKK.RRP
ProSAAS | LRR.AVDQDLGPEVPPENVL*.GAL
Protachykinin-1 | GKR.DAGHGQISHKMAYERSAMQNYE.RRR
Protachykinin-1 | GKR.DAGHGQISHKRHKTDSFVGLM*.GKR
Protachykinin-1 | HKR.HKTDSFVGLMG.KRA
Pro-thyrotropin-releasing hormone | ERR.FLWKDLQRVR*.GDL
Pro-thyrotropin-releasing hormone | GKR.EEEEKDIEAEER*.GDL
Pro-thyrotropin-releasing hormone | GKR.EEEEKDIEAEERGDLGEGGAWRLH.KRQ
Pro-thyrotropin-releasing hormone | TKR.QHPGRRFIDPELQRSwWEEKEGEGVLMPE.KRQ
Pro-thyrotropin-releasing hormone | VKR.QHPGRRSFPWMESDVT.KRQ
Secretogranin-1 | EKR.KRLGALFNPYFDPLQWKNSDFE.KKG
Secretogranin-1 | EKR.PFSEDVNW*.GYE
Secretogranin-1 | EKR.SFARAPHLDL.KRQ
Secretogranin-1 | LRK.SGKEVKGEEKGENENSKFEVRLLRDPSDASV*.GRW
Secretogranin-1 | NKR.SEASAKKKEESVARAEAHFVELEKTHSwREQSSQESGEET.RRQ
Secretogranin-1 | TRR.QEKPQELPDQDQSEEESwEEGEEGEEGATSEVT.KRR
Secretogranin-1 | YKR.NHPDSELESTANRHSw,EETEEERSYEGAKGRQHRGRGREPGAYPALDSwRQE.KRL
Secretogranin-2 | LKR.VPSPGSSEDDLQEEEQLEQAIKEHL*.GQG
Secretogranin-2 | LKR.VPSPGSw,SEDDLQEEEQLEQAIKEHLGQGSSwQEMEKLAKVS.KRI
Secretogranin-2 | MKR.SGHLGLPDE*.GNR
Secretogranin-2 | SKR.IPAGSLKNEDTPNRQYLDEDMLLKVLEYLNQEQAEQ*.GRE
Secretogranin-2 | SKR.IPAGSLKNEDTPNRQYLDEDMLLKVLEYLNQEQAEQGREHLA.KRA
Secretogranin-5 | RKR.RSVNPYLQ*.GKR
Tachykinin-3 | QKR.DMHDFFVGLMG.KRN
Urocortin 3 | SKK.NFGYLPTQDPSwGEEEDEQKHIKN.KRT
Urocortin 3 | VKK.NKLEDVPVLS.KKN
VIP peptides | GKR.ISSSISEDPVPV.KRH
VIP peptides | LRK.QMAVKKYLNSILN*.GKR
* indicates an amidated C-terminus. w indicates a phosphorylated amino acid.
across the 32 biological replicates (Fig. 4d). This was a reproducible measure across samples and revealed that a majority of phosphorylation sites were of high occupancy of 40–80%, whereas approximately one-third was observed with occupancy below 40%.
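The occupancy estimate described above reduces to a ratio of spectral counts, as also indicated by the axis of Fig. 4d (phosphorylated counts over all counts). The sketch below is a minimal illustration: the function name and the example counts are hypothetical, and it assumes the threshold of three observations applies to the phosphorylated form.

```python
def site_occupancy(phospho_count, unmodified_count, min_observations=3):
    """Return site occupancy in percent, or None if the site was seen too rarely.

    Assumes equal ionization efficiency of the phosphorylated and
    unmodified forms, as stated in the text.
    """
    if phospho_count < min_observations:
        return None
    return 100.0 * phospho_count / (phospho_count + unmodified_count)


# Hypothetical spectral counts for two sites in one replicate.
print(site_occupancy(6, 2))
print(site_occupancy(2, 10))
```

Applying the function per replicate gives the per-site distributions that a box plot across the 32 replicates would summarize.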
Discussion
In this study, we present a compilation of methods into a swift sample protocol for extraction of neuropeptides, followed by mass spectrometric identification and bioinformatic analysis to tease out potential biologically active neuropeptides and novel PTMs. To our knowledge, this is the first comprehensive peptidomics data set that provides a large number of (full-length) neuropeptides from one extraction analysed by single LC-MS/MS analysis. The developed sample preparation protocol is reproducible, minimizes unspecific post-mortem protease digestion issues and can be performed in less than 1 h from tissue dissection to mass spectrometric analysis.
The initial peptide identification process relies on classical shotgun sequencing by data-dependent acquisition and matching
[Figure 4: image not reproduced. Panel a: pKi of α-MSH and phosphorylated α-MSH at MC1, MC3, MC4 and MC5. Panel b: cAMP (pmol/well) against compound concentration [M] (log10 scale). Panel c: phosphorylation site logos (bits) by position relative to the phosphorylation site, for the pro-hormone precursor and non-pro-hormone precursor groups. Panel d: phosphorylation site occupancy (%), given as spectral count (phosphorylated) over spectral count (all), for the sites SCG3_RAT (S40), SMS_RAT (S92), PDYN_RAT (S177), SMS_RAT (S89), PCSK1_RAT (S47), PCSK1_RAT (S44), COLI_RAT (S126), NPY_RAT (S83), SCG1_RAT (S429), TRH_RAT (S186), TKN1_RAT (S73), PDYN_RAT (S235), CMGA_RAT (S353), SCG1_RAT (S362), SCG2_RAT (S176), PENK_RAT (S253), PENK_RAT (S251), SCG1_RAT (S397), COLI_RAT (S154), SCG2_RAT (S558), SCG2_RAT (S535), SCG2_RAT (S534), 7B2_RAT (S203), CART_RAT (S48), SCG2_RAT (T274), SCG1_RAT (S190), SCG2_RAT (S270), CMGA_RAT (S413), CMGA_RAT (S417), SCG1_RAT (S372), SCG1_RAT (S375), SCG1_RAT (S100), SCG1_RAT (S129), SCG2_RAT (S85), SCG2_RAT (T149), SCG1_RAT (S228), TKN1_RAT (Y30).]
Figure 4 | Functional analysis of neuropeptide phosphorylation sites. (a) Ki (shown as negative log, pKi) calculated from IC50 values determined under equilibrium conditions in competition with 125I-NDP-α-MSH on MC1, 3, 4 and 5 receptors, respectively (Cheng–Prusoff equation; *P<0.05, ***P<0.001). Values are mean ± s.e.m. (b) Representative figure of α-MSH and phosphorylated α-MSH-stimulated cAMP production in intact BHK cells expressing the human MC4 receptor. (c) Logo plots of the phosphorylation sites revealed a [ST]xE motif for the neuropeptide protein precursor group. In contrast, a [ST]P motif was found for the remaining phosphopeptides. (d) Occupancy of phosphorylation sites that were observed at least three times. Boxplots depict variation observed across 32 replicates. Each line contains the Uniprot gene name and phosphorylated position in parentheses.
of resulting tandem mass spectra against a full proteome database. This has the benefit of not being tissue or species specific while controlling identification error rates. Compared with using a specialized database such as NeuroPep, our workflow, which relies on functional filtering of protein families, enables the identification of new interesting peptides. A general trend in comparison between our data set and standard prediction algorithms is an overall low overlap. Based on this, sequence information alone seems to be insufficient to estimate bioactivity or cleavage sites. However, collapsing to LPVs improves the overlap for the positive pro-hormone precursor group, in good agreement with the goal to minimize nonspecific peptidase effects on the data. This indicates that prediction tools provide valuable information when used in combination with the workflow presented here.
The developed and implemented algorithm reduces the peptide complexity and derives LPVs for identification and prioritization of hundreds of biologically relevant peptides. The analytical strategy and way of presenting the peptidomics data in Supplementary Data 1–5 enable multiple views, combined in a resource format for visualization, qualitative and quantitative analysis, as well as hotlist prioritization.
Even though it has been reported that neuropeptide extraction in urea eliminates some peptides such as α-MSH and β-endorphin16, both of these peptides were successfully recovered in full length and identified with the presented sample protocol. Heat inactivation was an efficient method for retrieval of an abundance of neuropeptides, but combining this method with perfusion using a cocktail of protease inhibitors increased the recovery threefold compared with either of the heat-inactivating methods by itself. Furthermore, this compilation of methods greatly increased the length of retrieved peptides, and demonstrated that perfusion with protease inhibitors is a powerful strategy for reducing post-mortem proteolytic breakdown products.
The peptidomics experiments reproducibly identified thousands of endogenous neuropeptides originating from pro-hormone protein precursors covering essentially all known hypothalamic peptides and their PTMs including both widespread serine phosphorylation and C-terminal amidation. More than one hundred phosphorylation sites were identified, of which more than half have not been described before. This included a new site on α-MSH, a neuropeptide with a prominent role in appetite regulation. To test the function of this phosphorylation, we performed different cell-based assays and found that the phosphorylation reduced the affinity of α-MSH for its cognate receptors MC1, 3, 4 and 5 by more than tenfold. MC4 receptors are highly expressed in the brain and MC4 agonists such as α-MSH decrease food intake31. Thus, as the identified phosphorylation of α-MSH seems to function through a lowering of MC4 receptor interaction, it may play a role in appetite regulation.
High phosphorylation-site occupancy is a general phenomenon observed in specific functional cell states such as mitosis32, and it is in general a very good indication that a site may be functional33. Our data set pinpoints serine phosphorylation as much more abundant on neuropeptides than on intracellular proteins, and shows that the identified sites are generally of high stoichiometry and conform to the S-x-E motif, likely due to the action of a single secreted kinase, Fam20C.
Looking forward, the methodological and computational framework we have developed will be applicable to a number of unsolved questions in biology, for example, to further elucidate the profile of secreted peptides in several tissues or to identify the regulated gastric peptidome following either pharmacological intervention in diabetes treatment or gastric bypass, thus taking the discovery of new biology and new treatment methods a large step forward.
Methods
Sample preparation. Rat tissues. The study was carried out following approved national regulations in Denmark and with an animal experimental license granted by the Animal Experiments Inspectorate, Ministry of Justice, Denmark. Tissue treatment protocols were compared by either microwaving the whole heads12 or heat inactivating the brains following dissection13. Sprague–Dawley rats (Crl:SD, male, 200 g, Charles River, Germany) were decapitated, and the heads were immediately placed in a conventional microwave oven, dorsal side facing downwards, and heated at 800 W for 9 s followed by dissection of the brain (group 1). In comparison, four SD rats were decapitated and the brains were quickly dissected and heated to 95 °C in an air-evacuated cartridge (Denator T1 Heat Stabilizor, Denator AB; group 2). A last set of four SD rats (group 3) were anaesthetized with isoflurane and perfused (1 min, 30 ml min⁻¹) with isotonic saline containing protease inhibitors (0.120 mM EDTA, 14 mM aprotinin, 0.3 nM valine-pyrrolidide (custom made) and Roche Complete Protease Inhibitor tablets (Roche), pH 7.4) before being decapitated. The brains were quickly dissected and heated to 95 °C in the same air-evacuated cartridge as described above. Following this, hypothalami from all groups were subdissected and kept at −80 °C until further use. In the main study, 32 rats were perfused with protease inhibitors and the brains heat inactivated through the same procedure as described for group 3 above.
Peptide extraction. Tissue was sonicated on ice (3 × 8 s) in 8 M urea (5 µl mg⁻¹ tissue) and centrifuged at 20,000g for 20 min at 4 °C. The supernatant was spun through Microcon YM-30 cutoff filters (Millipore; 20 min, 15,000g at 4 °C) pre-rinsed with 2 × 80 µl 20% MeCN/30% MeOH (2 × 15 min, 13,000g). The peptides were loaded onto in-house packed reversed-phase C18 STAGE tips with two Empore C18 discs preconditioned with 20 µl MeOH; 20 µl 80% MeCN, 0.5% AcOH; and 2 × 20 µl 1% trifluoroacetic acid (TFA), 3% MeCN. STAGE tips were washed with 2 × 20 µl 8% MeCN, 0.5% AcOH and 1 × 50 µl 0.5% AcOH.
LC-MS/MS, mass spectrometric analysis. Peptides were eluted into 96-well microtitre plates with 20 µl 40% MeCN, 0.5% AcOH followed by 20 µl 50% MeCN, 0.5% AcOH. Organic solvents were removed by vacuum centrifugation in a speed-vac, and the samples were dried to ~2 µl. The peptides were reconstituted with 10 µl of 2% MeCN, 0.5% AcOH, 0.1% TFA. Five microlitres of this eluate was analysed by online reversed-phase C18 nanoscale LC-MS/MS on an LTQ-Orbitrap Velos mass spectrometer (Thermo Electron) using a top10 higher-energy collisional dissociation (HCD) fragmentation method as described previously34. The LC-MS analysis was performed with a nanoflow Easy nLC system (Proxeon Biosystems) connected through a nano-electrospray ion source to the MS. Peptides were separated by a linear MeCN gradient for 220 min in a 15-cm fused-silica emitter packed in-house with reversed-phase ReproSil-Pur C18-AQ 3 µm resin (Dr Maisch GmbH). Full-scan MS spectra were acquired from 350 to 1,750 m/z at a target value of 1e6 and a resolution of 30,000, and the HCD-MS/MS spectra were recorded at a target value of 5e4 and a resolution of 7,500 using a normalized collision energy of 40%.
Peptide quantification and identification by MaxQuant and Mascot. Raw MS files were processed using the MaxQuant software (ver. 1.0.14.7, Max-Planck Institute of Biochemistry, Department of Proteomics and Signal Transduction, Munich), by which the precursor MS signal intensities were determined and HCD-MS/MS spectra were de-isotoped and filtered such that only the ten most abundant fragments per 100-m/z range were retained. Peptides were identified by searching all MS/MS spectra against a concatenated forward/reversed (target/decoy) version of a mouse IPI v.3.37 protein sequence database including protein sequences of commonly observed contaminants such as human keratins and porcine trypsin. The HCD-MS/MS spectra were searched with variable modifications of oxidation (M), acetylation (prot. N-term), Gln→pyro-Glu, amidation (C-term) and phospho (STY), with no enzyme specificity required. Search parameters were set to an initial precursor ion tolerance of 7 p.p.m. and an MS/MS tolerance of 0.02 Da. Label-free peptide quantification based on extracted ion chromatograms and spectral counts, and validation, were performed in the MaxQuant software suite35, requiring a minimum Mascot score of 10 at a fixed peptide false discovery rate (FDR) threshold of maximum 2.5% and a fixed protein FDR threshold of maximum 1.5% to achieve a final peptide FDR < 0.01. Phosphorylation sites were considered localized at a localization probability above 75%. The minimum required peptide length was set to six amino acids, and peptide precursors were filtered on individual peptide mass errors after nonlinear post-acquisition recalibration36. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the dataset identifier PXD002431.
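As an illustration of two quantities used in the search settings above, the following minimal Python sketch computes a precursor mass error in p.p.m. and a simple target/decoy FDR estimate from a concatenated forward/reversed search. It is not the MaxQuant or Mascot implementation; the function names, the (score, is_decoy) input layout and the decoys/targets estimator are assumptions made for illustration only.

```python
def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million (p.p.m.)."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz: float, theoretical_mz: float,
                     tol_ppm: float = 7.0) -> bool:
    """True if the precursor lies within the p.p.m. tolerance (7 p.p.m. in the text)."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


def decoy_fdr(psms, min_score: float) -> float:
    """Estimate FDR as (#decoy hits) / (#target hits) at or above min_score.

    psms is an iterable of (score, is_decoy) pairs (hypothetical layout).
    """
    kept = [(s, d) for s, d in psms if s >= min_score]
    targets = sum(1 for _, is_decoy in kept if not is_decoy)
    decoys = sum(1 for _, is_decoy in kept if is_decoy)
    return decoys / targets if targets else 0.0


if __name__ == "__main__":
    print(within_tolerance(1000.005, 1000.0))   # 5 p.p.m. error
    hits = [(50, False), (40, False), (30, True), (20, False), (5, True)]
    print(decoy_fdr(hits, min_score=10))
```

In practice the score cutoff would be raised until the estimated FDR falls below the chosen threshold.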
Data analysis. The combined analysis of protein orthologous groups from different species annotated with information from several databases required programming to integrate all layers of information. Here, this was all done in the programming language Perl together with freely available tools for BLAST and alignment. References are provided below and a simplified script is available as Supplementary Data 6 with instructions for use in Supplementary Note 1 and license information in Supplementary Note 2.
Mammalian orthologous protein groups and mapping scheme. This used the existing non-supervised orthologous protein groups for mammals (maNOGs), which were downloaded from eggNOG 3.0 (http://eggnog.embl.de/version_3.0/). The sequence database was downloaded from STRING 9.0.5 (http://string-db.org) together with the alias files for the STRING proteins (http://string-db.org/). Everything was filtered to contain only rat, mouse and human species identifiers. The Uniprot/Swissprot (http://uniprot.org) sequences for rat, mouse and human and the IPI sequences for rat and mouse were downloaded (ftp://ftp.ebi.ac.uk/pub/databases/IPI). Mapping of these databases to the STRING database was done using blastp (http://blast.ncbi.nlm.nih.gov/), and the alias protein file was supplemented with these protein mappings. This created the sequence mapping framework for the maNOGs.
Annotation of orthologous protein groups. To annotate the maNOG framework, information from three different sources was integrated:
In Uniprot (http://uniprot.org), the sequence feature annotation 'Peptide' indicates that this part of the protein contains active peptides processed from a larger precursor protein and with a well-defined biological activity. The parts annotated as 'Peptide' were extracted and mapped to the maNOG identifier for the species rat, mouse and human.
The Neuropeptide Database (http://www.neuropeptides.nl) contains a list of gene names and precursor names. These were extracted and mapped to the maNOG identifier.
The Sweden peptide Database (http://www.swepep.org) was acquired and the annotated peptides were mapped to the maNOG identifiers through the Uniprot identifier information.
Any maNOG that received an annotation through any of these entries was marked as a pro-hormone precursor, and the peptide was stored in a flatline format for later retrieval.
Annotation of identified peptides and proteins. Through the mapping scheme described above, it was straightforward to add a level of maNOG information to the list of protein groups identified from the MaxQuant analysis of the mass spectrometry data. Any information on pro-hormone precursors was included in this annotation. A complication was that the identification list describes protein groups rather than single proteins. Proteins in such a protein group are indistinguishable based on the observed peptides and were rarely found to map to different maNOGs. In the cases where a single maNOG could uniquely explain all peptides, this identifier was kept. Alternatively, if only multiple maNOGs could explain the peptides, a group of maNOGs was formed, thereby extending the positive annotation from a single maNOG to all maNOGs in the maNOG group.
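The assignment rule described above can be sketched in Python as follows. This is one possible reading rather than the Perl code of Supplementary Data 6: the function name, the dictionary layout and the identifiers are hypothetical, and the sketch assumes that a maNOG group is simply the union of all maNOGs hit by the peptides of a protein group.

```python
def assign_manog(peptide_to_manogs):
    """Return a tuple of maNOG identifiers for one protein group.

    peptide_to_manogs maps each observed peptide to the set of maNOG
    identifiers it can be mapped to (hypothetical input layout).
    """
    all_ids = set().union(*peptide_to_manogs.values())
    # maNOGs that on their own explain every observed peptide
    explains_all = [m for m in sorted(all_ids)
                    if all(m in ids for ids in peptide_to_manogs.values())]
    if len(explains_all) == 1:
        # A single maNOG uniquely explains all peptides: keep it.
        return (explains_all[0],)
    # Otherwise form a maNOG group from every maNOG that was hit.
    return tuple(sorted(all_ids))


if __name__ == "__main__":
    print(assign_manog({"PEPTIDEA": {"maNOG1", "maNOG2"}, "PEPTIDEB": {"maNOG1"}}))
    print(assign_manog({"PEPTIDEA": {"maNOG1"}, "PEPTIDEB": {"maNOG2"}}))
```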
Sequence alignment. The protein sequences present within a maNOG group were aligned using the software tool Muscle 3.8 (ref. 37). All shorter peptides were subsequently mapped onto this alignment.
Assembly of LPV. Identified peptides belonging to the same maNOG group were found to show a large degree of overlap. To reduce this redundancy, all overlapping peptides were iteratively merged into longer sequences based on the largest overlap between any pair of sequences. In detail, the algorithm was the following:
i. Define a unique set of peptides mapping to a maNOG entry or group.
ii. Define overlap in sequences as the number of shared amino acids in peptide sequences with no gaps allowed. It is allowed for one sequence to extend in the N-terminal and/or one sequence to extend in the C-terminal direction. A minimum length of overlap was set to two amino acids.
iii. Define a combined peptide sequence as two sequences merged to form the shortest sequence that contains them both, while checking that the combined sequence is a part of one of the protein sequences in the maNOG.
iv. Define site-specific modifications as blockers of overlap past the modified residue. Only modifications of possible in vivo origin were considered: N-terminal acetylation, pyro-glutamic acid and C-terminal amidation.
v. Calculate the overlap, in amino acids, between all pairs of peptide sequences based on i–iv.
vi. Replace the pair with the largest allowed overlap with its combined sequence.
vii. Repeat the two previous steps iteratively until as much as possible is combined and no overlapping sequences are found.
viii. The final list of peptides constitutes the minimum set that can explain all other peptides and is referred to as the LPV.
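Steps i–iii and v–viii can be sketched in Python as follows. This is a simplified illustration and not the Perl script of Supplementary Data 6: it works on unmodified sequences only, so the modification blockers of step iv are left out, and the function names and the toy precursor sequence are hypothetical.

```python
from itertools import combinations


def best_merge(a, b, proteins, min_overlap=2):
    """Return (overlap, merged) for the largest allowed overlap of a and b, or None."""
    candidates = []
    if b in a and len(b) >= min_overlap:      # b lies entirely inside a
        candidates.append((len(b), a))
    elif a in b and len(a) >= min_overlap:    # a lies entirely inside b
        candidates.append((len(a), b))
    else:
        # Staggered case: a suffix of one peptide equals a prefix of the other.
        for left, right in ((a, b), (b, a)):
            for k in range(min_overlap, min(len(left), len(right))):
                if left.endswith(right[:k]):
                    candidates.append((k, left + right[k:]))
    # Step iii: the merged sequence must occur in one of the precursor proteins.
    valid = [c for c in candidates if any(c[1] in p for p in proteins)]
    return max(valid) if valid else None


def assemble_lpvs(peptides, proteins, min_overlap=2):
    """Iteratively merge the pair with the largest overlap until none is left."""
    pool = set(peptides)                      # step i: unique peptide set
    while True:
        best = None
        for a, b in combinations(sorted(pool), 2):   # step v: all pairs
            merge = best_merge(a, b, proteins, min_overlap)
            if merge and (best is None or merge[0] > best[0]):
                best = (merge[0], merge[1], a, b)
        if best is None:                      # step vii: nothing left to merge
            return sorted(pool)               # step viii: the remaining set
        _, merged, a, b = best
        pool -= {a, b}                        # step vi: replace the pair ...
        pool.add(merged)                      # ... with the combined sequence


if __name__ == "__main__":
    precursor = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"   # toy sequence
    observed = {"AYIAKQR", "KQRQISF", "QISFVKS", "EERLGL", "RLGLIEV", "IAK"}
    print(assemble_lpvs(observed, [precursor]))
```

Each iteration removes at least one sequence from the pool, so the loop terminates. The all-pairs search is quadratic per iteration, which is acceptable for the peptide counts of a single maNOG group but would need indexing for much larger sets.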
Calculation of occupancy. Occupancy was calculated with a moving-window approach in which the quantitative measure for each amino acid is the number of identified spectra containing that amino acid. This is a form of spectral count at the amino-acid level. For phosphorylation, site specificity was kept separate from quantification. This means that, from a quantitative perspective, all possible phosphorylation sites were considered to be phosphorylated in all peptide sequences that were found phosphorylated. The occupancy was then calculated by dividing the amino-acid spectral count obtained from phosphorylated peptides only by the amino-acid spectral count for the matching amino acid when all peptide sequences were included.
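A minimal Python sketch of this per-residue calculation is given below. The input layout (0-based start position, peptide sequence, spectral count, phosphorylation flag) and the function name are assumptions made for illustration; the sketch treats every S, T or Y in a phosphorylated peptide as phosphorylated, as described above.

```python
def phospho_occupancy(protein, peptides):
    """Per-residue phosphorylation occupancy from amino-acid-level spectral counts.

    peptides is a list of (start, sequence, n_spectra, is_phospho) tuples,
    where start is the 0-based position of the peptide in the protein
    (hypothetical layout).
    """
    total = [0] * len(protein)     # spectra covering each residue
    phospho = [0] * len(protein)   # of those, spectra from phosphopeptides
    for start, seq, n_spectra, is_phospho in peptides:
        for i in range(start, start + len(seq)):
            total[i] += n_spectra
            if is_phospho:
                phospho[i] += n_spectra
    # Occupancy is reported only for covered S/T/Y residues.
    return {i: phospho[i] / total[i]
            for i, aa in enumerate(protein)
            if aa in "STY" and total[i] > 0}


if __name__ == "__main__":
    toy_protein = "AASAEEK"
    ids = [(0, "AASAEEK", 3, True), (1, "ASAEE", 1, False), (3, "AEEK", 5, False)]
    print(phospho_occupancy(toy_protein, ids))
```

In the toy example, the serine at position 2 is covered by three spectra from a phosphorylated peptide and one from an unmodified peptide, giving an occupancy of 3/4.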
Logo plots. Amino-acid sequence logo plots were created using IceLogo 1.2 (ref. 38). For the N- and C-terminal plots, the three up- or downstream amino acids were read in for each peptide. The same was done for LPVs or amidated LPVs. In the phosphorylation site plots, the alignment was performed based on the phosphorylated site. For all logo plots, scaling in IceLogo was done based on the amino-acid frequencies found in Mus musculus.
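Reading in the flanking residues for the terminal logo plots can be sketched as below. The function name is hypothetical, the precursor fragment is a toy sequence, and the sketch assumes that the first occurrence of the peptide in the precursor is the relevant one; it only prepares the input windows and does not reproduce IceLogo itself.

```python
def flanking_residues(protein, peptide, width=3):
    """Return (upstream, downstream) residues flanking a peptide in its precursor."""
    start = protein.find(peptide)          # first occurrence only (assumption)
    if start < 0:
        raise ValueError("peptide not found in protein")
    end = start + len(peptide)
    upstream = protein[max(0, start - width):start]   # before the N-terminus
    downstream = protein[end:end + width]             # after the C-terminus
    return upstream, downstream


if __name__ == "__main__":
    toy_precursor = "MAKRSYSMEHFRWGKPVGKKRR"   # toy fragment for illustration
    print(flanking_residues(toy_precursor, "SYSMEHFRWGKPV"))
```

Windows shorter than three residues are returned when the peptide lies close to either end of the precursor.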
Visualization file. For peptides mapping to pro-hormone precursors, visualization was done as follows. Protein sequences were downloaded from Uniprot and Ensembl for rat, mouse and human, with their accession identifier and file origin kept in a hyperlink. Sequences were aligned within each group as explained above. An in-house Perl script formatted the alignment such that each line was one sequence. Onto this alignment, the identified sequences were mapped first, followed by the LPVs and finally by the known reference peptides from the annotation effort described under 'Annotation of orthologous protein groups'. Modifications of possible in vivo origin, namely pyro-Glu, amidation, acetylation and phosphorylation, were included by adding a parenthesis and a modification name. To ease interpretation, everything was exported to Excel, where each entry was a sheet and where colouring was applied to differentiate the different types of sequences and to highlight modifications. Hyperlinks were used to link to the origin of sequences. The visualization result can be found in Supplementary Data 3.
GO analysis. GO enrichment analysis of the peptidome data set was performed using the human orthologues of protein entries identified by at least two peptides and compared with the filter-retained proteome as background with the innateDB software tool (http://www.innatedb.ca). Enriched pathways and GO terms for molecular function and cellular component were determined based on their corrected P-values, requiring at least ten input genes in a group. P-values were calculated using a hypergeometric test and corrected for multiple testing with the Benjamini–Hochberg procedure. A cutoff of P < 0.01 was applied and overrepresented terms were visualized as bars in Fig. 1d, where the length is calculated as −10 × log10(P).
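The statistics named above can be sketched with the Python standard library as follows. This is a generic illustration of a one-sided hypergeometric test, the Benjamini–Hochberg adjustment and the bar-length transform; it is not InnateDB's implementation, and the function names and example numbers are hypothetical.

```python
import math


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k): at least k annotated genes among n input genes,
    given K annotated genes in a background of N genes."""
    upper = min(n, K)
    hits = sum(math.comb(K, i) * math.comb(N - K, n - i)
               for i in range(k, upper + 1))
    return hits / math.comb(N, n)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):           # walk from the largest P-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def bar_length(p):
    """Bar length used for plotting: -10 * log10(P)."""
    return -10 * math.log10(p)


if __name__ == "__main__":
    raw = [hypergeom_pvalue(2, 3, 4, 10), 0.01, 0.04]   # toy P-values
    print(benjamini_hochberg(raw))
    print(bar_length(0.01))
```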
MC receptor assays. MC-binding assays were performed as radioligand displacement filtration binding assays to membranes from recombinant BHK570 cells expressing the relevant human melanocortin receptor using 125I-NDP-α-MSH as radioligand26. The IC50 values were calculated by nonlinear regression analysis of binding curves (six data points minimum) using the programme GraphPad Prism (GraphPad Software). Ki values were calculated from IC50 according to the Cheng–Prusoff equation. Each assay was performed in duplicate. Data points are average values from the assays and s.e.m. values are indicated with error bars (n = 3–9).
The functional activities of the analogues on the MC receptors were studied in intact BHK cells expressing the human MC receptors by measuring the cAMP production after stimulation with the analogues. The assay was performed in BHK570 cells expressing the MC receptors, which were stimulated with increasing concentrations of potential MC receptor agonists in a range of 10⁻¹¹ to 10⁻⁵ M, and the degree of stimulation of cAMP was measured using the Flash Plate cAMP assay kit (NEN Life Science Products cat no SMP004) using the supplied materials and buffers26. The assay was performed in duplicate and each compound was tested six times. EC50 values were calculated by nonlinear regression analysis of dose–response curves (six data points minimum) using GraphPad Prism (GraphPad Software; EC50 of α-MSH and phosphorylated α-MSH were 25 nM (n = 7) and 251 nM (n = 10)).
For statistical analysis, unpaired two-tailed t-tests were used to compare the pEC50 values and unpaired t-tests with Welch correction were used to compare the Emax values.
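The two conversions used in these assays, Ki from IC50 by the Cheng–Prusoff equation and pEC50 from EC50, can be sketched in Python as follows. The radioligand concentration and dissociation constant in the example are hypothetical placeholders, since they are not stated here; the EC50 values are those reported above.

```python
import math


def cheng_prusoff_ki(ic50, radioligand_conc, radioligand_kd):
    """Cheng-Prusoff: Ki = IC50 / (1 + [L] / Kd), all in the same concentration unit."""
    return ic50 / (1.0 + radioligand_conc / radioligand_kd)


def pec50(ec50_molar):
    """pEC50 is the negative base-10 logarithm of the EC50 in mol/l."""
    return -math.log10(ec50_molar)


if __name__ == "__main__":
    # Hypothetical radioligand settings, for illustration only.
    print(cheng_prusoff_ki(ic50=10.0, radioligand_conc=1.0, radioligand_kd=1.0))
    # EC50 values reported above: 25 nM (alpha-MSH) and 251 nM (phosphorylated).
    shift = pec50(25e-9) - pec50(251e-9)
    print(round(shift, 2))   # difference in log units between the two peptides
```

Comparing potencies on the pEC50 scale rather than on raw EC50 values is the usual choice because dose–response potencies are approximately log-normally distributed; a difference of one pEC50 unit corresponds to a tenfold change in EC50.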
Synthesis of phospho-α-MSH. Peptides were synthesized using standard Fmoc chemistry. Phospho-serine was introduced by using Fmoc-Ser(PO-(OBzl)OH)-OH (Novabiochem). The peptide was cleaved from the resin with 95% TFA, 2% triisopropylsilane (TIPS) and 2.5% water for 2 h and isolated by precipitation with ether. The crude peptides were purified by reverse-phase preparative HPLC using a C18 column (Waters XBridge PrepC18, 5 µm, 50 × 250 mm), with an acetonitrile/water 0.1% TFA gradient from 8 to 28% over 40 min. Fractions containing the pure peptide were collected and lyophilized.
References
1. Hokfelt, T., Bartfai, T. & Bloom, F. Neuropeptides: opportunities for drug discovery. Lancet Neurol. 2, 463–472 (2003).
2. Rubakhin, S. S., Romanova, E. V., Nemes, P. & Sweedler, J. V. Profiling metabolites and peptides in single cells. Nat. Methods 8, S20–S29 (2011).
3. Skold, K. et al. A neuroproteomic approach to targeting neuropeptides in the brain. Proteomics 2, 447–454 (2002).
4. Yamaguchi, H. et al. Peptidomic identification and biological validation of neuroendocrine regulatory peptide-1 and -2. J. Biol. Chem. 282, 26354–26360 (2007).
5. An, Z. M., Chen, Y. D., Koomen, J. M. & Merkler, D. J. A mass spectrometry-based method to screen for α-amidated peptides. Proteomics 12, 173–182 (2012).
6. Che, F. Y. et al. Identification of peptides from brain and pituitary of Cpe(fat)/Cpe(fat) mice. Proc. Natl Acad. Sci. USA 98, 9971–9976 (2001).
7. Fricker, L. D. et al. Identification and characterization of proSAAS, a granin-like neuroendocrine peptide precursor that inhibits prohormone processing. J. Neurosci. 20, 639–648 (2000).
8. Buchberger, A., Yu, Q. & Li, L. Advances in mass spectrometric tools for probing neuropeptides. Annu. Rev. Anal. Chem. (Palo Alto Calif.) 8, 485–509 (2015).
9. Theodorsson, E., Stenfors, C. & Mathe, A. A. Microwave irradiation increases recovery of neuropeptides from brain tissues. Peptides 11, 1191–1197 (1990).
10. Mathe, A. A., Stenfors, C., Brodin, E. & Theodorsson, E. Neuropeptides in brain: effects of microwave irradiation and decapitation. Life Sci. 46, 287–293 (1990).
11. Nylander, I., Stenfors, C., Tan-No, K., Mathe, A. A. & Terenius, L. A comparison between microwave irradiation and decapitation: basal levels of dynorphin and enkephalin and the effect of chronic morphine treatment on dynorphin peptides. Neuropeptides 31, 357–365 (1997).
12. Che, F. Y., Lim, J., Pan, H., Biswas, R. & Fricker, L. D. Quantitative neuropeptidomics of microwave-irradiated mouse brain and pituitary. Mol. Cell Proteomics 4, 1391–1405 (2005).
13. Svensson, M. et al. Heat stabilization of the tissue proteome: a new technology for improved proteomics. J. Proteome Res. 8, 974–981 (2009).
14. Falth, M. et al. Neuropeptidomics strategies for specific and sensitive identification of endogenous peptides. Mol. Cell Proteomics 6, 1188–1197 (2007).
15. Costa, E. P., Menschaert, G., Luyten, W., De Grave, K. & Ramon, J. PIUS: peptide identification by unbiased search. Bioinformatics 29, 1913–1914 (2013).
16. Altelaar, A. F., Mohammed, S., Brans, M. A., Adan, R. A. & Heck, A. J. Improved identification of endogenous peptides from murine nervous tissue by multiplexed peptide extraction methods and multiplexed mass spectrometric analysis. J. Proteome Res. 8, 870–876 (2009).
17. Powell, S. et al. eggNOG v4.0: nested orthology inference across 3686 organisms. Nucleic Acids Res. 42, D231–D239 (2014).
18. Falth, M. et al. SwePep, a database designed for endogenous peptides and mass spectrometry. Mol. Cell Proteomics 5, 998–1005 (2006).
19. Broberger, C., Johansen, J., Johansson, C., Schalling, M. & Hokfelt, T. The neuropeptide Y/agouti gene-related protein (AGRP) brain circuitry in normal, anorectic, and monosodium glutamate-treated mice. Proc. Natl Acad. Sci. USA 95, 15043–15048 (1998).
20. Wang, Y. et al. NeuroPep: a comprehensive resource of neuropeptides. Database (Oxford) 2015, bav038 (2015).
21. Mooney, C., Haslam, N. J., Pollastri, G. & Shields, D. C. Towards the improved discovery and design of functional peptides: common features of diverse classes permit generalized prediction of bioactivity. PLoS ONE 7, e45012 (2012).
22. Southey, B. R., Amare, A., Zimmerman, T. A., Rodriguez-Zas, S. L. & Sweedler, J. V. NeuroPred: a tool to predict cleavage sites in neuropeptide precursors and provide the masses of the resulting peptides. Nucleic Acids Res. 34, W267–W272 (2006).
23. Lindberg, I. & Hutton, J. C. in Peptide Biosynthesis and Processing (ed. Fricker, L. D.) (CRC Press, 1991).
24. Bradbury, A. F., Finnie, M. D. & Smyth, D. G. Mechanism of C-terminal amide formation by pituitary enzymes. Nature 298, 686–688 (1982).
25. Tatemoto, K., Carlquist, M. & Mutt, V. Neuropeptide Y--a novel brain peptide with structural similarities to peptide YY and pancreatic polypeptide. Nature 296, 659–660 (1982).
26. Conde-Frieboes, K. et al. Identification and in vivo and in vitro characterization of long acting and melanocortin 4 receptor (MC4-R) selective alpha-melanocyte-stimulating hormone (alpha-MSH) analogues. J. Med. Chem. 55, 1969–1977 (2012).
27. Tagliabracci, V. S. et al. Secreted kinase phosphorylates extracellular proteins that regulate biomineralization. Science 336, 1150–1153 (2012).
28. Salvi, M., Cesaro, L., Tibaldi, E. & Pinna, L. A. Motif analysis of phosphosites discloses a potential prominent role of the Golgi casein kinase (GCK) in the generation of human plasma phospho-proteome. J. Proteome Res. 9, 3335–3338 (2010).
29. Bahl, J. M., Jensen, S. S., Larsen, M. R. & Heegaard, N. H. Characterization of the human cerebrospinal fluid phosphoproteome by titanium dioxide affinity chromatography and mass spectrometry. Anal. Chem. 80, 6308–6316 (2008).
30. Zhou, W. et al. An initial characterization of the serum phosphoproteome. J. Proteome Res. 8, 5523–5531 (2009).
31. Schwartz, M. W., Woods, S. C., Porte, Jr D., Seeley, R. J. & Baskin, D. G. Central nervous system control of food intake. Nature 404, 661–671 (2000).
32. Olsen, J. V. et al. Quantitative phosphoproteomics reveals widespread full phosphorylation site occupancy during mitosis. Sci. Signal. 3, ra3 (2010).
33. Olsen, J. V. & Mann, M. Status of large-scale analysis of post-translational modifications by mass spectrometry. Mol. Cell Proteomics 12, 3444–3452 (2013).
34. Olsen, J. V. et al. A dual pressure linear ion trap Orbitrap instrument with very high sequencing speed. Mol. Cell Proteomics 8, 2759–2769 (2009).
35. Cox, J. & Mann, M. MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nature Biotechnol. 26, 1367–1372 (2008).
36. Vizcaino, J. A. et al. ProteomeXchange provides globally coordinated proteomics data submission and dissemination. Nature Biotechnol. 32, 223–226 (2014).
37. Edgar, R. C. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32, 1792–1797 (2004).
38. Colaert, N., Helsens, K., Martens, L., Vandekerckhove, J. & Gevaert, K. Improved visualization of protein consensus sequences by iceLogo. Nat. Methods 6, 786–787 (2009).
Acknowledgements
Work at the Center for Protein Research is supported financially by the Novo Nordisk Foundation (Grant agreement NNF14CC0001). This work was also supported by the Sapere Aude Research Leader grant to J.V.O. Part of this work has been funded by PRIME-XS, a seventh Framework Programme of the European Union (contract no. 262067-PRIME-XS). The peptidomics technology developments applied here are part of a project that has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 686547. A.S. was supported by a STAR postdoctoral fellowship from Novo Nordisk to work at the Center for Protein Research.
Author contributions
A.S., C.D.K., C.P., K.R., B.S.W. and J.V.O. conceived and devised the experiments. K.W.C.-F. prepared synthetic peptides. B.S.W. carried out the cell-based assays.
A.S., C.D.K. and J.V.O. performed the mass spectrometric experiments and analysed the data. A.S., C.D.K. and J.V.O. wrote the manuscript, with discussion and input from all authors.
Additional information
Accession codes: The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository36 with the dataset identifier PXD002431.
Supplementary Information accompanies this paper at http://www.nature.com/naturecommunications
Competing financial interests: Novo Nordisk has commercial interest in neuropeptides for the treatment of diabetes and obesity. A.S., K.W.C.-F., C.P., K.R. and B.S.W. are full-time employees of Novo Nordisk and hold minor share portions as part of their employment. J.V.O. consults for and has research funding from Novo Nordisk. C.D.K. consults for Novo Nordisk.
Reprints and permission information is available online at http://npg.nature.com/reprintsandpermissions/
How to cite this article: Secher, A. et al. Analytic framework for peptidomics applied to large-scale neuropeptide identification. Nat. Commun. 7:11436 doi: 10.1038/ncomms11436 (2016).
This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
Copyright Nature Publishing Group May 2016