Peptidomics for Studying Limited Proteolysis - Journal of Proteome

Article

Peptidomics for Studying Limited Proteolysis
Takashi Tsuchiya, Tsukasa Osaki, Naoto Minamino, and Kazuki Sasaki
J. Proteome Res., Just Accepted Manuscript • DOI: 10.1021/acs.jproteome.5b00820 • Publication Date (Web): 13 Oct 2015

Journal of Proteome Research is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.

Journal of Proteome Research

Peptidomics for Studying Limited Proteolysis

Takashi Tsuchiya1, Tsukasa Osaki1†, Naoto Minamino1, Kazuki Sasaki1*

Department of Molecular Pharmacology, National Cerebral and Cardiovascular Center, Osaka 565-8565, Japan

ACS Paragon Plus Environment

ABSTRACT: Limited proteolysis is a pivotal mechanism regulating protein functions. Identifying physiologically or pathophysiologically relevant cleavage sites helps to develop molecular tools that can be used for diagnostics or therapeutics. During proteolysis of secretory and membrane proteins, part of the cleaved protein is liberated and destined to undergo degradation but should retain original cleavage sites created by proteolytic enzymes. We profiled endogenous peptides accumulated for four hours in media conditioned by primary cultured rat cardiac fibroblasts. A total of 3916 redundant peptide sequences from 94 secretory proteins and membrane proteins served to identify limited cleavage sites, both annotated and unannotated, for signal peptide or propeptide removal, peptide hormone processing, ectodomain shedding and regulated intramembrane proteolysis. This peptide profiling also elucidated cleavage sites different from annotated ones, mostly with signal peptide cleavage, for typical proteins including extracellular matrix proteins and the peptide hormone precursor ADM. The revealed signal peptide cleavage site for ADM was experimentally verified by identifying the major molecular form of flanking proadrenomedullin N-terminal peptide. We suggest that profiling of endogenous peptides, like transcriptome sequence reads, makes sense in regular cells like fibroblasts, and that peptidomics provides insight into proteolysis-regulated protein functions.

KEYWORDS: ADM/adrenomedullin, cardiac fibroblasts, ectodomain shedding, endogenous peptide, peptidomics, limited proteolysis, peptide profiling, regulated intramembrane proteolysis, secretome

INTRODUCTION

Limited proteolysis is an irreversible but important cellular mechanism by which a precursor protein becomes functionally active.1-3 Identifying biologically relevant cleavage sites helps to develop molecular tools that can be used for diagnostics or therapeutics. Distinct from degradative proteolysis, limited proteolysis involves protein cleavage at specific sites by converting/processing enzymes. Several proteomic approaches have been proposed to identify the substrates of a given protease and their exact cleavage sites.4-5 However, these strategies do not deal with the in vivo limited proteolysis taking place on intact proteins in the presence of a variety of proteases and endogenous protease inhibitors. Hence, cell culture models would provide better insight into in vivo proteolysis events.

The most straightforward approach to cleavage site determination would be to analyze and identify cleaved polypeptides in their native form. The peptides liberated from secretory proteins and transmembrane proteins are released extracellularly and can be harvested noninvasively in a cell culture system for sequencing. We previously profiled endogenous peptides quickly discharged from secretory vesicles of endocrine cells.6 Even though the peptide snapshot consisted of an abundance of truncated peptides derived from first-hand, bona fide processing products, we were able to confirm previously described processing sites of secretory proteins including peptide hormone precursors, processing enzymes and granin-like proteins. This peptide profiling, assisted by functional assays, led to the discovery of biologically active C-terminally amidated peptides from several secretory proteins that have not been thoroughly studied in terms of precursor processing.7-10

In multicellular organisms, cells having secretory vesicles, such as endocrine cells, represent a very limited population. Most cell types lack these organelles, secreting peptides
and proteins constantly. Some of them are now considered significant producers of biologically active peptides, as evidenced by vascular endothelial cells secreting endothelin-1 and C-type natriuretic peptide.11,12 It is therefore critical to extend peptidomic applications beyond endocrine cells. However, it has long been thought that when applied to regular cells, peptide profiling would not yield significant findings because peptides are prone to degradation once liberated from parent proteins. Previous profiling work, conducted on a human cell line and rat primary cultured cells,13,14 did not strongly support the importance and utility of endogenous peptide profiling, partly because of technical difficulty in MS identification of endogenous peptides.

In the present study we used primary cultured cardiac fibroblasts as a model system. By adopting MS/MS parameters that address issues specifically associated with endogenous peptide identification, we provide one of the most comprehensive peptidomic datasets for regular cells devoid of secretory vesicles. Our data helped to identify biologically relevant limited proteolysis sites for signal peptide removal, propeptide removal, peptide hormone processing, and domain shedding, demonstrating that peptidomics helps to gain insight into proteolysis-regulated protein functions.

EXPERIMENTAL PROCEDURES

Sample preparation

Non-cardiomyocytes were prepared from Day 1 rat neonatal hearts as previously described,15 passaged once and grown in DMEM supplemented with 10% fetal bovine serum (Invitrogen) in a 10-cm dish. Four biological replicates were prepared on different days. The confluent culture was rinsed with serum-free DMEM and cultured for a further four hours to obtain
conditioned medium. After centrifugation, the supernatant was extracted on an RP-1 solid-phase extraction cartridge (GL Sciences, Japan) and eluted in 60% acetonitrile (ACN), 0.1% trifluoroacetic acid (TFA) as described.6 After lyophilization, samples were dissolved in 60% ACN, 0.1% TFA and applied to a TSK gel G2000SWXL gel-filtration column (21.5 x 300 mm; TOSOH, Japan) equilibrated with the same solution. The resultant peptide-rich fractions were analyzed by LC-MS/MS.

LC-MS/MS analysis

LC-MS/MS was conducted on a NanoFrontier system (Hitachi, Japan) connected online with an Orbitrap XL mass spectrometer (Thermo Fisher Scientific, CA). About 500 ng of the peptide mixture was loaded onto a trap column and separated on a MonoCap C18 HighResolution column (0.075 x 750 mm, GL Sciences). A linear gradient of 5% to 40% acetonitrile in the presence of 0.1% formic acid over 5 to 6 h was used, with a flow rate of 200 nL/min. A protonated ion of polycyclodimethylsiloxane (m/z 445.1200) was used for internal calibration. After a full scan (m/z 400–1300), the six most intense ions were chosen for MS/MS. Scans were all recorded in the Orbitrap with a resolution of 60,000 at m/z 400. Ions were isolated with an isolation window of 5 m/z units and placed on a dynamic exclusion list for 2400 s after being selected for at least two MS/MS scans. Singly charged precursors were excluded. Monoisotopic selection was disabled, with an exclusion window setting of 1.2 m/z. Automatic gain control was used to accumulate sufficient fragment ions (MS/MS target value, 2E5; maximum injection time, 750 ms). ETD MS/MS was used to identify modification sites of phosphorylated peptides as described.10 Briefly, the three most intense precursor ions were subjected to MS/MS with an ETD activation time of 90 ms for doubly charged ions, with charge-dependent activation time and supplemental activation enabled.
ETD spectra were acquired using one microscan per spectrum. Automatic gain control parameters used were target values of 5E5 and 1E6 for MS/MS and fluoranthene, respectively, with a maximum fill time of 1500 ms. One preparation (Replicate 1) was subjected to reduction and alkylation with dithiothreitol and iodoacetamide, but the remaining three were not.
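
To make the acquisition window concrete, the following sketch (an illustration, not part of the original method) computes the m/z of a peptide at several charge states and reports which of them fall inside the m/z 400–1300 full scan, skipping singly charged ions as the method does. The function name, the upper charge limit of 8 and the example masses are assumptions chosen for illustration.

```python
PROTON_MASS = 1.007276  # Da, mass of a proton


def observable_charge_states(neutral_mass, mz_min=400.0, mz_max=1300.0, max_charge=8):
    """Return {charge: m/z} for charge states that fall inside the full-scan window.

    Singly charged precursors are skipped because the method excludes them.
    max_charge is an arbitrary upper limit assumed for this sketch.
    """
    states = {}
    for z in range(2, max_charge + 1):
        mz = (neutral_mass + z * PROTON_MASS) / z
        if mz_min <= mz <= mz_max:
            states[z] = round(mz, 4)
    return states


if __name__ == "__main__":
    # Example neutral masses (Da), picked only to span small and large peptides.
    for mass in (1500.0, 3000.0, 5726.0):
        print(mass, observable_charge_states(mass))
```

Under these assumptions a peptide of about 5.7 kDa enters the scan window only at charge 5 or higher, which is one reason why large endogenous peptides are seen as highly charged ions.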

Data analysis and peptide identification

Peak lists were generated with Mascot Distiller (version 2.1.1.0) using a default parameter set for Orbitrap (high resolution for both MS1 and MS2). Mascot (version 2.4) was used to search the deconvoluted MS/MS spectra, with no enzyme specification. The precursor mass tolerance was 2 ppm and the product ion mass tolerance was 50 mmu. Peptides with a Mascot score above the identity threshold (corresponding to an expectation value below 0.05) were considered identified. To help identify disulfide-linked peptides, we introduced Cys-H as a variable modification, in which every cysteine was given the monoisotopic value minus 1.0078. Data were first searched against the UniProt rat database (SwissProt 2014_07, 7,920 sequences), using four variable modifications: N-terminal acetylation (N-acetyl), C-terminal amidation (C-amide), pyroglutamylation (N-term Q) and Cys-H. No-hit queries after the initial search were searched with each of the following parameter sets: 1) N-acetyl, C-amide, N-term Q, oxidized M, hydroxylated K, hydroxylated P, and Cys-H; 2) N-acetyl, C-amide, N-term Q, Cys-H, and S or T phosphorylation. The rest of the queries were searched against the NCBI rat database (NCBInr 20140801, 84,256 sequences). Any precursor was excluded from the identification lists if represented only by a single peptide identified only once in the analysis.
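
The two mass settings above can be illustrated with a short sketch. It converts the 2 ppm precursor tolerance into an absolute window and applies the Cys-H shift of 1.0078 per cysteine. This is not Mascot's implementation; the function names are hypothetical, and the cysteine residue mass of 103.00919 Da is the standard monoisotopic value rather than a figure taken from the text.

```python
CYS_RESIDUE_MASS = 103.00919  # Da, monoisotopic mass of an unmodified cysteine residue
CYS_H_DELTA = -1.0078         # Da, per-cysteine shift quoted in the text for Cys-H


def ppm_to_da(mass, ppm):
    """Convert a relative tolerance in ppm to an absolute tolerance in Da."""
    return mass * ppm / 1e6


def within_tolerance(observed, theoretical, ppm=2.0):
    """True if the observed mass lies within +/- ppm of the theoretical mass."""
    return abs(observed - theoretical) <= ppm_to_da(theoretical, ppm)


def cys_h_shift(n_cysteines):
    """Total mass shift when each of n cysteines carries the Cys-H modification."""
    return n_cysteines * CYS_H_DELTA


if __name__ == "__main__":
    print(ppm_to_da(3000.0, 2.0))           # absolute window for a 3000 Da precursor
    print(CYS_RESIDUE_MASS + CYS_H_DELTA)   # residue mass of Cys-H
    print(cys_h_shift(2))                   # two cysteines joined by one disulfide bond
```

At 2 ppm, a 3000 Da precursor must match to within a few millidaltons, so the roughly 2 Da loss from one disulfide bond would prevent a match unless it is modeled as a modification.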

Immunoprecipitation using anti-PAMP antibody

Conditioned media of primary cultured cardiac fibroblasts (passaged once, approximately 3 × 10^7 cells) were extracted with RP-1 cartridges and eluted in 60% ACN/0.1% TFA. Lyophilized samples were reconstituted in 500 µL of PBS at pH 7.4 and incubated overnight at 4 °C with 2 µL of PAMP C-terminal antibody (provided by Dr. Kazuo Kitamura, Miyazaki University School of Medicine, Japan). Subsequently, the solution was mixed with 20 µL of a slurry of protein A-Sepharose CL-4B (Amersham) for 2 hours at 4 °C. The beads were washed four times with PBS, followed by addition of 30 µL of 0.2% formic acid (FA) to release bound peptides. The eluate was applied to a preconditioned C-Tip for desalting. The bound material was washed with 3% ACN/0.1% FA and then eluted in 20 µL of 30% ACN/0.1% FA. The eluate was concentrated with a SpeedVac and applied to a MALDI target plate. After application of α-cyano-4-hydroxycinnamic acid matrix, the plate was analyzed on a 5800 TOF-TOF mass analyzer (AB Sciex). MS spectra were acquired in positive reflector mode in the m/z range from 900 to 5000.

RESULTS AND DISCUSSION

Profile of Secretory Peptides from Primary Cultured Cardiac Fibroblasts

The conditioned medium from the fibroblast culture was extracted with a solid-phase extraction column and further separated by gel filtration to isolate peptide-rich fractions for LC-MS/MS. The contribution of contaminating cardiomyocytes was considered negligible in the preparations, as no peptides were identified from major cardiomyocyte proteins such as
natriuretic peptides A (ANF) and myosin heavy chain polypeptides 6 and 7. Furthermore, no signals with m/z values corresponding to the 28-aa intact ANF-derived atrial natriuretic peptide were found in the spectra. A representative base peak chromatogram of one of the four preparations indicated that thymosins and their oxidized forms were dominant in the conditioned medium (Figure 1). Most of the sequences detected as major peaks were from proteins with a signal peptide (Table 1). Thymosins are non-classical secretory peptides lacking signal peptides. The mesenchymal marker vimentin (VIME) is mostly considered an intracellular protein, but its secretion and biological consequences have been described.16 The identification of peptides derived from extracellular matrices (ECMs) as well as ADM (ADML) and VIME is consistent with the analyte being of fibroblast origin.

The most important technical challenge in peptidomics is identifying peptides larger than 3,000 Da. The difficulty stems at least in part from the inability of data-dependent programs to trigger MS/MS on monoisotopic ions for such peptides: when detected as multiply charged ions, the monoisotopic ions are not the dominant signals in an isotope envelope. Instead, we selected the most intense peak in an isotope envelope for MS/MS. To avoid performing MS/MS on peaks adjacent to the top peak within the same envelope, we applied an exclusion width of 1.2 m/z. Since this MS/MS setting does not provide precursor monoisotopic m/z values, we used Mascot Distiller to determine monoisotopic m/z values as well as to deconvolute MS/MS spectra, which contain a mixture of fragment ions with different charge states. These improved parameters advanced the identification of relatively large peptides, with nearly 50% of unique sequences being larger than 3000 Da (Figure 2).
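
The behavior of the monoisotopic peak can be illustrated with a rough model; this is not the procedure used by the instrument software or by Mascot Distiller. The sketch approximates the isotope envelope with a Poisson distribution that counts only 13C atoms. It assumes the averagine carbon content (4.9384 carbons per 111.1254 Da) and a 13C natural abundance of 1.07%, neither of which comes from the text. All function names are hypothetical.

```python
import math

# Assumptions (not from the paper): averagine carbon content and 13C abundance.
CARBONS_PER_DA = 4.9384 / 111.1254
C13_ABUNDANCE = 0.0107


def isotope_envelope(neutral_mass, n_peaks=6):
    """Approximate relative isotope peak intensities with a Poisson model.

    Only 13C is considered, which slightly underestimates the heavier peaks.
    Index 0 is the monoisotopic peak.
    """
    lam = neutral_mass * CARBONS_PER_DA * C13_ABUNDANCE  # expected number of 13C atoms
    return [math.exp(-lam) * lam**k / math.factorial(k) for k in range(n_peaks)]


def most_intense_isotope(neutral_mass):
    """Index of the tallest peak in the approximate envelope (0 = monoisotopic)."""
    envelope = isotope_envelope(neutral_mass)
    return envelope.index(max(envelope))


def isotope_spacing(charge):
    """m/z distance between neighbouring isotope peaks for a given charge state."""
    return 1.00335 / charge  # 13C - 12C mass difference divided by the charge


if __name__ == "__main__":
    for mass in (1500.0, 3500.0, 5726.0):
        print(mass, most_intense_isotope(mass))
    print(isotope_spacing(3))
```

In this approximation the monoisotopic peak is the tallest at 1500 Da but not at 3500 Da, in line with the difficulty described for peptides above 3,000 Da. For a triply charged ion the isotope peaks are about 0.33 m/z apart, so several neighbours of the selected peak lie within a span of 1.2 m/z.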

Using Peptide Sequence Reads to Identify Limited Proteolysis Sites

A total of 3,916 redundant sequences arising from 94 proteins, obtained through LC-MS/MS of four biological replicates, were used to review extracellular cleavage sites of the identified proteins (Supplementary Tables 1 and 2). For this analysis, peptides with oxidized Met, hydroxylated Pro, or hydroxylated Lys were excluded because in most cases unmodified peptides were sequenced as well. UniProt entries are still limited, with 36 proteins found only in NCBI. We found that 84 proteins are annotated to have signal peptides in UniProt or NCBI entries, of which only cathepsin L1 (CATL1) has an experimentally verified signal peptide-propeptide boundary.17 Four type II membrane proteins, integral membrane protein 2B (ITM2B), ITM2C, tumor necrosis factor ligand superfamily member 12 (TNF12), and xylosyltransferase 1 (XYLT1), were represented by a set of peptides attributed to their extracellular domains. Peptide sequences sharing cleavage sites at either or both termini were obtained across different culture preparations (Figures 3 and 4A). Aligned to parent protein sequences, they marked biologically relevant limited cleavage sites. We were thus able to confirm established cleavage sites for signal peptide or propeptide removal, peptide hormone processing, ectodomain shedding and regulated intramembrane proteolysis (Figures 3 to 6). Of note, our data also revealed incorrectly annotated or previously uncharacterized sites associated with limited proteolysis.
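
The alignment step described above can be sketched as follows. This is a minimal illustration and not the authors' pipeline: each peptide read is located in its parent protein, and shared N-terminal and C-terminal positions are counted, so that a recurring terminus marks a candidate cleavage site. The function names, the toy protein and the toy reads are all invented for this example.

```python
from collections import Counter


def map_reads(protein, reads):
    """Locate each peptide read in the protein and return 1-based (start, end) pairs.

    Reads that do not occur in the protein are skipped; only the first
    occurrence of each read is used.
    """
    positions = []
    for read in reads:
        idx = protein.find(read)
        if idx >= 0:
            positions.append((idx + 1, idx + len(read)))
    return positions


def tally_termini(positions):
    """Count how many reads share each N-terminal start and C-terminal end."""
    starts = Counter(start for start, _ in positions)
    ends = Counter(end for _, end in positions)
    return starts, ends


if __name__ == "__main__":
    # Toy protein and reads, invented for illustration only.
    protein = "MKLLAVALLGASAQDEPTGKRSWQEVNPTLRR"
    reads = ["QDEPTGKR", "QDEPTG", "QDEPTGKRSWQ", "EPTGKR", "SWQEVNPTL"]
    starts, ends = tally_termini(map_reads(protein, reads))
    print(starts.most_common(1), ends.most_common(1))
```

In the toy data, three of five reads start at the same residue, which is the kind of recurring terminus the analysis treats as a limited cleavage site. Reads with scattered termini are more consistent with subsequent degradation.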

Signal Peptide Cleavage Sites

The N-terminus of a mature protein is in most cases not experimentally determined. In UniProt, signal peptides are hence annotated as “potential” (manual assertion according to rules) or “by similarity” (manual assertion inferred from sequence similarity). Hereafter we refer to “potential” and “by similarity” annotations as “predicted”. Repeated identification of
the same peptide from different preparations and different LC-MS/MS runs elucidated the boundary between the signal peptide and either the propeptide or mature protein N-terminus for some proteins (Table 2). In some cases, the cleavage sites we found differed from those predicted, but they still fit the substrate specificity formulated as the (-3, -1) rule for eukaryotic signal peptidase.18,19
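
A check against the (-3, -1) rule can be sketched as below. The allowed residue sets are a simplified reading of the rule (small residues at position -1; no aromatic, charged, large polar or proline residues at position -3) and are an assumption of this example, since published formulations differ in detail. The function name and the toy sequence are hypothetical.

```python
# Simplified residue sets for the (-3, -1) rule; an assumption for illustration.
SMALL_MINUS_1 = set("AGSCTQ")
ALLOWED_MINUS_3 = set("AGSCTVILM")


def fits_minus3_minus1_rule(protein, mature_start):
    """Check a candidate signal peptide cleavage site.

    mature_start is the 1-based position of the first residue after cleavage,
    so position -1 is residue mature_start - 1 and position -3 is mature_start - 3.
    """
    if mature_start < 4 or mature_start > len(protein):
        return False
    minus_1 = protein[mature_start - 2]   # 0-based index of position -1
    minus_3 = protein[mature_start - 4]   # 0-based index of position -3
    return minus_1 in SMALL_MINUS_1 and minus_3 in ALLOWED_MINUS_3


if __name__ == "__main__":
    toy = "MKLLAVALLGASAQDEPTGK"  # invented sequence, not a real precursor
    print(fits_minus3_minus1_rule(toy, 14))  # cleavage between A13 and Q14
    print(fits_minus3_minus1_rule(toy, 16))  # cleavage between D15 and E16
```

A site that passes this filter is only compatible with signal peptidase specificity. The evidence for each site in this study remains the repeated observation of the same N-terminus.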

ECM proteins - Signal peptide cleavage sites predicted for elastin (ELN), decorin (PGS2) and extracellular matrix protein 1 (ECM1) were confirmed by the peptide alignment maps (Supplementary Figure 1). ELN is the precursor from which the greatest number of redundant sequences (800 reads) was obtained (Table 2). A total of 49 reads all started with Gly-28, consistent with the annotated signal peptide being 27 aa long. Fewer sequences were attributed to PGS2 and ECM1, but they also marked the previously annotated sites. In contrast, we noted that signal peptides have been incorrectly annotated for collagen alpha-2(I) chain (CO1A2), fibronectin (FINC), fibrillin-1 (FBN1), biglycan (PGS1), and coiled-coil domain-containing protein 80 (CCD80). In UniProt entry CO1A2_RAT, the predicted N-terminus is Leu-25. Instead, the 139 identified sequences show that its N-terminus starts at Gln-23, two aa upstream of the annotated site (Figure 3). Larger peptides were seen in Replicates 2 and 3 than in the others, which could be ascribed to the greater number of sequence reads obtained from these replicates (Table 2). In the endogenous peptide preparation, larger peptides gave lower signals and were more difficult to select in a data-dependent LC-MS/MS mode. Five sequences were located in the N-terminal part of the signal peptide, which indicates the release of immature collagen polypeptides because of limited cell deterioration during serum-free culture. Taken together, we concluded that the authentic signal peptide is 22 aa long.

In UniProt entry FINC_RAT, a 32-aa peptide is given as a de facto signal peptide with no specific reference. Our data, however, indicate that the N-terminus starts at Thr-25, eight aa upstream of the annotated site (Supplementary Figure 1). A conflict was also noted with FBN1, for which a 27-aa peptide is predicted by SignalP (http://www.cbs.dtu.dk/services/SignalP/). Our data indicate a shorter signal peptide of 24 aa, which was demarcated by the 17 sequence reads starting at Ala-25. Unlike CO1A2 or FINC, PGS1 was represented only by N-terminal peptides (Supplementary Figure 1). In total, 13 of the 148 obtained sequences started with Glu-20, consistent with the putative signal peptide being 19 aa long (Table 2). However, 81 of the 148 sequences started with Leu-17, and 31 started with Phe-19. These data suggest heterogeneity in signal peptide cleavage, with the major cleavage occurring between Ala-16 and Leu-17. The C-terminal cleavage site of the PGS1 propeptide, elucidated by the peptide map, supports the previous prediction that it would end with Asn-37.20 The CCD80 signal peptide also appears to be heterogeneous. In our data, 140 of the 348 identified sequences started with Leu-28, which is in agreement with the UniProt annotation. In support of this observation, five different sequences sharing the N-terminal leucine represented distinct peaks (Table 1). However, appreciable numbers of sequence reads suggest upstream N-termini; 71 sequences started with Ser-23 and 64 with Gln-24.
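
The read counts quoted above for PGS1 and CCD80 can be turned into shares of the total to compare the competing N-termini. The sketch below only restates those counts as fractions. The function name is hypothetical, and the "other" entry is simply the remainder of reads not assigned to a listed start site.

```python
def start_site_fractions(counts, total):
    """Return the share of reads for each N-terminal residue, plus the remainder."""
    fractions = {site: n / total for site, n in counts.items()}
    fractions["other"] = (total - sum(counts.values())) / total
    return fractions


if __name__ == "__main__":
    # Read counts quoted in the text for biglycan (PGS1) and CCD80.
    pgs1 = start_site_fractions({"Leu-17": 81, "Phe-19": 31, "Glu-20": 13}, 148)
    ccd80 = start_site_fractions({"Leu-28": 140, "Ser-23": 71, "Gln-24": 64}, 348)
    for name, result in (("PGS1", pgs1), ("CCD80", ccd80)):
        print(name, {site: f"{share:.1%}" for site, share in result.items()})
```

With these counts, Leu-17 accounts for about 55% of the PGS1 reads and Leu-28 for about 40% of the CCD80 reads, so the dominant site is clear for PGS1 and less so for CCD80.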

Secreted or membrane-associated enzymes - Signal peptide cleavage sites predicted for 72 kDa type IV collagenase (MMP2), disintegrin and metalloprotease domain-containing protein 17 (ADA17), ADA9, proprotein convertase subtilisin/kexin type 6 (PCSK6), CATL1, protein-lysine 6-oxidase (LYOX), lysyl oxidase-like 1 (LOXL1), and N-acetylglucosamine-1-phosphodiester alpha-N-acetylglucosaminidase (NAGPA) coincided with our profile data
(Table 2 and Supplementary Figure 1). Apart from ADA9, these enzymes have the propeptide region adjacent to the signal peptide. In the present study, the predicted boundary was marked and confirmed for them by the identified sequences. The signal peptide of ADA10 predicted by SignalP is 18 aa long. In our peptide profile data, however, only 2 of the 38 identified sequences started with Gly-19; 19 of the 38 started with Gln-20, which underwent pyroglutamylation. In either case, the identification of ADA10 and ADA17 propeptides indicates that the principal sheddases were activated in the system.21 Apart from signal peptide cleavage, previously unannotated LOXL1[26-95] and NAGPA[25-48] were considered bona fide propeptides on the basis of corresponding annotations for human and mouse homologs (Supplementary Figure 1). Also noteworthy is the observation that the cleavage between Asn-66 and Leu-67 was markedly identified for MMP2. This cleavage is made by MMP14 and is the initial step in MMP2 activation.22 MMP14 was not represented by sequences deemed adjacent to the signal peptide, but sequences within the propeptide region were obtained (Supplementary Table 1 and Supplementary Figure 2). A disintegrin and metalloproteinase with thrombospondin motifs 2 (ATS2) is annotated only in NCBI (gi|212549639), in which a 29-aa signal peptide is predicted. However, the actual N-terminus of the propeptide seemed to be demarcated by the 26 sequences from three different samples that all started with Thr-43 (Supplementary Figure 1). None of them were N-terminally extended, suggesting that its signal peptide is 42 aa long. The signal peptide annotation for ATS5 (gi|38454260), predicted by SignalP, also conflicted with the sequence data.

Type I membrane proteins - Signal peptide annotations for cadherin 3 (CADH3), FXYD domain-containing ion transport regulator 6 (FXYD6), plexin domain-containing protein 2
(PXDC2), and sortilin (SORT) were identical to those highlighted by our data (Table 2 and Supplementary Figure 1). In the heart, CAD11 is a cell adhesion molecule specifically expressed in fibroblasts.23 Annotated entries are found only in NCBI (gi|281427192) but with no description of signal peptide or propeptide. Meanwhile, our data suggest that the signal peptide is cleaved between Ala-22 and Phe-23, which is supported by SignalP. In addition, all the sequences ended with Ser-51, which corresponds to the annotated mouse CAD11 propeptide. Since this cleavage site is C-terminally flanked by Lys-52 and Arg-53, rat CAD11[23-51] is considered a bona fide propeptide. However, our findings with P-selectin glycoprotein ligand 1 (SELPL) and tumor necrosis factor receptor superfamily member 6 (TNR6) were not in agreement with the database annotation (Supplementary Figure 1). Highlighted by the identification of SELPL[19-39], the authentic signal peptide seemed to be one aa shorter than that predicted by SignalP. As with other type I membrane proteins, its C-terminus fits the consensus motif of furin-like proteases such as PCSK5, PCSK6 or furin.24 TNR6, commonly known as FAS receptor, is a cell surface death receptor linking to one of the two major apoptosis pathways. Its signal peptide seems to be incorrectly annotated in UniProt (gi|6015131) and NCBI (gi|59624977), where the predicted signal peptide is 21 aa long. However, this unbiased study consistently retrieved TNR6[14-37] in three biological replicates, consistent with the prediction by SignalP (Supplementary Figure 1).

Secreted proteins - Signal peptide cleavage sites annotated for ADM2, augurin (AUGN), endothelin-1 (EDN1), latent-transforming growth factor beta-binding protein 2 (LTBP2), and fibronectin type III domain-containing protein 1 (FNDC1) were consistent with our sequence data (Table 2 and Supplementary Figure 1). As for angiopoietin-related protein 2 (ANGL2)
(gi|149038973), a 20-aa signal peptide is suggested by SignalP, but 30 of the 52 sequences started with Ala-20, and 19 sequences with Thr-21.

Peptide hormone precursor processing - As inferred by homology to human ADML, the rat precursor is assumed to undergo limited proteolysis to four major processing products, two of which are biologically active peptides designated proadrenomedullin N-terminal peptide (PAMP) and adrenomedullin.25 Aside from 50-aa C-terminally amidated adrenomedullin,26 no endogenous peptides have been biochemically characterized from the rat precursor. Despite adrenomedullin being a major peptide secreted by cardiac fibroblasts,25 ADML peptides escaped a previous secretome study of mouse cardiac fibroblasts.27 In the present study, a total of 139 redundant sequences were obtained and aligned to the precursor (Fig. 3A). The peptide map shows a processing pattern similar to human ADML. Intact adrenomedullin (50 aa, 5726 Da) has a single intramolecular disulfide bond, but was verified by MS/MS in its intact form (Supplementary Figure 4). The signal cleavage site demarcated by the actual PAMP sequences was different from that described in UniProt entry ADML_RAT (Figure 4A). Because of sequence homology to the human N-terminal sequence, rat PAMP has been assumed to be a 20-aa C-terminally amidated peptide without being biochemically characterized. In our analysis, however, it was invariably identified as either a 22-aa or a 23-aa peptide. To see if our parameter setting failed to trigger MS/MS for the 20-aa peptide, we extracted mass chromatograms for this peptide, but m/z values corresponding to the 20-aa form were not detected in the spectra (data not shown). To determine the major molecular form of rat PAMP, we immunoprecipitated PAMP-related peptides from the spent medium of cardiac fibroblasts using an antibody raised against C-terminal PAMP (Figure 4B). A dominant peak appeared on
the MALDI spectra, consistent with the theoretical mass of 23-aa PAMP being 2763.468 Da. The peak was accompanied by a minor peak corresponding to 22-aa PAMP. We thus concluded that the major form of rat PAMP is a 23-aa C-terminally amidated peptide. The signal peptide-PAMP boundary is supported by SignalP (data not shown). The current UniProt entry has not incorporated the prediction, probably influenced by the peptide name “proadrenomedullin N-20 terminal peptide”, which was originally isolated in humans and so named.25 EDN1 is another bioactive peptide precursor that was not described in the cardiac fibroblast secretome study.27 A total of 20 sequences attributed to the precursor were mapped, highlighting known processing sites for this precursor (Supplementary Figure 1). Notably, big endothelin-1 was sequenced, but intact endothelin-1 was not found: MS full scans showed no m/z values equivalent to the vasoconstrictive hormone. The absence of the peptide signal could be explained by the peptide’s low ionization efficiency, as it has only a single basic residue for a 21-aa peptide. Regarding ADM2, our identification of 22 sequences starting with Gly-26 is consistent with the predicted signal peptide being 25 aa long (Supplementary Figure 1). Furthermore, the C-terminal cleavage between Val-56 and Val-57 was consistently noted, with no truncated or extended forms identified. This cleavage seemed specific but atypical, considering that major processing sites of peptide hormone precursors fit the consensus motif shared by furin-like proteases, as in ADML and EDN1 peptides.24
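
Whether a cleavage site fits a furin-type motif can be tested with a simple pattern scan, as sketched below. The pattern R-X-(K/R)-R is the commonly cited furin consensus. Treating it as the only acceptable motif is a simplification made for this example, and the function names and toy sequence are hypothetical.

```python
import re

# Commonly cited furin-type consensus R-X-(K/R)-R; using it as the sole
# pattern is a simplification for this sketch. The lookahead allows
# overlapping motifs to be reported.
FURIN_LIKE = re.compile(r"(?=(R.[KR]R))")


def furin_like_sites(protein):
    """Return 1-based positions of the residue after which cleavage would occur."""
    return [m.start() + 4 for m in FURIN_LIKE.finditer(protein)]


def matches_consensus(protein, cleavage_after):
    """True if the four residues ending at cleavage_after fit R-X-(K/R)-R."""
    return cleavage_after in furin_like_sites(protein)


if __name__ == "__main__":
    toy = "AAARSKRGGVVAA"  # invented sequence, not a real precursor
    print(furin_like_sites(toy))
    print(matches_consensus(toy, 7))    # cleavage after the R of R-S-K-R
    print(matches_consensus(toy, 10))   # cleavage between two valines
```

In the toy sequence, a cleavage between two valines fails the test. This mirrors the reasoning by which the Val-56/Val-57 cleavage in ADM2 is described as atypical.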

Membrane Stubs
Membrane-anchored stubs of membrane proteins are created as a result of sequential cleavages in the extracellular juxtamembrane region and the intramembrane region. In the present study, membrane stubs were identified for the type I membrane proteins amyloid beta A4 protein (A4), neurogenic locus notch homolog protein 2 (NOTC2), NOTC3, syndecan-1 (SDC1), SDC2, macrophage colony-stimulating factor 1 (CSF1), cysteine-rich motor neuron 1 protein (CRIM1), vasorin (VASN), matrix-remodeling-associated protein 8 (MXRA8), and FXYD6, and for the type II membrane protein XYLT1 (Figure 6). Ectodomain shedding and intramembrane proteolysis have been most extensively studied for A4, NOTC2 and NOTC3.28,29 The C-terminus of Abeta40, Val-711, and the cleavage between Phe-690 and Phe-691, the well-characterized cleavage site of the beta2-secretase,28 were demarcated by the identified sequences (Supplementary Figure 1). This regulated proteolysis has also been described for the Notch family members. In UniProt, the Notch2 ectodomain is predicted to be cleaved between Ser-1665 and Val-1666. However, our data indicate that the actual cleavage site is three aa upstream of the annotated site. Intramembrane proteolysis is predicted to occur C-terminal to Gly-1696, which lies in the putative transmembrane domain encompassing residues 1678 to 1698. Our peptide data suggest that rat NOTC2 is cleaved C-terminal to Leu-1681 or Leu-1682. Similar findings were obtained with NOTC3, another Notch family member, for which all three sequences indicated that ectodomain cleavage occurs C-terminal to Pro-1627, three aa upstream of the annotated site. Our data suggest that intramembrane proteolysis occurs C-terminal to Ala-1651 or Gly-1652.

The cell adhesion molecule SDC1 is predicted to undergo ectodomain shedding, by analogy to mouse SDC1, whose ectodomain cleavage site has been determined using mass spectrometry.30 A cleavage between Lys-253 and Glu-254 is annotated in SDC1_RAT. Contrary to this annotation, all ten sequences corresponded to SDC1[246-276] (Supplementary Figure 1). The N-terminus at Ser-246 coincides with the experimentally validated mouse SDC1 cleavage site.31 Taken together, we concluded that the sites of both ectodomain shedding and intramembrane proteolysis are defined by the SDC1 membrane stub. As with SDC1, mouse SDC2 is known for its ectodomain shedding, but no specific cleavage site has been revealed.32 In UniProt, rat SDC2 ectodomain cleavage is annotated to occur C-terminal to Arg-142, a potential cleavage site for furin-like enzymes.24 However, our rat data suggest that it occurs C-terminal to Lys-134 or, less frequently, C-terminal to Asp-129, indicating that the major ectodomain cleavage site is more distal to the cell membrane.

The membrane-anchored cytokine CSF1 has been known for its ectodomain shedding, but the exact cleavage site has not been unequivocally determined.33 Our data show that the major ectodomain cleavage sites are C-terminal to Arg-442 and Arg-468 (Supplementary Figure 1). Both of them fit the consensus motif for furin-like proprotein convertases. The repeatedly identified CSF1[469-511] and CSF1[469-515] were considered major membrane stubs. It should be noted that CSF1[469-511] emerged at 193.21 min as a significant peak in the chromatogram (Figure 1 and Table 1). Ectodomain shedding has also been described for human CRIM1, but the cleavage site remains unidentified.34 According to TMHMM (http://www.cbs.dtu.dk/services/TMHMM/), a transmembrane domain prediction program, the transmembrane region of rat CRIM1 (gi|281182663) spans Ser-940 to Asn-962. All 14 sequence reads from three replicates started with Leu-920 and ended with either Ile-949 or Ile-950. The N-terminal Leu-920 is in the juxtamembrane region, located downstream of the structured sixth von Willebrand factor type C domain.

Membrane stubs were also identified for VASN, MXRA8, FXYD6, and XYLT1, but to our knowledge none of them has been described to undergo ectodomain shedding. With regard to VASN, an inhibitor of TGF-beta signaling,35 the boundary between Pro-566 and Val-567 was highlighted (Supplementary Figure 1). The type I membrane protein MXRA8 is implicated in maintaining the blood brain barrier,36 but has not been biochemically studied. All 24 sequences strongly support that the site C-terminal to Ser-324 is a critical ectodomain cleavage site and that the cleavages C-terminal to Thr-347 or Leu-348 represent intramembrane proteolysis sites. A couple of short sequences similar to these MXRA8 and VASN cleavage sites are annotated in MEROPS (http://merops.sanger.ac.uk/) as MMP2 target sites. Considering the role of MMP2 as a sheddase, these proteins might be potential substrates.2 TMHMM suggests that XYLT1 is a type II membrane protein with a transmembrane domain spanning Arg-13 to Phe-35. The 18 identified sequences all ended with Glu-68. The identification of XYLT1[30-68], the N-terminal part of which is considered to be embedded in the membrane, suggests that this protein undergoes regulated intramembrane proteolysis. This stub is C-terminally flanked by a pair of arginine residues, suggesting a furin-mediated cleavage as demonstrated for the ITM2B-derived Bri23 peptide.37
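The reasoning used for the membrane stubs, namely whether each end of an identified peptide falls inside or outside the predicted transmembrane span, can be sketched as a simple interval comparison. This is only an illustration: the `Span` class and `describe_stub` function are hypothetical names, and the coordinates are the TMHMM spans and peptide boundaries quoted above for CRIM1 and XYLT1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    start: int  # first residue, 1-based precursor numbering
    end: int    # last residue, inclusive

    def contains(self, pos: int) -> bool:
        return self.start <= pos <= self.end


def describe_stub(peptide: Span, tm: Span) -> dict:
    """Locate both peptide termini relative to a predicted transmembrane span."""
    overlap_start = max(peptide.start, tm.start)
    overlap_end = min(peptide.end, tm.end)
    return {
        "n_term_in_tm": tm.contains(peptide.start),
        "c_term_in_tm": tm.contains(peptide.end),
        "tm_residues_covered": max(0, overlap_end - overlap_start + 1),
    }


# CRIM1 (type I): reads span Leu-920 to Ile-949; predicted TM is 940-962.
crim1 = describe_stub(Span(920, 949), Span(940, 962))
# XYLT1 (type II): XYLT1[30-68]; predicted TM is 13-35.
xylt1 = describe_stub(Span(30, 68), Span(13, 35))
print("CRIM1", crim1)
print("XYLT1", xylt1)
```

For the type I protein CRIM1 the N-terminus lies outside and the C-terminus inside the predicted span, while the orientation is reversed for the type II protein XYLT1, which matches the interpretation of these peptides as membrane stubs.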

Other Extracellular Cleavages
Calsyntenin-1 (CSTN1) undergoes ectodomain shedding to liberate soluble alc-alpha in humans.38,39 Because of sequence homology, rat CSTN1 is predicted to be cleaved between Met-796 and Ala-797. However, this site may not be correctly annotated (Supplementary Figure 1). In our study, 34 of the 55 identified reads were attributed to CSTN1[796-824], which appeared as a peak in the base peak chromatogram (Figure 1). The cleavage between His-795 and Met-796 was therefore highlighted, consistent with human CSTN1 being cleaved between His-824 and Met-825 by ADA10 or ADA17.39 Since the residues from position -4 to +4 are identical between human and rat, the observed site is likely to be cleaved by these sheddases. At the other end of the peptide, nearly all the sequences (54 of 55) highlighted the cleavage between Phe-824 and Ala-825, which, to our knowledge, has not been previously described. Overall, the two distinct sites may point to a regulatory mechanism for CSTN1 (Figure 5).
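Statements such as "34 of the 55 reads" or "54 of 55" come from counting how often each peptide boundary recurs among the reads mapped to one precursor. A minimal sketch of that tally follows; the function names are hypothetical and the reads are toy values for an imaginary precursor, not data from this study.

```python
from collections import Counter


def tally_boundaries(reads):
    """Count peptide start and end positions over (start, end) reads."""
    starts = Counter(start for start, _ in reads)
    ends = Counter(end for _, end in reads)
    return starts, ends


def dominant(counter):
    """Return (position, fraction of reads) for the most frequent boundary."""
    position, count = counter.most_common(1)[0]
    return position, count / sum(counter.values())


# Toy reads for a hypothetical precursor (illustration only).
toy_reads = [(101, 130)] * 6 + [(103, 130)] * 3 + [(101, 127)] * 1
starts, ends = tally_boundaries(toy_reads)

start_pos, start_frac = dominant(starts)
end_pos, end_frac = dominant(ends)
# A start at residue p implies cleavage between p-1 and p;
# an end at residue q implies cleavage between q and q+1.
print(f"N-terminal cleavage {start_pos - 1}/{start_pos}: {start_frac:.0%} of reads")
print(f"C-terminal cleavage {end_pos}/{end_pos + 1}: {end_frac:.0%} of reads")
```

A boundary shared by nearly all reads, as at the C-terminal end of the toy example, is the kind of signal that singles out one specific cleavage site rather than progressive trimming.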


The type II membrane proteins ITM2B and ITM2C inhibit A4 processing by preventing its access to alpha and beta secretases in humans.40 As for ITM2B, the major peptide form was a 23-aa C-terminal peptide that is N-terminally cleaved at a dibasic furin consensus site.37 ITM2C has a similar consensus site, releasing a 25-aa C-terminal peptide (Figure 5). Nuclease-sensitive element-binding protein 1 (YBOX1) has no apparent signal peptide but is secreted and acts as an extracellular mitogen in humans.41 In the present study, a C-terminal 41-aa peptide, YBOX1[282-322], was consistently obtained. The peptide is N-terminally flanked by 277RRYRR281, which is probably recognized by furin-like proteases. The occurrence of four consecutive arginines (286RRRR289) in the peptide is a typical case in which endogenous peptides need to be analyzed in their intact form.

Furthermore, we reproducibly found two distinctive peptides derived from putative type I membrane proteins with no apparent signal peptide (Supplementary Figure 1F). One was the N-terminal 19-aa peptide from small integral membrane protein 3 (SMIM3). TMHMM predicts that its N-terminal 19 aa are located outside the cell membrane, followed by a transmembrane region. The presence of this peptide was consistent with the predicted protein topology. Likewise, an N-terminal 19-aa peptide lacking the initial Met was attributed to transmembrane gamma-carboxyglutamic acid protein 1 (TMG1). The rat precursor is found only in NCBI (gi|564399277), with no domain annotation. Despite the lack of experimental evidence, human TMG1[1-20] is predicted to be the propeptide region in UniProt. Since the N-terminal 20-aa sequence is identical between rat and human except for residue 14, the flanking residue Ala-21 would be the authentic N-terminus of the protein.

Post-translationally Modified Sites


In the present study, analytes were not enriched for phosphopeptides, but previously undescribed phosphorylation sites were identified for ADML and TMG4 (Supplementary Figure 3). ADML[150-185], one of the major processing products, occurred in both non-phosphorylated and phosphorylated forms (Figure 3A). With Mascot modification site analysis, the probability of Ser-163 being phosphorylated was 84% and that of Ser-162 was 13% at best. Because of limited fragmentation in a 36-aa peptide with seven Ser and two Thr residues, CID did not allow a definite conclusion. In contrast, a series of fragment ions was produced by ETD, leading to a higher probability value of 97.6% for Ser-163 (Supplementary Figure 3). The prediction software NetPhosK (http://www.cbs.dtu.dk/services/NetPhos/) reported Ser-163 but not Ser-162 as a possible phosphorylation site for DNA-dependent protein kinase. The single-pass type I membrane protein TMG4 was found to be phosphorylated at Ser-38, a site that has not previously been annotated and is not predicted by NetPhosK. This modification was demonstrated by TMG4[24-51]. Further study will be required to characterize the processing of this precursor and the significance of this phosphorylation. In addition, we found that Lys-22 of PGS1 was hydroxylated (Supplementary Figure 3). Lysine hydroxylation was also noted at Lys-474 of CSF1, as demonstrated by MS/MS of CSF1[443-466]. The MS/MS spectra excluded the possibility of methionine oxidation or proline hydroxylation.
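Why CID struggled to separate Ser-162 from Ser-163 can be illustrated by listing the fragment ions that differ in mass between the two candidate sites. The sketch below is an assumption-laden illustration rather than the Mascot scoring procedure: the function names are hypothetical, and the peptide sequence is the 36-aa ADML[150-185] entry listed in Table 1.

```python
def candidate_sites(sequence, first_residue, residues="ST"):
    """Return precursor positions of residues that could carry a phosphate."""
    return [first_residue + i for i, aa in enumerate(sequence) if aa in residues]


def site_determining_ions(n_residues, first_residue, site_a, site_b):
    """Indices of N-terminal (b/c) and C-terminal (y/z) ions that contain
    exactly one of the two candidate sites and can therefore tell them apart."""
    lo, hi = sorted((site_a - first_residue + 1, site_b - first_residue + 1))
    n_terminal = list(range(lo, hi))                    # b_i / c_i with lo <= i < hi
    c_terminal = sorted(n_residues - i for i in n_terminal)  # complementary y / z
    return n_terminal, c_terminal


# ADML[150-185] as listed in Table 1.
peptide = "SLPEVLRARTVESSQEQTHSAPASPAHQDISRVSRL"
sites = candidate_sites(peptide, first_residue=150)
n_ions, c_ions = site_determining_ions(len(peptide), 150, 162, 163)
print("candidate Ser/Thr sites:", sites)
print("site-determining N-terminal ions:", n_ions)
print("site-determining C-terminal ions:", c_ions)
```

Because the two serines are adjacent, only the single backbone cleavage between them (the 13th N-terminal ion and its 23-residue C-terminal complement) discriminates the sites, so a spectrum lacking that fragment pair leaves the assignment open.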

Disulfide-Containing Peptides
In peptide MS/MS, the sequence enclosed by a disulfide loop resists conventional CID fragmentation. Reductive alkylation of cysteine residues was omitted in three of the four biological replicates to preserve the natural configuration of disulfide linkages. The introduction of Cys-H as a variable modification in the Mascot MS/MS search parameters proved effective for some, if not all, of the previously described disulfide bond-containing peptides. Notable examples are the cardiac regulatory peptides adrenomedullin and big endothelin-1 (Supplementary Figure 4). Significant Mascot scores were obtained with these peptides because the extensions outside of the paired cysteines underwent fragmentation to yield a series of product ions. Successful identification thus depends on the length and composition of the outside residues that are not restricted by disulfide bonds. Such peptides are summarized in Supplementary Table 3. The peptides derived from the ECM proteins ELN and MGP are among the examples in which specific disulfide bonds are inferred by similarity to orthologs in other species. According to the PROSITE ProRule annotation in UniProt, Cys-73 and Cys-79 of MGP are disulfide-linked. This annotation was supported by the identification of five different sequences including the two cysteines. As in MGP, the two cysteines located at the C-terminal end of ELN have been predicted to form a disulfide bond, which was confirmed by the identification of a series of N-terminally truncated peptides with this disulfide bond.

In addition, previously undescribed disulfide bonds were elucidated. A notable example came from FINC (Supplementary Table 3). Although no reference is given, UniProt entry FINC_RAT annotates Cys-2458 in one polypeptide chain as forming an interchain disulfide bond with Cys-2462 in another chain, and vice versa. However, our data on FINC-derived endogenous peptides strongly suggest that the two cysteines form an intramolecular disulfide linkage. This was clearly demonstrated by the identification of FINC[2447-2470], RYHQRTNTNVNCPIECFMPLDVQA (Supplementary Figure 4). The peak corresponding to the b18 ion dominated the MS/MS spectrum, while the adjacent b19 ion was absent. This observation is explained by the preferential fragmentation N-terminal to proline and the frequent lack of fragmentation C-terminal to that residue. The proline effect was also noted in two C-terminally extended forms of this circular peptide (Supplementary Table 3). Both the N-terminal and C-terminal extensions were long enough to yield a set of consecutive product ions for identification.

In endogenous peptide profiling, peptide sequences are identified in ladders of peptide fragments caused by N-terminal or C-terminal truncation. This is inevitable; even a harvest after a short incubation time of two minutes yielded peptide sequence ladders.6 Since regular cells require longer contact with the medium for profiling experiments, most secretome studies have adopted a 12 to 24 hour incubation time and have often used a mixture of protease inhibitors.42,43 In the present study, however, we harvested the medium after a four-hour incubation without adding exogenous inhibitors to capture a natural peptide landscape. In addition, the improved strategy for identifying relatively large endogenous peptides (> 3000 Da) may benefit peptidomics studies of proteolytic processing.44,45
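Returning to the FINC[2447-2470] spectrum described above, the interplay between the disulfide loop and the proline effect can be laid out position by position. The sketch below is illustrative only: `b_ion_map` is a hypothetical helper, and it encodes just two qualitative rules taken from the text (bonds inside the loop do not release fragments on their own; cleavage is favoured N-terminal and disfavoured C-terminal to proline), not a fragmentation model.

```python
def b_ion_map(sequence, cys_a, cys_b):
    """Annotate each b-ion of a peptide carrying one disulfide loop.

    cys_a and cys_b are 1-based positions of the two linked cysteines.
    b_i arises from cleavage of the bond between residues i and i+1.
    """
    lo, hi = sorted((cys_a, cys_b))
    rows = []
    for i in range(1, len(sequence)):
        rows.append({
            "ion": f"b{i}",
            # A bond between the two cysteines lies inside the loop: cleaving
            # it alone leaves both halves joined through the disulfide.
            "inside_loop": lo <= i < hi,
            "n_terminal_to_pro": sequence[i] == "P",      # residue i+1 is Pro
            "c_terminal_to_pro": sequence[i - 1] == "P",  # residue i is Pro
        })
    return rows


finc = "RYHQRTNTNVNCPIECFMPLDVQA"  # FINC[2447-2470]
first = 2447
rows = b_ion_map(finc, 2458 - first + 1, 2462 - first + 1)

favoured = [r["ion"] for r in rows
            if r["n_terminal_to_pro"] and not r["inside_loop"]]
disfavoured = [r["ion"] for r in rows
               if r["c_terminal_to_pro"] and not r["inside_loop"]]
print("favoured outside the loop:", favoured)
print("disfavoured outside the loop:", disfavoured)
```

With Cys-2458 and Cys-2462 at peptide positions 12 and 16, the only proline-directed cleavage outside the loop yields b18 and the only cleavage C-terminal to a proline outside the loop would yield b19, in line with the dominant b18 and missing b19 reported for this peptide.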

CONCLUSION
We demonstrated for the first time that endogenous peptide profiling of regular cell types helps to identify biologically relevant cleavage sites associated with limited proteolysis. Sequence reads serve to highlight cleavage sites for limited proteolysis of secretory proteins and membrane proteins in an unbiased manner. In-depth analysis will further uncover relevant cleavage sites for previously uncharacterized proteins. Our peptidomics study has the potential to address an important question in proteomics that is beyond the reach of current bottom-up proteomics or degradomics.


Supporting Information
Figure S1, Peptide alignment maps. Figure S2, Processing enzymes represented by part of propeptide regions. Figure S3, MS/MS spectra of post-translationally modified peptides shown in the text. Figure S4, MS/MS spectra of disulfide-containing peptides from ADML, EDN1 and FINC peptides analyzed in intact form. Table S1, 3916 redundant sequences identified by LC-MS/MS. Table S2, Precursor sequence entries listed in Table S1. Table S3, Disulfide-containing peptides identified in their intact form. This material is available free of charge via the Internet at http://pubs.acs.org.

AUTHOR INFORMATION
Corresponding Author
*Kazuki Sasaki, Department of Molecular Pharmacology, National Cerebral and Cardiovascular Research Center, 5-7-1 Fujishirodai, Suita, Osaka 565-8565, Japan. Phone: +81 6 6833 5012 ext. 2600. [email protected]
Present Address
†Department of Molecular Patho-Biochemistry and Patho-Biology, Yamagata University School of Medicine, 2-2-2 Iida-Nishi, Yamagata 990-9585, Japan

ACKNOWLEDGMENT
We thank Masako Matsubara (National Cerebral and Cardiovascular Center) for creating peptide maps. This study was supported in part by JSPS KAKENHI Grant Number 25461402, and the Advanced Research for Medical Products Mining Program of the National Institute of Biomedical Innovation (NIBIO), Japan.


ABBREVIATIONS
aa, amino acid(s); ACN, acetonitrile; CID, collision-induced dissociation; ECM, extracellular matrix; ETD, electron transfer dissociation; MALDI, matrix-assisted laser desorption ionization; MS/MS, tandem mass spectrometry. Protein name abbreviations are based on UniProt entries.

REFERENCES
1. Neurath, H. Proteolytic processing and physiological regulation. Trends Biochem. Sci. 1989, 14, 268-271.
2. Page-McCaw, A.; Ewald, A.J.; Werb, Z. Matrix metalloproteinases and the regulation of tissue remodelling. Nat. Rev. Mol. Cell. Biol. 2007, 8, 221-233.
3. Lal, M.; Caplan, M. Regulated intramembrane proteolysis: signaling pathways and biological functions. Physiology (Bethesda) 2011, 26, 34-44.
4. Rogers, L.D.; Overall, C.M. Proteolytic post-translational modification of proteins: proteomic tools and methodology. Mol. Cell. Proteomics 2013, 12, 3532-3542.
5. Impens, F.; Colaert, N.; Helsens, K.; Plasman, K.; Van Damme, P.; Vandekerckhove, J.; Gevaert, K. MS-driven protease substrate degradomics. Proteomics 2010, 6, 1284-1296.
6. Sasaki, K.; Satomi, Y.; Takao, T.; Minamino, N. Snapshot peptidomics of the regulated secretory pathway. Mol. Cell. Proteomics 2009, 8, 1638-1647.
7. Yamaguchi, H.; Sasaki, K.; Satomi, Y.; Shimbara, T.; Kageyama, H.; Mondal, M.S.; Toshinai, K.; Date, Y.; González, L.J.; Shioda, S.; Takao, T.; Nakazato, M.; Minamino, N. Peptidomic identification and biological validation of neuroendocrine regulatory peptide-1 and -2. J. Biol. Chem. 2007, 282, 26354-26360.
8. Sasaki, K.; Takahashi, N.; Satoh, M.; Yamasaki, M.; Minamino, N. A peptidomics strategy for discovering endogenous bioactive peptides. J. Proteome Res. 2010, 9, 5047-5052.
9. Osaki, T.; Sasaki, K.; Minamino, N. Peptidomics-based discovery of an antimicrobial peptide derived from insulin-like growth factor-binding protein 5. J. Proteome Res. 2011, 10, 1870-1880.
10. Sasaki, K.; Osaki, T.; Minamino, N. Large-scale identification of endogenous secretory peptides using electron transfer dissociation mass spectrometry. Mol. Cell. Proteomics 2013, 12, 700-709.
11. Suga, S.; Nakao, K.; Itoh, H.; Komatsu, Y.; Ogawa, Y.; Hama, N.; Imura, H. Endothelial production of C-type natriuretic peptide and its marked augmentation by transforming growth factor-beta. Possible existence of "vascular natriuretic peptide system". J. Clin. Invest. 1992, 90, 1145-1149.
12. Yanagisawa, M.; Kurihara, H.; Kimura, S.; Tomobe, Y.; Kobayashi, M.; Mitsui, Y.; Yazaki, Y.; Goto, K.; Masaki, T. A novel potent vasoconstrictor peptide produced by vascular endothelial cells. Nature 1988, 332, 411-415.
13. Yin, P.; Knolhoff, A.M.; Rosenberg, H.J.; Millet, L.J.; Gillette, M.U.; Sweedler, J.V. Peptidomic analyses of mouse astrocytic cell lines and rat primary cultured astrocytes. J. Proteome Res. 2012, 11, 3965-3973.
14. Fricker, L.D.; Gelman, J.S.; Castro, L.M.; Gozzo, F.C.; Ferro, E.S. Peptidomic analysis of HEK293T cells: effect of the proteasome inhibitor epoxomicin on intracellular peptides. J. Proteome Res. 2012, 11, 1981-1990.
15. Tokudome, T.; Horio, T.; Fukunaga, M.; Okumura, H.; Hino, J.; Mori, K.; Yoshihara, F.; Suga, S.; Kawano, Y.; Kohno, M.; Kangawa, K. Ventricular nonmyocytes inhibit doxorubicin-induced myocyte apoptosis: involvement of endogenous endothelin-1 as a paracrine factor. Endocrinology 2004, 145, 2458-2466.
16. Shigyo, M.; Kuboyama, T.; Sawai, Y.; Tada-Umezaki, M.; Tohda, C. Extracellular vimentin interacts with insulin-like growth factor 1 receptor to promote axonal growth. Sci. Rep. 2015, 5, 12055.
17. Boujrad, N.; Ogwuegbu, S.O.; Garnier, M.; Lee, C.H.; Martin, B.M.; Papadopoulos, V. Identification of a stimulator of steroid hormone synthesis isolated from testis. Science 1995, 268, 1609-1612.
18. Folz, R.J.; Nothwehr, S.F.; Gordon, J.I. Substrate specificity of eukaryotic signal peptidase. Site-saturation mutagenesis at position -1 regulates cleavage between multiple sites in human pre (delta pro) apolipoprotein A-II. J. Biol. Chem. 1988, 263, 2070-2078.
19. von Heijne, G. Patterns of amino acids near signal-sequence cleavage sites. Eur. J. Biochem. 1983, 133, 17-21.
20. Scott, I.C.; Imamura, Y.; Pappano, W.N.; Troedel, J.M.; Recklies, A.D.; Roughley, P.J.; Greenspan, D.S. Bone morphogenetic protein-1 processes probiglycan. J. Biol. Chem. 2000, 275, 30504-30511.
21. Arribas, J.; Borroto, A. Protein ectodomain shedding. Chem. Rev. 2002, 102, 4627-4638.
22. Sato, H.; Takino, T.; Okada, Y.; Cao, J.; Shinagawa, A.; Yamamoto, E.; Seiki, M. A matrix metalloproteinase expressed on the surface of invasive tumour cells. Nature 1994, 370, 61-65.
23. Camelliti, P.; Borg, T.K.; Kohl, P. Structural and functional characterisation of cardiac fibroblasts. Cardiovasc. Res. 2005, 65, 40-51.
24. Nakayama, K. Furin: a mammalian subtilisin/Kex2p-like endoprotease involved in processing of a wide variety of precursor proteins. Biochem. J. 1997, 327, 625-635.
25. Kitamura, K.; Sakata, J.; Kangawa, K.; Kojima, M.; Matsuo, H.; Eto, T. Cloning and characterization of cDNA encoding a precursor for human adrenomedullin. Biochem. Biophys. Res. Commun. 1993, 194, 720-725.
26. Sakata, J.; Shimokubo, T.; Kitamura, K.; Nakamura, S.; Kangawa, K.; Matsuo, H.; Eto, T. Molecular cloning and biological activities of rat adrenomedullin, a hypotensive peptide. Biochem. Biophys. Res. Commun. 1993, 195, 921-927.
27. Abonnenc, M.; Nabeebaccus, A.A.; Mayr, U.; Barallobre-Barreiro, J.; Dong, X.; Cuello, F.; Sur, S.; Drozdov, I.; Langley, S.R.; Lu, R.; Stathopoulou, K.; Didangelos, A.; Yin, X.; Zimmermann, W.H.; Shah, A.M.; Zampetaki, A.; Mayr, M. Extracellular matrix secretion by cardiac fibroblasts: role of microRNA-29b and microRNA-30c. Circ. Res. 2013, 113, 1138-1147.
28. Gu, Y.; Misonou, H.; Sato, T.; Dohmae, N.; Takio, K.; Ihara, Y. Distinct intramembrane cleavage of the beta-amyloid precursor protein family resembling gamma-secretase-like cleavage of Notch. J. Biol. Chem. 2001, 276, 35235-35238.
29. Xia, W.; Wolfe, M.S. Intramembrane proteolysis by presenilin and presenilin-like proteases. J. Cell Sci. 2003, 116, 2839-2844.
30. Fitzgerald, M.L.; Wang, Z.; Park, P.W.; Murphy, G.; Bernfield, M. Shedding of syndecan-1 and -4 ectodomains is regulated by multiple signaling pathways and mediated by a TIMP-3-sensitive metalloproteinase. J. Cell. Biol. 2000, 148, 811-824.
31. Wang, Z.; Gotte, M.; Bernfield, M.; Reizes, O. Constitutive and accelerated shedding of murine syndecan-1 is mediated by cleavage of its core protein at a specific juxtamembrane site. Biochemistry 2005, 44, 12355-12361.
32. Fears, C.Y.; Gladson, C.L.; Woods, A. Syndecan-2 is expressed in the microvasculature of gliomas and regulates angiogenic processes in microvascular endothelial cells. J. Biol. Chem. 2006, 281, 14533-14536.
33. Horiuchi, K.; Miyamoto, T.; Takaishi, H.; Hakozaki, A.; Kosaki, N.; Miyauchi, Y.; Furukawa, M.; Takito, J.; Kaneko, H.; Matsuzaki, K.; Morioka, H.; Blobel, C.P.; Toyama, Y. Cell surface colony-stimulating factor 1 can be cleaved by TNF-alpha converting enzyme or endocytosed in a clathrin-dependent manner. J. Immunol. 2007, 179, 6715-6724.
34. Wilkinson, L.; Kolle, G.; Wen, D.; Piper, M.; Scott, J.; Little, M. CRIM1 regulates the rate of processing and delivery of bone morphogenetic proteins to the cell surface. J. Biol. Chem. 2003, 278, 34181-34188.
35. Ikeda, Y.; Imai, Y.; Kumagai, H.; Nosaka, T.; Morikawa, Y.; Hisaoka, T.; Manabe, I.; Maemura, K.; Nakaoka, T.; Imamura, T.; Miyazono, K.; Komuro, I.; Nagai, R.; Kitamura, T. Vasorin, a transforming growth factor beta-binding protein expressed in vascular smooth muscle cells, modulates the arterial response to injury in vivo. Proc. Natl. Acad. Sci. U S A. 2004, 101, 10732-10737.
36. Yonezawa, T.; Ohtsuka, A.; Yoshitaka, T.; Hirano, S.; Nomoto, H.; Yamamoto, K.; Ninomiya, Y. Limitrin, a novel immunoglobulin superfamily protein localized to glia limitans formed by astrocyte endfeet. Glia 2003, 44, 190-204.
37. Kim, S.H.; Wang, R.; Gordon, D.J.; Bass, J.; Steiner, D.F.; Lynn, D.G.; Thinakaran, G.; Meredith, S.C.; Sisodia, S.S. Furin mediates enhanced production of fibrillogenic ABri peptides in familial British dementia. Nat. Neurosci. 1999, 2, 984-988.
38. Araki, Y.; Miyagi, N.; Kato, N.; Yoshida, T.; Wada, S.; Nishimura, M.; Komano, H.; Yamamoto, T.; De Strooper, B.; Yamamoto, K.; Suzuki, T. Coordinated metabolism of Alcadein and amyloid beta-protein precursor regulates FE65-dependent gene transactivation. J. Biol. Chem. 2004, 279, 24343-24354.
39. Maruta, C.; Saito, Y.; Hata, S.; Gotoh, N.; Suzuki, T.; Yamamoto, T. Constitutive cleavage of the single-pass transmembrane protein alcadein α prevents aberrant peripheral retention of Kinesin-1. PLoS One 2012, 7, e43058.
40. Matsuda, S.; Giliberto, L.; Matsuda, Y.; Davies, P.; McGowan, E.; Pickford, F.; Ghiso, J.; Frangione, B.; D'Adamio, L. The familial dementia BRI2 gene binds the Alzheimer gene amyloid-beta precursor protein and inhibits amyloid-beta production. J. Biol. Chem. 2005, 280, 28912-28916.
41. Frye, B.C.; Halfter, S.; Djudjaj, S.; Muehlenberg, P.; Weber, S.; Raffetseder, U.; En-Nia, A.; Knott, H.; Baron, J.M.; Dooley, S.; Bernhagen, J.; Mertens, P.R. Y-box protein-1 is actively secreted through a non-classical pathway and acts as an extracellular mitogen. EMBO Rep. 2009, 10, 783-789.
42. Brown, K.J.; Formolo, C.A.; Seol, H.; Marathi, R.L.; Duguez, S.; An, E.; Pillai, D.; Nazarian, J.; Rood, B.R.; Hathout, Y. Advances in the proteomic investigation of the cell secretome. Expert Rev. Proteomics 2012, 9, 337-345.
43. Greening, D.W.; Kapp, E.A.; Ji, H.; Speed, T.P.; Simpson, R.J. Colon tumour secretopeptidome: insights into endogenous proteolytic cleavage events in the colon tumour microenvironment. Biochim. Biophys. Acta 2013, 1834, 2396-2407.
44. Lyons, P.J.; Fricker, L.D. Peptidomic approaches to study proteolytic activity. Curr. Protoc. Protein Sci. 2011, Chapter 18:Unit18.13.
45. Kim, Y.G.; Lone, A.M.; Nolte, W.M.; Saghatelian, A. Peptidomics approach to elucidate the proteolytic regulation of bioactive peptides. Proc. Natl. Acad. Sci. U S A. 2012, 109, 8523-8527.


Table 1. Sequences of peaks labeled in Figure 1

| Peak | RT (min) | Precursor | Mass | N-term | N-term enz | Sequence | C-term | C-term enz | Modification |
|---|---|---|---|---|---|---|---|---|---|
| A | 66.84 | CO1A2 | 4813.29 | GPQG | | FQGPAGEPGEPGQTGPAGSRGPAGPPGKAGEDGHPGKPGRPGERGVVGPQG | ARGF | | Hydroxylation (2 Pro, 2 Lys) |
| B | 72.32 | CCD80 | 2421.32 | QSSA | | LDSDGRPGRKVPLASPISSRSAR | YLRH | | |
| C | 72.32 | CO1A2 | 2504.28 | LATC | | QSLQMGSVRKGPTGDRGPRGQRGP | AGPR | | Gln->pyro-Glu (N-term Q) |
| D | 78.00 | CO1A2 | 2729.39 | LATC | | QSLQMGSVRKGPTGDRGPRGQRGPAGP | RGRD | | Gln->pyro-Glu (N-term Q) |
| E | 82.71 | CCD80 | 2265.22 | QSSA | | LDSDGRPGRKVPLASPISSRSA | RYLR | | |
| F | 86.09 | CCD80 | 4872.51 | QSSA | | LDSDGRPGRKVPLASPISSRSARYLRHTGRSGGVEKSTQEEPNPQ | SFQR | | |
| G | 88.27 | CO1A2 | 3741.88 | GLAG | | LHGDQGAPGPVGPAGPRGPAGPSGPIGKDGRSGHPGPVGPAG | VRGS | | |
| H | 96.32 | CCD80 | 2853.57 | QSSA | | LDSDGRPGRKVPLASPISSRSARYLR | HTGR | | |
| I | 98.58 | CCD80 | 1776.98 | QSSA | | LDSDGRPGRKVPLASPI | SSRS | | |
| J | 102.45 | TYB4 | 4976.48 | M | | SDKPDMAEIEKFDKSKLKKTETQEKNPLPSKETIEQEKQAGES | - | | Acetyl (N-term); Oxidation (M) |
| K | 105.32 | VIME | 3473.63 | M | | STRSVSSSSYRRMFGGSGTSSRPSSNRSYVTT | STRT | | Acetyl (N-term) |
| L | 111.77 | LTBP2 | 3597.84 | TSHA | | QRDSVGRYEPASRDANRLWRPVGNHPAAAAAKV | YSLF | | Gln->pyro-Glu (N-term Q) |
| M | 114.81 | TYB4 | 4960.49 | M | | SDKPDMAEIEKFDKSKLKKTETQEKNPLPSKETIEQEKQAGES | - | | Acetyl (N-term) |
| N | 118.85 | VIME | 4339.06 | M | | STRSVSSSSYRRMFGGSGTSSRPSSNRSYVTTSTRTYSLG | SALR | | Acetyl (N-term) |
| O | 124.42 | TYB10 | 4933.52 | M | | ADKPDMGEIASFDKAKLKKTETQEKNTLPTKETIEQEKRSEIS | - | | Acetyl (N-term) |
| P | 125.77 | ADML | 4019.00 | RRRR | | SLPEVLRARTVESSQEQTHSAPASPAHQDISRVSRL | - | | Ser phosphorylation |
| Q | 130.13 | ANGL2 | 3086.47 | TVGA | | ATGPEADVEGAEDGSQREYIYLNRYKR | AGES | | |
| R | 137.36 | ATS2 | 2764.43 | LVAA | | TETPGGPPGYGAERILAVPVRTDAQGR | LVSH | | |
| S | 141.88 | CATL1 | 3031.50 | TALA | | TPKFDQTFNAQWHQWKSTHRRLYG | TNEE | | |
| T | 150.11 | CSTN1 | 3252.57 | HANH | ADAM10/17 | MAAQPQFVHPEHRSFVDLSGHNLASPHPF | AVVP | | |
| U | 167.38 | ADML | 5725.70 | RVKR | Furin-like | YRQSMNQGSRSTGCRFGTCTMQKLAHQIYQFTDKDKDGMAPRNKISPQGY | GRRR | | Amidated (C-term); Disulfide (2 Cys) |
| V | 172.89 | ITM2B | 2656.25 | IQKR | Furin-like | EASNCFTIRHFENKFAVETLICS | - | | Disulfide (2 Cys) |
| W | 181.24 | LTBP2 | 5410.75 | TSHA | | QRDSVGRYEPASRDANRLWRPVGNHPAAAAAKVYSLFREPDAPVPGLSPS | EWNQ | | Gln->pyro-Glu (N-term Q) |
| X | 193.21 | CSF1 | 4373.19 | RDRR | Furin-like | SPAELKGGPASEGAARPVARFNSIPLTDTGSSIQDPQTSAFVF | WVLG | | |

Enzyme entries are shown only where the row and column are evident from the text. The table additionally lists MMP9 (one entry) and MMP2 (two entries) among peaks B to G, and further Furin-like entries (one among peaks B and C, four among peaks P to X), whose row and column cannot be determined here.

Table 1 RT, retention time. Mass, theoretical monoisotopic mass (Da). N/C-term, N/C-terminal flanking four aa. Cells shaded in orange indicate that the identified sequence flanks the annotated signal peptide. Conflicts shown in pale brown. N/C-term enz, responsible enzyme annotated in MEROPS. “Furin-like” is based on the consensus motif of this enzyme family.24 Modified residues are shown in bold underlined, but N-term acetylation and C-term amidation are not labeled. See Figure S3 for MS/MS spectra of the hydroxylated CO1A2 peptide and the ADML phosphopeptide.
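The theoretical monoisotopic masses in Table 1 can be recomputed from the listed sequences and modifications. The sketch below is not the software used in the study; it is a plain residue-mass sum with hypothetical names (`RESIDUE`, `DELTA`, `monoisotopic_mass`) and standard monoisotopic values rounded to five decimals. The three sequences are the CO1A2, ADML[150-185] and adrenomedullin entries from Table 1.

```python
# Standard monoisotopic residue masses (Da), rounded to five decimals.
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
# Mass shifts per modification (Da).
DELTA = {
    "amidation": -0.98402,      # C-terminal -OH replaced by -NH2
    "disulfide": -2.01565,      # two hydrogens lost per disulfide bond
    "phospho": 79.96633,        # HPO3 added
    "hydroxylation": 15.99491,  # one oxygen added
}


def monoisotopic_mass(sequence, **mods):
    """Sum residue masses, add water, then apply counted modifications."""
    mass = sum(RESIDUE[aa] for aa in sequence) + WATER
    for name, count in mods.items():
        mass += DELTA[name] * count
    return mass


adrenomedullin = "YRQSMNQGSRSTGCRFGTCTMQKLAHQIYQFTDKDKDGMAPRNKISPQGY"
adml_150_185 = "SLPEVLRARTVESSQEQTHSAPASPAHQDISRVSRL"
co1a2_peak_a = "FQGPAGEPGEPGQTGPAGSRGPAGPPGKAGEDGHPGKPGRPGERGVVGPQG"

m_adm = monoisotopic_mass(adrenomedullin, amidation=1, disulfide=1)
m_phospho = monoisotopic_mass(adml_150_185, phospho=1)
m_hydroxy = monoisotopic_mass(co1a2_peak_a, hydroxylation=4)
print(round(m_adm, 2), round(m_phospho, 2), round(m_hydroxy, 2))
```

Under these assumptions the three values agree with the tabulated masses (5725.70, 4019.00 and 4813.29 Da) to within 0.01 Da, which also shows how amidation, a disulfide bond, phosphorylation and four hydroxylations enter the calculation.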


Table 2. Proteins with a signal peptide represented by N-terminal sequences

| Precursor | Category | Signal peptide annotation | Most N-terminal reads belong to: | Most N-terminal reads/total reads, n |
|---|---|---|---|---|
| ELN | ECM | Matched | Mature protein | 49/800 |
| PGS2 | ECM | Matched | Propeptide* | 6/6 |
| ECM1 | ECM | Matched | Mature protein | 11/11 |
| CO1A2 | ECM | Unmatched | Mature protein | 139/406 |
| FINC | ECM | Unmatched | Mature protein | 14/206 |
| FBN1 | ECM | Unmatched | Mature protein | 17/17 |
| PGS1 | ECM | Unmatched | Propeptide* | 81 (Leu17)/148 |
| | | Unmatched | | 31 (Phe19)/148 |
| | | Matched | | 13 (Glu20)/148 |
| CCD80 | ECM | Matched | Mature protein | 140 (Leu28)/348 |
| | | Unmatched | | 71 (Ser23)/348 |
| | | Unmatched | | 64 (Gln24)/348 |
| MMP2 | Enzyme | Matched | Propeptide | 18/25 |
| ADA17 | Enzyme | Matched | Propeptide | 21/21 |
| ADA9 | Enzyme | Matched | Mature protein | 7/7 |
| PCSK6 | Enzyme | Matched | Propeptide* | 3/6 |
| LYOX | Enzyme | Matched | Propeptide | 15/17 |
| LOXL1 | Enzyme | Matched | Propeptide* | 6/6 |
| NAGPA | Enzyme | Matched | Propeptide* | 4/4 |
| CATL1 | Enzyme | Matched | Propeptide | 21/31 |
| ADA10 | Enzyme | Unmatched | Propeptide | 19/38 |
| ATS2 | Enzyme | Unmatched | Propeptide | 26/29 |
| ATS5 | Enzyme | Unmatched | Propeptide | 9/11 |
| CADH3 | Type I | Matched | Propeptide* | 15/15 |
| FXYD6 | Type I | Matched | Mature protein | 9/11 |
| PXDC2 | Type I | Matched | Mature protein | 17/30 |
| SORT | Type I | Matched | Propeptide* | 28/33 |
| CAD11 | Type I | Matched | Propeptide* | 11/23 |
| SELPL | Type I | Unmatched | Propeptide* | 10/11 |
| TNR6 | Type I | Unmatched | Mature protein | 3/3 |
| ADM2 | Secreted | Matched | Propeptide | 22/23 |
| AUGN | Secreted | Matched | Propeptide* | 8/15 |
| EDN1 | Secreted | Matched | Propeptide* | 5/21 |
| LTBP2 | Secreted | Matched | Mature protein | 59/130 |
| FNDC1 | Secreted | Matched | Mature protein | 8/8 |
| ANGL2 | Secreted | Unmatched | Mature protein | 30 (Ala20)/52 |
| | | Matched | | 19 (Thr21)/52 |
| ADML | Secreted | Unmatched | Bioactive peptide* | 11/139 |

Breakdown by biological replicate (n), values listed per column:

Rep 1: 7 3 26 3 4 15 6 29 12 9 7 1 1 4 11 5

Rep 2: 9 2 9 63 10 5 23 12 9 65 46 36 8 8 5 2 9 1 2 16 14 19 5 1 11 16 7 9 1 12 6 4 35 7 13

Rep 3: 6 2 35 1 1 26 6 7 7 13 2 1 4 2 2 4 3 8 8 5 4 3 1 6 2 1 7 1 6

Rep 4: 27 1 15 1 7 42 13 4 20 7 12 3 6 1 3 1 4 2 2 1 7 1 1 6 6

19 (Thr21)/52 (Rep 1 to Rep 4): 4 5 6 4

11/139 (Rep 1 to Rep 4): 3 4 4 -

Table 2. Precursors are shown in the order they appear in the text. Coloring is the same as defined in Table 1. Most N-terminal reads refer to those considered nearest or adjacent to potential signal peptides; e.g., “81 (Leu17)” for PGS1 indicates that 81 reads started with Leu17. n, number of reads. Minor reads are indicated in a smaller font. Asterisks indicate that intact sequences were obtained. Breakdown of the most N-terminal reads is shown for each biological replicate. Type I, type I membrane protein.
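The read notation used in Table 2 can be handled programmatically. The following is a minimal sketch, not part of the original analysis, that parses entries such as “81 (Leu17)/148” or “49/800” and sums a per-replicate breakdown. The names `ReadEntry`, `parse_read_entry`, and `replicate_sum` are hypothetical. Treating “-” as zero reads is an assumption, as is the pairing of the replicate values 3, 4, 4, - with the ADML entry 11/139.

```python
import re
from typing import NamedTuple, Optional, Sequence


class ReadEntry(NamedTuple):
    reads: int               # reads starting at the most N-terminal position
    residue: Optional[str]   # three-letter code of the first residue, if given
    position: Optional[int]  # residue number in the precursor, if given
    total: int               # total reads assigned to the precursor


# Matches "49/800" as well as "81 (Leu17)/148".
ENTRY_RE = re.compile(r"^(\d+)(?:\s*\(([A-Z][a-z]{2})(\d+)\))?\s*/\s*(\d+)$")


def parse_read_entry(text: str) -> ReadEntry:
    """Parse one 'most N-terminal reads/total reads' cell."""
    match = ENTRY_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized entry: {text!r}")
    reads, residue, position, total = match.groups()
    return ReadEntry(
        int(reads),
        residue,
        int(position) if position else None,
        int(total),
    )


def replicate_sum(counts: Sequence[str]) -> int:
    """Sum per-replicate counts, treating '-' as zero (an assumption)."""
    return sum(0 if c.strip() == "-" else int(c) for c in counts)


pgs1 = parse_read_entry("81 (Leu17)/148")
print(pgs1, round(pgs1.reads / pgs1.total, 3))

# Assumed pairing: replicate values 3, 4, 4, - belong to ADML (11/139).
adml = parse_read_entry("11/139")
print(replicate_sum(["3", "4", "4", "-"]) == adml.reads)
```

A check of this kind is useful mainly for confirming that a per-replicate breakdown adds up to the reported number of most N-terminal reads.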

Figure Legends

Figure 1. Base peak chromatogram of the peptide fraction (