Positive Enrichment of C-Terminal Peptides Using Oxazolone


Article

Positive enrichment of C-terminal peptides using oxazolone chemistry and biotinylation Minbo Liu, Caiyun Fang, Xiuwen Pan, Hucong Jiang, Lijuan Zhang, Lei Zhang, Ying Zhang, Pengyuan Yang, and Haojie Lu Anal. Chem., Just Accepted Manuscript • DOI: 10.1021/acs.analchem.5b02437 • Publication Date (Web): 10 Sep 2015 Downloaded from http://pubs.acs.org on September 15, 2015


Analytical Chemistry is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Analytical Chemistry

Positive enrichment of C-terminal peptides using oxazolone chemistry and biotinylation
Minbo Liu,†,§ Caiyun Fang,‡,§ Xiuwen Pan,† Hucong Jiang,‡ Lijuan Zhang,‡ Lei Zhang,† Ying Zhang,† Pengyuan Yang,†,‡ and Haojie Lu*,†,‡

† Shanghai Cancer Center and Institutes of Biomedical Sciences, Fudan University, Shanghai 200032, P. R. China.
‡ Department of Chemistry, Fudan University, Shanghai 200433, P. R. China.

ABSTRACT: Selective capture of protein C-termini remains challenging because carboxyl groups are less reactive than amino groups and because it is difficult to label the C-terminal carboxyl group site-specifically rather than the side-chain carboxyl groups of acidic amino acids. For highly efficient purification of C-terminal peptides, a novel positive enrichment approach based on oxazolone chemistry has been developed in this study. A bifunctional reagent containing biotin and arginine was incorporated into the C-terminus of proteins. Combined with a streptavidin affinity strategy, the C-terminal peptides could be readily purified and analyzed by MS. Unlike negative enrichment approaches, this new method captures C-terminal peptides, rather than non-C-terminal peptides, directly from the peptide mixture. The labeling efficiency (higher than 90%), enrichment selectivity (purifying C-terminal peptides from mixtures with non-C-terminal peptides at a 1:50 molar ratio) and ionization efficiency in MS were dramatically improved. Moreover, highly efficient identification of C-terminal peptides was further achieved by defining biotin as the 21st amino acid and optimizing the database search strategy. We have successfully identified 183 C-terminal peptides from Thermoanaerobacter tengcongensis using this method, which affords a highly selective and efficient purification approach for C-terminomics studies.

The protein C-terminal domain can regulate a protein's structure and often defines its function, playing a key role in many biological processes such as protein recognition and regulation1,2. For example, the C-terminus of firefly luciferase has a conserved tripeptide sequence that functions as a peroxisomal targeting signal3. Different glycosyl-phosphatidylinositol (GPI) anchors share common structural features, including linkage to the carboxyl group of the terminal amino acid via ethanolamine phosphate, for the attachment of a variety of eukaryotic cell surface molecules to the lipid bilayer4,5. Aberrant C-terminal sequences can cause disease6. Furthermore, the protein C-terminus is a highly specific sequence tag that can be used to identify most proteins. It was reported that C-terminal tags of four amino acid residues were unique for between 74% and 97% of proteins, depending on the species studied7. Hence, identifying protein C-termini is of great importance for accurate protein identification and for understanding protein function. However, approaches for efficiently determining the sequence of a protein C-terminus are still limited. As a complement to Edman degradation for the N-terminus, chemical analysis, such as alkylation followed by cleavage with isothiocyanate, can be used for C-terminal protein sequencing8. But the alkylation chemistry has low repetitive yields and can identify only a few (3-5) amino acids from the C-terminus of a protein or lengthy peptide, which leads to low sensitivity of C-terminus detection9. With their vast expansion over the past two decades, mass spectrometry (MS) based methods have revolutionized proteome research10,11. An MS-based ladder-sequencing approach was reported to characterize protein C-terminal sequences using CNBr cleavage coupled with a carboxypeptidase-degradation strategy, which could identify as many as 12 C-terminal amino acids12.
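The tag-uniqueness figure quoted above (four-residue C-terminal tags being unique for 74% to 97% of proteins) can be illustrated with a small calculation. The following Python sketch is not taken from the cited study; the function name `unique_tag_fraction` and the toy sequences are hypothetical and only show how such a fraction could be computed for a given set of protein sequences.

```python
from collections import Counter


def unique_tag_fraction(proteins, tag_len=4):
    """Fraction of proteins whose last `tag_len` residues occur in no other protein."""
    # Collect the C-terminal tag of every protein that is long enough.
    tags = [seq[-tag_len:] for seq in proteins if len(seq) >= tag_len]
    counts = Counter(tags)
    # A tag is unique if exactly one protein in the set carries it.
    unique = sum(1 for tag in tags if counts[tag] == 1)
    return unique / len(tags) if tags else 0.0


# Hypothetical toy sequences: the first two share the tag "AKQR".
toy_proteome = ["MKTAYIAKQR", "MSLLNDAKQR", "MAHHEELGFQG", "MGDVEKATNE"]
print(unique_tag_fraction(toy_proteome))  # prints 0.5
```

Applied to a real proteome, the same function would be run over all sequences parsed from a protein database.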
In addition, dual-isotopic labeling strategies have been widely applied to the recognition and identification of C-terminal peptides. The samples are first labeled with isotopic reagents by enzymatic or chemical labeling. Because C-terminal and non-C-terminal peptides show distinguishable mass spectrometric patterns after isotopic labeling, the C-terminal peptides can be recognized from the mass spectra13-15. Recently, we proposed a novel approach for identification and quantitation of C-terminal peptides using dual-isotopic arginine labeling. An isotopic mixture consisting of 50% arginine and 50% heavy-labeled C6-arginine was reacted with the C-termini of proteins through oxazolone chemistry. The resulting proteins were digested, and the C-terminal peptides could then be directly distinguished from internal peptides by their unique isotopic paired peaks15. Although these methods can distinguish C-terminal peptides and have contributed greatly to our understanding, they tend to be cumbersome and inefficient because of the overwhelming number of non-C-terminal peptides and the complexity of the mass spectra. As a result, these approaches have mainly been applied to simple systems, and there is an urgent need to develop isolation and enrichment strategies for C-terminal peptides. Compared with N-terminal peptide enrichment approaches, methods for selective capture of C-terminal peptides are relatively lacking for the following reasons. First, the reactivity of carboxyl groups is lower than that of amino groups. Second, it is particularly challenging to label the carboxyl group at the protein C-terminus site-specifically rather than the carboxylic acid side chains of aspartic and glutamic acids, which are much more abundant in a typical protein16,17. Existing C-terminal capture strategies have mainly relied on so-called “negative enrichment”, in which non-C-terminal peptides are captured and removed, leaving the C-terminal peptides in the sample solution to be determined18-20. Chait et al. described a negative enrichment strategy for the first time to isolate C-terminal peptides, in which the peptides produced by endoprotease LysC proteolysis were captured and removed by

ACS Paragon Plus Environment


anhydrotrypsin-agarose beads, while the C-terminal peptides, which did not terminate in lysine or arginine residues, remained in solution18. Overall et al. developed an alternative strategy in which internal tryptic peptides were coupled to polyallylamine polymers via their free carboxyl termini, while the original protein C-terminal peptides remained unbound and were thus isolated by ultrafiltration after their amino and carboxyl groups had been blocked20. However, because most peptides in a sample are not C-terminal peptides, the negative capture process must be highly efficient in order to remove all non-C-terminal peptides, which is usually very difficult to achieve in a complex sample. In contrast, a positive enrichment strategy selectively purifies the targets directly from a mixture of peptides and is therefore more advantageous. But it is very challenging and has rarely been reported, since little is known about how to react specifically with the α-carboxyl group at the C-terminus. Jaffrey et al. described an enzymatic labeling approach (ProC-TEL) for positive selection of C-terminal peptides based on carboxypeptidase Y, which selectively recognizes protein C-termini and hydrolyzes C-terminal residues21. Herein, we developed a novel approach for positive enrichment of C-terminal peptides using oxazolone-based chemistry. A bifunctional reagent containing biotin and arginine was introduced at the protein C-terminus through an oxazolone-like intermediate. After protein digestion, the biotin-derivatized C-terminal peptides were selectively purified with streptavidin beads. Under optimized derivatization conditions, a labeling efficiency higher than 90% could be achieved. The newly incorporated basic arginine residue could significantly improve the ionization efficiency of C-terminal peptides.
Moreover, this approach possessed high selectivity for C-terminal peptides at a low molar ratio of C-terminal to non-C-terminal peptides (1:50), which makes it suitable for high-throughput identification of protein C-termini on a proteomic scale. Furthermore, to improve identification of C-terminal peptides, the database search strategy was also optimized by defining biotin as the 21st amino acid and writing it into the original protein database together with the additional arginine, because of the special dissociation mode of the labeled peptides in MS2. Finally, up to 183 C-terminal peptides of Thermoanaerobacter tengcongensis were successfully identified, which greatly increased the scale of the data compared with previous studies.

Experimental Section

2.1 Materials and Chemicals

Myoglobin (horse), cytochrome C (horse), α-chymotrypsin, dithiothreitol (DTT), iodoacetamide (IAA), pentafluorophenol (PfpOH), ammonium bicarbonate, trifluoroacetic acid (TFA), MALDI matrix (α-cyano-4-hydroxycinnamic acid, CHCA), acetonitrile (ACN), and streptavidin magnetic beads were purchased from Sigma (St. Louis, MO, U.S.). Formic acid and triethylamine were obtained from Tedia (Fairfield, OH, U.S.). Acetic anhydride was purchased from Sinopharm Chemical Reagent Co., Ltd. (Shanghai, China). All standard peptides (95%) and arginine-biotin hydrazide (Arg-NHNH-Biotin, ABH) were synthesized by Chinese Peptides Co., Ltd. (Hangzhou, China). All chemicals were used as received

without further purification. Deionized water was used for all experiments.

2.2 Sample Preparation from Thermoanaerobacter tengcongensis

Thermoanaerobacter tengcongensis (TTE) was cultured in media at 55 °C as previously described22. The cells were harvested by centrifugation at 4000 g for 30 min and washed twice with phosphate-buffered saline (PBS). The pellet was lysed in 200 µL of lysis buffer (150 mM Tris-HCl, pH 7.8) with sonication on ice. The extracted proteins were reduced with 10 mM DTT at 56 °C for 30 min, followed by alkylation with 55 mM IAA for 30 min in the dark at room temperature. After precipitation in 600 µL of ice-cold acetone overnight at -20 °C, the treated proteins were pelleted by centrifugation at 12 000 g for 30 min at 4 °C and stored at 20 °C for further use.

2.3 C-terminal Biotinylation of Peptides and Proteins

Each standard peptide (2 µg) or protein (10 µg) was dissolved in a mixture of acetic anhydride (100 µL) and formic acid (100 µL) together with pentafluorophenol (PfpOH; 100 µmol) and then incubated at 60 °C for 30 min. After the resulting samples were dried by vacuum centrifugation, the same procedure was repeated twice. The activated samples were then dissolved in 20 µL of the aqueous derivatization reagent ABH (20 mM) with 0.5 µL of triethylamine and incubated for 2 hours at room temperature. The derivatization reaction was terminated by evaporating the mixtures to dryness in a SpeedVac. The ABH-derivatized protein was then redissolved in 100 µL of ammonium bicarbonate buffer (25 mM, pH 8.0) and digested with α-chymotrypsin at 37 °C overnight at a substrate/enzyme ratio of 40:1 (w/w). Both the ABH-labeled peptides and the protein digests were desalted with C18 ZipTip micropipette tips from Millipore (Billerica, MA, U.S.).

2.4 Selective Enrichment of C-terminal Peptides

The ABH-derivatized C-terminal peptides were captured by streptavidin magnetic beads.
The commercial magnetic beads were prewashed three times with 100% ethanol and three times with 25 mM ammonium bicarbonate solution, and then dispersed in deionized water (10 mg/mL). The peptide mixtures were dissolved in 100 µL of loading buffer (25 mM ammonium bicarbonate, 50% ACN) and incubated with 20 µL of streptavidin beads at 37 °C for 60 min. Afterward, the beads with the captured C-terminal peptides were separated from the solution using an external magnetic field and washed three times with loading buffer to remove nonspecifically bound peptides and other impurities. Finally, the C-terminal peptides were eluted by boiling the beads in 20 µL of eluent (50% ACN containing 1% TFA) for 10 min, and the eluate was dried for further MS analysis.

2.5 MALDI-TOF MS and HPLC-MS/MS Analysis

Standard model samples were analyzed in positive reflector mode on an AB Sciex 5800 Proteomics Analyzer. For each spot, 0.5 µL of sample was spotted onto a MALDI target plate, followed by addition of 0.5 µL of matrix solution (0.6 mg/mL of CHCA in 50% ACN/0.1% TFA), and air-dried. The acquired mass spectra were interpreted manually using Data Explorer V4.5 (Applied Biosystems).


The complex biological samples were analyzed on a nanoLC-ESI-MS/MS system (LTQ-Orbitrap mass spectrometer from ThermoFisher Corporation). The enriched C-terminal peptides of TTE were dissolved in 0.1% formic acid and loaded onto a CAPTRAP column (0.5 × 2 mm, MICHROM Bioresources Inc., Auburn, CA) for 5 min at a flow rate of 60 µL/min. The sample was subsequently separated on a PICOFRIT C18 reverse-phase column (0.1 × 150 mm, New Objective Inc., Woburn, MA) at a flow rate of 300 nL/min. The mobile phases consisted of 2% ACN with 0.1% formic acid (phase A) and 95% ACN with 0.1% formic acid (phase B). A 90-min linear gradient from 5% to 45% phase B was employed for separation. The eluates were introduced into the mass spectrometer via a 15 µm silica tip (New Objective, Inc., Woburn, MA) fitted to a DYNAMIC nanoelectrospray source (Thermo Electron Corporation, San Jose, CA). The mass spectrometer was operated in data-dependent mode with a full-MS scan from 350 to 1800 m/z at a resolution of 60 000. The top 10 precursor peaks were selected for CID fragmentation. The AGC targets for full-MS and MS/MS were 500 000 and 10 000, respectively.

2.6 Data Analysis

The raw mass spectrometry data files were first converted into Mascot Generic Format (MGF) with MM File Conversion software (Version 3.9), and the MS/MS spectra were then searched using Mascot 2.3 (Matrix Science, Boston, MA, USA) against the C-terminus-modified Thermoanaerobacter tengcongensis database from NCBI. The following search parameters were used: methionine oxidation as a variable modification; cysteine carbamidomethylation and lysine formylation as fixed modifications; and mass tolerances of ±20 ppm for parent ions and ±1 Da for fragment ions. Only first-rank peptides with expectation values below 0.05 were regarded as positively identified.

3. Results and Discussion

3.1 Site-specific Biotinylation on C-terminus via Oxazolone Chemistry

Oxazolone-based chemistry is regarded as one of the few effective approaches that can react site-specifically with the C-terminal α-carboxyl group15,23-24. In brief, formation of an oxazolone-related intermediate through the action of formic acid and acetic anhydride allows the α-carboxyl group at the protein C-terminus to be discriminated from the side chains of acidic amino acids (aspartic acid or glutamic acid). The activated α-carboxyl group then reacts with an amine reagent for derivatization. In this work, a bifunctional reagent, arginine-NHNH-biotin, was chosen to modify the C-terminal peptides site-specifically; arginine serves as the amine-reactive group and biotin as the tag for selective enrichment. The biotinylation efficiency and specificity for the α-carboxyl group at the C-terminus were first evaluated using a standard peptide with the sequence VVLQSKELLNSIGFS, which contains an α-carboxyl group, a side-chain carboxyl group and other potentially reactive sites, including the α-amino group at the N-terminus and ε-amino and hydroxyl groups on the side chains. Thus, all potential side reactions were considered comprehensively. MALDI-TOF MS/MS was used to determine and interpret the biotinylated peptides. Before derivatization, the MS peak was predominantly the protonated ion of the peptide, together with a minor sodiated adduct (Figure 1a). After derivatization, a predominant peak was observed at m/z 2086.37 (Figure 1b). Because both the α- and ε-amino groups of the peptide can be completely formylated in the presence of formic acid and acetic anhydride25, we concluded that the mass shift of 452.5 Da probably corresponded to two formylations on amine groups and one ABH derivatization. This was further confirmed by tandem MS (Figure 1c), in which the continuous b-/y-ion series revealed the modification details of the peptide. Compared with the underivatized peptide, a mass shift of 396 Da was observed for all the y-type ions but not for the b ions, indicating that ABH was specifically linked to the α-carboxyl group at the peptide C-terminus rather than to the side-chain carboxyl group. In addition, a weak peak at m/z 1694.08 was also detected, corresponding to an incomplete reaction product in which the two amine groups were formylated. Furthermore, myoglobin and cytochrome C were used as model proteins to assess the feasibility and reliability of our labeling approach. The proteins were labeled with ABH, followed by protein digestion and MALDI-TOF MS/MS analysis. To generate C-terminal peptides of appropriate length for MS analysis, chymotrypsin was chosen as the digestive enzyme. A summary of the resulting peptide mass fingerprint mapping by Mascot is provided in Table S1.
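The mass-shift assignment for the standard peptide can be checked with simple arithmetic. The sketch below is an illustration rather than part of the original analysis: it uses standard monoisotopic residue masses, assumes that the observed peak at m/z 2086.37 is a singly protonated ion, and takes +27.995 Da (addition of CO) per formylated amine. The names `MONO`, `FORMYL` and `mh_plus` are introduced here for illustration only.

```python
# Standard monoisotopic residue masses (Da) of the 20 common amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056    # H2O added for the free N- and C-termini
PROTON = 1.00728    # charge carrier for a singly protonated ion
FORMYL = 27.99491   # mass added per formylation (assumed: +CO)


def mh_plus(sequence):
    """Monoisotopic m/z of the singly protonated, unmodified peptide."""
    return sum(MONO[aa] for aa in sequence) + WATER + PROTON


peptide = "VVLQSKELLNSIGFS"
observed = 2086.37                       # m/z reported after derivatization

total_shift = observed - mh_plus(peptide)
tag_shift = total_shift - 2 * FORMYL     # remove the two formylations

print(round(total_shift, 1))  # close to the reported 452.5 Da
print(round(tag_shift, 1))    # close to the reported 396 Da y-ion shift
```

Under these assumptions the residual shift of roughly 396 Da is what remains for the ABH tag once the N-terminal and lysine formylations are accounted for.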
As shown in Figure 1d and 1e, the C-terminal peptides of myoglobin (Y.KELGFQG.-, m/z 1202.79) and cytochrome C (Y.LKKATNE.-, m/z 1255.64) were clearly detected, with all lysines formylated and ABH attached to the C-terminal α-carboxyl groups. No obvious underivatized counterparts were detected, indicating that the oxazolone-based biotinylation also works well at the protein level. More importantly, the ionization efficiencies of the C-terminal peptides were greatly improved after ABH derivatization owing to the newly incorporated arginine. Before derivatization, the absolute signal intensities of the C-terminal peptides were very low (only 183 for myoglobin and 801 for cytochrome C, shown in Figure S1a and S1b), which was not sufficient for MS2 sequencing. After ABH was incorporated, however, the absolute intensity became 479 for Y.KELGFQG.- (myoglobin) and 2792 for Y.LKKATNE.- (cytochrome C), increases of 162% and 249%, respectively. In parallel, ABH derivatization could extend the range of C-terminal peptides detectable by MS, because proteolytic C-terminal peptides carrying the ABH tag (414 Da) are in theory long enough to be identified by MS/MS as long as they contain at least 3 amino acids. In general, our results showed that oxazolone-based derivatization was highly efficient (reaction yield of 90%) and site-specific for the C-terminal α-carboxyl group. The complete formylation of amine groups would not complicate interpretation of the mass spectra. The reaction conditions were relatively mild for proteins, and the resulting amide linkages were stable under acidic or weakly basic conditions. All these advantages of oxazolone-based derivatization facilitated further C-terminal peptide enrichment.

Figure 1. MALDI mass spectra of the biotinylated derivatization of standard peptide VVLQSKELLNSIGFS (a-c) and standard proteins (d-e): a) underivatized standard peptide; b) ABH-labeled standard peptide; c) MS/MS analysis of derivatized peptide; chymotryptic digests of biotinylated myoglobin (d) and cytochrome C (e). Asterisks in the peptide sequence represent formylated amino acid positions. Pound denotes the C-terminal peptide with ABH labeling.

3.2 Selective Enrichment of C-terminal Peptides

To reduce signal suppression by non-C-terminal peptides, it is advantageous to specifically isolate C-terminal peptides from complex systems. The biotin-streptavidin interaction is one of the strongest known non-covalent biological interactions (Kd = 10⁻¹⁴ mol/L) and resists extremes of pH, temperature, organic solvents and other denaturing agents, so it has been widely used in detection and purification methods for proteins and nucleic acids26,27. Here, commercial streptavidin magnetic beads were used to selectively purify C-terminal peptides, and four different protocols (procedures A-D) were investigated to obtain maximum elution recovery. To mimic the proportion of C-terminal peptides in a real sample, 2 µg of ABH-derivatized standard peptide VVLQSKELLNSIGFS was added to 100 µg of myoglobin digest as a test sample. The sample was first incubated with streptavidin magnetic beads in loading buffer (25 mM ammonium bicarbonate and 50% ACN). After separation of the beads using an external magnetic field and thorough washing three times with the loading buffer, the bound peptides were eluted with 50% ACN containing 0.1% TFA and analyzed by MALDI MS. This was procedure A, taken from previously published work27.
However, the MS intensity was quite low (Figure 2a), indicating poor recovery of C-terminal peptides, which might be caused by the relatively low elution efficiency of 50% ACN containing 0.1% TFA28. To disrupt the strong biotin-streptavidin interaction, previous studies have introduced a cleavable linker27,28 or adopted extreme conditions30. Therefore, we optimized the pH and temperature of the elution buffer to improve recovery of C-terminal peptides. Procedures B-D were the same as procedure A except that the captured peptides were eluted with 1% TFA in 50% ACN (procedure B), 5% TFA in 50% ACN (procedure C) or 1% TFA in 50% ACN with boiling (procedure D). The results are presented in Figure 2b-d, respectively. Compared with procedure A, the signal intensity of the C-terminal peptide in procedure B increased by 100% when the concentration of TFA was raised from 0.1% to 1%, because more acidic conditions facilitated breakdown of the streptavidin-biotin interaction. But when the concentration of TFA reached 5%, severe suppression was observed owing to a cluster of peaks in the low mass range. We suppose that the commercial magnetic beads could not withstand such acidic conditions and disintegrated. As shown in Figure 2d, the bound biotinylated peptides were more readily eluted from the beads after being boiled in 50% ACN containing 1% TFA for 10 min, since the streptavidin-biotin interaction was further disrupted by thermal denaturation. Therefore, the elution condition was finalized as boiling for 10 min in a water bath in 1% TFA/50% ACN and was first applied to elute the C-terminal peptides of myoglobin and cytochrome C (Figure 3a and b). MS/MS analysis was also conducted to confirm the C-terminal peptides labeled with biotin tags (Y.KELGFQG.- from myoglobin in Figure 3c and Y.LKKATNE.- from cytochrome C in Figure 3d). No interfering peaks were detected, indicating high elution efficiency of the optimized enrichment procedure and highly site-specific derivatization of the α-carboxyl group via oxazolone chemistry. Interestingly, another C-terminal peptide of cytochrome C without a missed cleavage site (L.KKATNE.-) was also clearly detected at m/z 1142.58; this peptide was of low abundance and could not be detected without enrichment or derivatization. Our results demonstrated that the positive enrichment approach was highly effective and facilitated identification of low-abundance C-terminal peptides.

Figure 2. MALDI mass spectra of eluate obtained from mixture of standard peptide VVLQSKELLNSIGFS and myoglobin tryptic peptides under different elution conditions: a) 0.1% TFA in 50% ACN; b) 1% TFA in 50% ACN; c) 5% TFA in 50% ACN and d) 1% TFA in 50% ACN and boiling water bath. Pound represents the standard biotinylated peptide.

Figure 3. MALDI mass spectra of standard proteins myoglobin (a, c, e) and cytochrome C (b, d, f). (a) and (b): MS spectrum of C-terminal peptides after streptavidin enrichment; MS/MS spectrum obtained using different database search strategies (c) and (d): ABH was set as a fixed modification; (e) and (f): Arginine and biotin hydrazide were added to protein C-terminus in protein database. Pound represents the C-terminal peptides. Asterisks indicate the formylated amino groups.

3.3 Optimization for Database Search of C-terminal Peptides

In a conventional database search, ABH labeling would be set as a fixed modification on the protein C-terminus, and mass mapping would be carried out against a database of the original, underivatized sequences. In fact, the amide bonds between arginine and the C-terminal carboxyl group, and between arginine and biotin hydrazide, behave in MS like normal peptide bonds, so they can also dissociate into b- and y-ions under collision-induced dissociation (CID). However, the b- and y-ion information from the ABH tag is not used for peptide matching if the tag is set as a fixed modification in the traditional way, leading to poor confidence in peptide characterization. Therefore, we optimized the database search strategy by dividing the ABH tag into two parts, arginine and biotin hydrazide, and writing them into the protein database. Arginine was denoted R, while biotin hydrazide was defined as a nonexistent 21st amino acid, J. For example, the C-terminal sequence of myoglobin, -LGFQG, was revised to -LGFQGRJ according to this rule, so that the fragment ion information for the peptide could be used to the fullest, which facilitated identification of the C-terminal peptide. Compared with the MS/MS spectra obtained from the traditional workflow (Figure 3c for myoglobin and 3d for cytochrome C), more fragment information was obtained, and fragments from the ABH tag in particular had higher intensity (Figure 3e and 3f). We further confirmed the improvement with the Mascot search engine. When the ABH tag was set as a fixed modification, the ion scores of the C-terminal peptides Y.KELGFQG.- (myoglobin) and Y.LKKATNE.- (cytochrome C) were 17 and 23, respectively. After the amendment, the ion scores increased to 28 (a 65% increase) and 32 (a 39% increase), respectively.

3.4 Enrichment of C-terminal Peptides from Thermoanaerobacter tengcongensis

The C-terminal peptide enrichment efficiency of the oxazolone-based biotinylation strategy was further evaluated using a complex system, TTE. TTE is a thermophilic bacterium that was first isolated from a freshwater hot spring in China. Its genome encodes 2588 predicted coding sequences, 1764 of which have been classified according to homology to other documented proteins30. Proteins extracted from TTE were derivatized with ABH and subjected to proteolysis by chymotrypsin. To generate peptides of appropriate length for better retention on the reverse-phase column and detection by MS, the enzyme-to-substrate ratio was set to 1:80 and the digestion time was reduced to 4 hours. Afterwards, the C-terminal peptides of TTE were purified using streptavidin beads and identified by LC-MS/MS. The modified TTE database was used, in which the ABH tag was integrated as the amino acids R and J.
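The database revision rule described above (appending R for arginine and J for biotin hydrazide to every protein C-terminus) can be sketched as a small FASTA rewriting step. This is a minimal illustration and not the authors' script: the function name `append_tag`, the example header and the handling of a trailing `*` are assumptions, and how the search engine is configured with a residue mass for J is software-specific and not shown.

```python
def append_tag(fasta_text, tag="RJ"):
    """Return FASTA text in which every sequence ends with `tag`.

    Sequences that span several lines are joined onto one line first.
    """
    records = []
    header, chunks = None, []
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line, []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    # Drop a trailing stop symbol, if present, before adding the tag.
    return "".join(f"{h}\n{seq.rstrip('*')}{tag}\n" for h, seq in records)


# Illustrative fragment only: the C-terminal residues of myoglobin quoted in the text.
example = ">myoglobin_C_terminal_fragment\nKELGFQG\n"
print(append_tag(example), end="")
# >myoglobin_C_terminal_fragment
# KELGFQGRJ
```

Writing the tag into the sequences, rather than declaring it as a modification, is what lets the search engine score fragment ions that arise from cleavage within the tag.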

Figure 4. Characterization of enriched C-terminal peptides: a) length distribution of identified C-terminal peptides; b) molecular weight distribution of identified proteins and c) depiction of derivatized C-terminal amino acids by oxazolone chemistry.


A total of 183 C-terminal peptides were confidently identified in three replicates. In contrast, there were only 36 C-terminal peptides to be determined using traditional tryptic bottom-up strategy. Therefore, our approach greatly enhanced identification of C-terminal peptides due to its high efficiency of enrichment and ionization and optimized database search strategy. To further characterize the captured C-terminal peptides, we analyzed amino acid composition of each protein’s C-termini. Among 183 C-terminal peptides, the distribution of peptide length covered a wide range. As shown in Figure 4a, the longest peptide contained as many as 35 amino acids, while peptides with 3 amino acids were also determined well. Correspondingly, the molecular weight distribution of identified proteins was wide from 5 kD to 60 kD (Figure 4b), depicting that most of proteins and peptides, even very small soluble proteins or very short peptides, could be identified using our approach. It is particularly advantageous to analyze small proteins, which are readily lost during sample preparation and usually have short peptide length for MS detection. Moreover, the oxazolone-based strategy had a high reaction activity and was compatible with almost all types of amino acids on protein C-termini (Figure 4c). Thus, our enrichment approach appears to be suitable for all kinds of peptides and proteins. 4. Conclusions In summary, a simple and robust strategy has been established for purifying and detecting protein C-terminal peptides. Based on oxazolone chemistry, site-specific C-terminal biotinylation can be realized under mild conditions. The biotinylated Cterminal peptides are selectively isolated via biotinstreptavidin interaction and determined by mass spectrometry. Compared with negative enrichment strategies, this positive purification approach exhibits higher enrichment efficiency and lower background of non-C-terminal peptides. 
At the same time, the introduced ABH tag makes even very short C-terminal peptides suitable for MS detection and enhances their ionization efficiency. Moreover, the optimized database searching workflow further improves the identification of C-terminal peptides. Overall, our approach is easy to operate, versatile, and compatible with both solution- and gel-based methods for C-terminomics studies of complex biological samples, and it can be used to shed new light on the functional importance of C-termini.

ASSOCIATED CONTENT

Supporting Information

This material is available free of charge via the Internet at http://pubs.acs.org.

AUTHOR INFORMATION

Corresponding Author

*. Shanghai Cancer Center and Institutes of Biomedical Sciences, Fudan University, Shanghai 200032, P. R. China; Department of Chemistry, Fudan University, Shanghai 200433, P. R. China. Tel.: +86 21 54237618; Fax: +86 21 54237961. Email: [email protected]

Author Contributions

§ Minbo Liu and Caiyun Fang contributed equally to this work.

Notes

The authors declare no competing financial interest.

ACKNOWLEDGMENT

The work was supported by the NST (2012CB910602 and 2012CB910103), NSF (21335002), MOE (20130071110034) and Shanghai Projects (Eastern Scholar, 15JC1400700 and B109).

REFERENCES

(1) Zhang, C. X.; Weber, B. V.; Thammavong, J.; Grover, T. A.; Wells, D. S. Anal. Chem. 2006, 78, 1636-1643.
(2) Meinhart, A.; Cramer, P. Nature 2004, 430, 223-226.
(3) Gould, S. J.; Keller, G. A.; Hosken, N.; Wilkinson, J.; Subramani, S. J. J. Cell Biol. 1989, 108, 1657-1664.
(4) Homans, S. W.; Ferguson, M. A.; Dwek, R. A.; Rademacher, T. W.; Anand, R.; Williams, A. F. Nature 1988, 333, 269-272.
(5) Elortza, F.; Mohammed, S.; Bunkenborg, J.; Foster, L. J.; Nuhse, T.; Brodbeck, U. J. Proteome Res. 2006, 5, 935-943.
(6) Selkoe, D. J. Trends Cell Biol. 1998, 8, 447-453.
(7) Wilkins, M. R.; Gasteiger, E.; Tonella, L.; Ou, K. L.; Tyler, M.; Sanchez, J. C.; Gooley, A. A.; Walsh, B. J.; Bairoch, A.; Appel, R. D.; Williams, K. L.; Hochstrasser, D. F. J. Mol. Biol. 1998, 278, 599-606.
(8) Boyd, V. L.; Bozzini, M.; Zon, G.; Noble, R. L.; Mattaliano, R. J. Anal. Biochem. 1992, 206, 344-352.
(9) Samyn, B.; Hardenmen, K.; Van der Eychen, J.; Van Beeumen, J. Anal. Chem. 2000, 72, 1389-1399.
(10) Kelleher, N. L.; Lin, H. Y.; Valaskovic, G. A.; Aaserud, D. J.; Fridriksson, E. K.; McLafferty, F. W. J. Am. Chem. Soc. 1999, 121, 806-812.
(11) Yates, J. R. Anal. Chem. 2013, 85, 6151-6151.
(12) Samyn, B.; Sergeant, K.; Castanheira, P.; Faro, C.; Beeumen, J. V. Nat. Meth. 2005, 2, 193-200.
(13) Julka, S.; Dielman, D.; Young, S. A. J. Chromatogr. B 2008, 874, 101-110.
(14) Kosaka, T.; Takazawa, T.; Nakamura, T. Anal. Chem. 2000, 72, 1179-1185.
(15) Liu, M.; Zhang, L.; Zhang, L.; Yao, J.; Yang, P.; Lu, H. Anal. Chem. 2013, 85, 10745-10753.
(16) Chan, A. O.; Ho, C.; Chong, H.; Leung, Y.; Huang, J.; Wong, M.; Che, C. J. Am. Chem. Soc. 2012, 134, 2589-2598.
(17) Qin, H.; Wang, F.; Zhang, Y.; Hu, Z.; Song, C.; Wu, R.; Ye, M.; Zou, H. Chem. Commun. 2013, 48, 6265-6267.
(18) Sechi, S.; Chait, B. T. Anal. Chem. 2000, 72, 3374-3378.
(19) Kuyama, H.; Shima, K.; Sonomura, K.; Yamaguchi, M.; Ando, E.; Nishimura, O.; Tsunasawa, S. Proteomics 2008, 8, 1539-1550.
(20) Schilling, O.; Barre, O.; Huesgen, P. F.; Overall, C. M. Nat. Meth. 2010, 7, 508-511.
(21) Xu, G.; Shin, S. B. Y.; Jaffrey, S. R. ACS Chem. Biol. 2011, 6, 1015-1020.
(22) Wang, J.; Xue, Y.; Feng, X.; Li, X.; Wang, H.; Li, W.; Zhao, C.; Cheng, X.; Ma, Y.; Zhou, P.; Yin, J.; Bhatnagar, A.; Wang, R.; Liu, S. Proteomics 2004, 4, 136-150.
(23) Yamaguchi, M.; Oka, M.; Nishida, K.; Ishida, M.; Hamazaki, A.; Kuyama, H.; Ando, E.; Okamura, T.; Ueyama, N.; Norioka, S.; Nishimura, O.; Tsunasawa, S.; Nakazawa, T. Anal. Chem. 2006, 78, 7861-7869.
(24) Kuyama, H.; Nakajima, C.; Nakazawa, T.; Nishimura, O.; Tsunasawa, S. Proteomics 2009, 9, 4063-4070.
(25) Sheehan, J. C.; Yang, D.-D. H. J. Am. Chem. Soc. 1958, 80, 1154-1158.
(26) Geurink, P. P.; Florea, B. I.; Li, N.; Witte, M. D.; Verasdonck, J.; Kuo, C. L.; van der Marel, G. A.; Overkleeft, H. S. Angew. Chem. Int. Ed. 2010, 49, 6802-6805.


(27) Girault, S.; Chassaing, G.; Blais, J. C.; Brunot, A.; Bolbach, G. Anal. Chem. 1996, 68, 2122-2126.
(28) Verhelst, S. H. L.; Fonovic, M.; Bogyo, M. Angew. Chem. Int. Ed. 2007, 46, 1284-1286.
(29) Tong, X.; Smith, L. M. Anal. Chem. 1992, 64, 2672-2677.
(30) Bao, Q.; Tian, Y.; Yang, H. Genome Res. 2002, 12, 689-700.


For TOC only

A novel integrated approach for sensitive and selective C-terminomics was developed. In this strategy, the bifunctional reagent arginine-NHNH-biotin was site-specifically coupled to protein C-termini via oxazolone-based chemistry, enhancing ionization efficiency and enabling positive enrichment of C-terminal peptides. The biotin and arginine modifications were defined and added to the protein database, contributing to highly efficient identification of C-terminal peptides.
