Article pubs.acs.org/ac

Systematic Optimization of C‑Terminal Amine-Based Isotope Labeling of Substrates Approach for Deep Screening of C‑Terminome

Yang Zhang,† Quanze He,‡ Juanying Ye,† Yanhong Li,† Lin Huang,† Qingqing Li,† Jingnan Huang,† Jianan Lu,† and Xumin Zhang*,†

† State Key Laboratory of Genetic Engineering, Department of Biochemistry, School of Life Sciences, Fudan University, Shanghai 200438, China
‡ Center for Reproduction and Genetics, Suzhou Municipal Hospital, Nanjing Medical University Affiliated Suzhou Hospital, Suzhou, Jiangsu 215002, China

*S Supporting Information

ABSTRACT: It is well-known that protein C-termini play important roles in various biological processes, and thus the precise characterization of C-termini is essential for fully elucidating protein structures and understanding protein functions. Although many efforts have been made in the field during the past two decades, progress still lags far behind that for N-termini, and novel or optimized methods are needed. Herein, we report an optimized C-termini identification approach based on the C-terminal amine-based isotope labeling of substrates (C-TAILS) method. We optimized the amidation reaction conditions to achieve a higher yield of fully amidated product. We evaluated different carboxyl and amine blocking reagents and found that Ac-NHS and ethanolamine performed best. Replacement of dimethylation with acetylation for Lys blocking resulted in the identification of 232 C-terminal peptides in an Escherichia coli sample, about 42% more than with the conventional C-TAILS. A systematic data analysis revealed that the optimized method is unbiased with respect to the number of lysine residues in peptides, is more reproducible, and yields higher MASCOT scores. Moreover, the introduction of the Single-Charge Ion Inclusion (SCII) method to alleviate the charge deficiency of small peptides allowed an additional 26% increase in identification number. With the optimized method, we identified 481 C-terminal peptides corresponding to 369 C-termini in E. coli in triplicate experiments using 80 μg each. Our optimized method would benefit the deep screening of the C-terminome and possibly help discover some novel C-terminal modifications. Data are available via ProteomeXchange with identifier PXD002409.

Received: July 1, 2015 Accepted: September 11, 2015 Published: September 11, 2015

DOI: 10.1021/acs.analchem.5b02451 Anal. Chem. 2015, 87, 10354−10361

© 2015 American Chemical Society

Protein C-terminal sequences and modifications play important roles in various biological functions and processes such as protein sorting, protein activity, and complex formation.1 For human amyloid β-protein, a key protein in the development of Alzheimer’s disease, the heterogeneous C-termini determine its aggregation rate and the age of onset.2 For p53, the intensively studied tumor suppressor, phosphorylation of C-terminal serine residues is important for its DNA-binding activity.3 Protein C-termini are usually exposed, which often makes them accessible for different interactions.4−7 Therefore, the precise characterization of C-termini is essential for fully elucidating protein structures and understanding protein functions. Proteome-scale analysis of C-termini (C-terminomics), together with N-terminomics, allows evidence-based gene annotation, can be used to improve genome annotation,8 and is among the most important approaches for degradomics, a novel omics study that reveals the real ends generated by proteolytic processes.9

With the improved performance of mass spectrometry (MS), MS-based proteomics has become an indispensable technique for protein identification. Trypsin is usually the best choice for the bottom-up proteomics strategy since it cleaves with high specificity the peptide bonds C-terminal to the basic amino acid residues Lys and Arg, so that nearly all peptides possess at least one basic residue, with the exception of C-terminal peptides.10,11 As a result, C-terminal peptides are not favored in MS analysis and are seldom discovered in traditional proteomics studies. Therefore, there is an urgent need for specific and efficient C-termini analysis methods. During the past two decades, a series of C-termini identification methods employing a variety of chemical and biochemical reactions have been established.8,12−15 These include controlled acid hydrolysis,13 enzymatic labeling of protein C-termini in H₂¹⁸O,16 and the combination of cyanogen bromide cleavage and carboxypeptidase, among others.15 However, the application of these methods is limited to simple samples, and no large-scale analysis has been reported so far.

Recently a so-called positional proteomics strategy was introduced to selectively analyze the C-terminal or N-terminal peptides.17,18 Since C-terminal tryptic peptides ionize less readily and are often suppressed by coeluting peptides, C-termini analysis requires that the C-terminal peptides first be enriched or isolated from the others. This pretreatment reduces the sample complexity to a great extent and thus increases the chance of identifying C-terminal peptides.19−21 Two ground-breaking approaches were developed by Van Damme22 and Schilling.9 Van Damme and co-workers adapted the combined fractional diagonal chromatography (COFRADIC) technology to isolate C-terminal peptides from complex peptide mixtures and identified 965 C-termini by 120 LC−MS/MS analyses.22 This method is rather time- and labor-consuming and may not be easily employed by researchers with less experience. Schilling and co-workers developed the C-terminal amine-based isotope labeling of substrates (C-TAILS).9,23 C-TAILS employs a four-step derivatization approach, two steps at the protein level and two at the peptide level. In brief, before digestion the amine and carboxyl groups of proteins are blocked by dimethylation and amidation, respectively; after trypsin digestion, the tryptic peptides are first derivatized by dimethylation to block the nascent amine groups, and then the peptides with nascent carboxyl groups are depleted by incubation with polyallylamine polymer. As a result, only C-terminal peptides survive the incubation because they lack free carboxyl groups.

Although the need for only a single LC−MS/MS analysis makes C-TAILS much more practicable, the number of identifications is lower than that achieved by COFRADIC. Therefore, the optimization of this method would extend its application and benefit the deep screening of the C-terminome. Very recently, the same group discovered a LysargiNase and suggested its potential power in C-terminomics.24

Herein, we developed an optimized C-termini identification approach based on the C-TAILS method. The replacement of dimethylation with acetylation to block the amine groups at the protein level and the introduction of Single-Charge Ion Inclusion (SCII) allowed us to identify 75% more C-terminal peptides than the conventional C-TAILS. We demonstrated that the optimized method was unbiased with respect to the number of lysine residues in peptides, was more reproducible, and led to higher MASCOT scores. With the optimized method, we identified 481 C-terminal peptides corresponding to 369 C-termini in Escherichia coli in a triplicate experiment using 80 μg each. Our method would benefit the deep screening of the C-terminome and possibly help discover some novel C-terminal modifications.

EXPERIMENTAL SECTION

Materials and Chemicals. Microcon-10 (10-kDa cutoff) was purchased from Merck-Millipore (Bedford, MA). Dithiothreitol (DTT), α-cyano-4-hydroxycinnamic acid (CHCA), ethanolamine, polyallylamine (Mw ∼ 17 000, 20 wt % in H2O), N-hydroxysuccinimide (NHS), N-(3-(dimethylamino)propyl)-N′-ethylcarbodiimide hydrochloride (EDC), sodium cyanoborohydride, formaldehyde solution, 2-[4-(2-hydroxyethyl)-1-piperazinyl]ethanesulfonic acid (HEPES), triethylammonium bicarbonate (TEAB), 4-morpholineethanesulfonic acid (MES), guanidine hydrochloride, and formic acid were purchased from Sigma-Aldrich (St. Louis, MO). Hydrochloric acid was from Sinopharm (Shanghai, China), Poros R3 was from Applied Biosystems (Framingham, MA), and sequencing grade modified trypsin was from Promega (Madison, WI). All the reagents were used without further purification.

Protein Extraction from E. coli. E. coli strain DH5α cultures were grown in Luria−Bertani rich medium (10 g/L NaCl, 10 g/L tryptone, and 5 g/L yeast extract) at 37 °C overnight. E. coli cells were harvested by centrifugation at 2 000g and 4 °C for 15 min. The resulting pellet was washed three times with PBS buffer (pH 7.4) and then resuspended in lysis buffer containing 8 M guanidine hydrochloride, 100 mM TEAB, and 10 mM DTT. The slurry was sonicated on ice for 9 min (2 s sonication with 5 s intervals), and the supernatant was collected by centrifugation at 20 000g for 20 min at 4 °C. A 20 μL aliquot of the sample was kept for protein determination using the Bradford assay. Subsequently, the sample was reduced by incubation at 37 °C for 1 h, followed by alkylation using 30 mM acrylamide for 45 min at room temperature. The excess acrylamide was quenched by adding 5 mM DTT. The protein solution was diluted to 1 mg/mL with buffer containing 200 mM TEAB and 2 M guanidine hydrochloride prior to the following steps.

Protection of Primary Amines by Dimethylation or Acetylation. The dimethylation method was adapted from Schilling.9 After formaldehyde and NaBH3CN were each added to a final concentration of 20 mM, the sample solution was kept for 2 h at 25 °C; the procedure was repeated once with freshly prepared reagents to ensure efficient protection. The acetylation method was adapted from Staes et al.25 The sample solution was incubated twice with 20 mM Ac-NHS at 25 °C for 2 h, and the reaction was quenched with 100 mM Tris-HCl at 30 °C for 10 min. Finally 100 mM NH2OH was added and kept at 30 °C for 30 min to reverse the undesired partial acetylation on Ser and Thr residues.26

Protection of Carboxyl Group. The FASP method was adapted for the following procedures.27 The samples after amine blocking were transferred to Microcon-10 filters and centrifuged at 13 800g for three rounds of buffer displacement with carboxyl protection buffer (pH 4.5, 4 M guanidine hydrochloride, 200 mM MES, and 2 M ethanolamine). After buffer displacement, 50 mM NHS and freshly prepared 100 mM EDC were added to trigger the reaction at 25 °C. After 2 h of reaction, an additional 100 mM EDC was added and the solution was incubated for another 2 h.

Trypsin Digestion and Protection of Nα-Amine of Tryptic Peptides. The samples were then submitted to another three rounds of buffer displacement with digestion buffer (pH 7.5, 2 M guanidine hydrochloride, 200 mM HEPES). After buffer displacement, the samples were digested at 37 °C with trypsin at an enzyme/protein ratio of 1:100. After 2 h of incubation, the samples were diluted 4 times with 20 mM HEPES (pH 7.5) and the same amount of trypsin was added for another 10 h of digestion. After digestion, the Nα-amines of the nascent tryptic peptides were protected by two rounds of dimethylation as described above.

Pretreatment of Polyallylamine. A 6.5 mM stock solution of polyallylamine (representing a primary amine concentration of ∼2 M) was prepared in 200 mM MES (pH 4.5), 2 M guanidine hydrochloride, and 20% (v/v) acetonitrile.
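The statement that a 6.5 mM polyallylamine stock corresponds to roughly 2 M primary amine can be checked with a short calculation. The sketch below is illustrative only: it assumes each repeat unit carries one primary amine and has the average molar mass of allylamine (57.09 g/mol), and the function name `primary_amine_molar` is hypothetical.

```python
# Average molar mass of allylamine (C3H7N) in g/mol. Assumption: each repeat
# unit of the polymer has this mass and carries exactly one primary amine.
ALLYLAMINE_G_PER_MOL = 57.09


def primary_amine_molar(polymer_molar: float, polymer_mw: float,
                        monomer_mw: float = ALLYLAMINE_G_PER_MOL) -> float:
    """Approximate primary-amine concentration (M) of a polyallylamine solution."""
    amines_per_chain = polymer_mw / monomer_mw
    return polymer_molar * amines_per_chain


# Values from the text: 6.5 mM polymer, Mw ~ 17 000.
stock = primary_amine_molar(6.5e-3, 17_000)
# Mixing the stock with an equal volume of sample halves the concentration.
after_mixing = stock / 2

print(f"stock: {stock:.2f} M, after 1:1 mixing: {after_mixing:.2f} M")
```

Under these assumptions the stock comes out slightly below 2 M and the mixture slightly below 1 M, in line with the approximate values quoted in the protocol.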


Figure 1. Optimization of carboxyl protection reaction. The amidated BSA sample was submitted for trypsin digestion and then analyzed by MALDI-TOF MS: (A) amidated products of peptides RPCFSALTPDETYVPK (m/z 1938.0 and 1981.0 for partially and fully derivatized forms, respectively) and ECCHGDLLECADDR (m/z 1963.9 and 2006.9 for partially and fully derivatized forms, respectively); (B) amidated products of peptide DAFLGSFLYEYSR (m/z 1610.8 and 1653.8 for partially and fully derivatized forms, respectively). Top panel, conventional method; bottom panel, optimized method.
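The m/z values in this caption follow from adding one ethanolamine amide (ethanolamine minus water, about +43 Da) per Asp or Glu side chain. The following sketch reproduces that arithmetic under stated assumptions: monoisotopic masses are used, only side-chain carboxyls are counted (the peptide C-terminus is generated by trypsin after amidation and is therefore left free), and the helper names are hypothetical. The same `amide_shift` function also gives the roughly +39 Da shift expected for amidation with allylamine.

```python
# Monoisotopic masses in Da.
ETHANOLAMINE = 61.0528  # C2H7NO
ALLYLAMINE = 57.0578    # C3H7N
WATER = 18.0106         # H2O


def amide_shift(amine_mass: float) -> float:
    """Mass added when a carboxyl group forms an amide with the given amine."""
    return amine_mass - WATER


def amidated_mz(sequence: str, mz_unmodified: float, unamidated: int = 0) -> float:
    """m/z of the singly charged peptide after ethanolamine amidation.

    `unamidated` is the number of Asp/Glu side chains left unmodified
    (0 means fully amidated).
    """
    sites = sequence.count("D") + sequence.count("E")
    if not 0 <= unamidated <= sites:
        raise ValueError("unamidated must be between 0 and the number of Asp/Glu")
    return mz_unmodified + (sites - unamidated) * amide_shift(ETHANOLAMINE)


# Peptides and unmodified m/z values taken from the text.
for seq, mz in [("RPCFSALTPDETYVPK", 1895.0),
                ("ECCHGDLLECADDR", 1791.9),
                ("DAFLGSFLYEYSR", 1567.8)]:
    print(seq, f"full: {amidated_mz(seq, mz):.1f}",
          f"one site free: {amidated_mz(seq, mz, unamidated=1):.1f}")

print(f"allylamine amide shift: {amide_shift(ALLYLAMINE):.2f} Da")
```

The computed values agree with the caption to within a few tenths of a dalton; the small differences arise because the reported m/z values are rounded.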

Prior to application, the polyallylamine solution was submitted to five rounds of buffer displacement using Microcon-10 in order to remove the unpolymerized monomer.

Enrichment of C-Terminal Peptides. An equal volume of stock polyallylamine polymer solution was transferred to the sample-containing filter device to achieve a primary amine concentration of ∼1 M. After the pH was adjusted to 4.5 using 1 M HCl, 50 mM NHS and 100 mM EDC were added and the sample was kept in a Thermomixer (600 rpm, 25 °C) for 2 h. The EDC step was repeated twice, with the last incubation lasting 10 h. The pH was checked at each step and adjusted if needed. After polymer coupling, the solution was filtered out, followed by washing twice with 10% ACN. The filtrate fractions containing C-terminal peptides were pooled, vacuum-dried, and then resuspended in 1% formic acid for the following analysis.

Sample Desalting. Desalting was performed on a homemade Poros R3 microcolumn.28 After the sample was loaded onto the R3 microcolumn, the column was first washed with 10 μL of 1% formic acid, and then the bound peptides were eluted with 20 μL of 30% ACN followed by 20 μL of 70% ACN. The two eluates were pooled and dried in a Speedvac.

MALDI-TOF MS Analysis. MALDI MS was performed using a Bruker Ultraflex TOF/TOF MS (Bruker, Bremen, Germany). Prior to MALDI MS analysis, the peptide samples were desalted on Poros R3 microcolumns and eluted directly onto the sample supports with matrix solution (10 mg/mL CHCA in 70% ACN, 0.1% TFA). All spectra were obtained in positive reflector mode, and mass spectrometric data analysis was performed using the Bruker FlexAnalysis Software (version 2.4).

Nanoflow LC−ESI-MS/MS. LC−ESI-MS/MS analysis was performed using a nanoflow EASY-nLC 1000 system (Thermo Fisher Scientific, Odense, Denmark) coupled to an LTQ-Orbitrap Elite mass spectrometer (Thermo Fisher Scientific, Bremen, Germany). A two-column system was adopted for all analyses. Samples were first loaded onto an Acclaim PepMap100 C18 Nano Trap Column (5 μm, 100 Å, 100 μm i.d. × 2 cm; Thermo Fisher Scientific, Sunnyvale, CA) and then analyzed on an Acclaim PepMap RSLC C18 column (2 μm, 100 Å, 75 μm i.d. × 25 cm; Thermo Fisher Scientific, Sunnyvale, CA). The mobile phases consisted of Solution A (0.1% formic acid) and Solution B (0.1% formic acid in ACN). The derivatized peptides were eluted using the following gradient: 5−35% B in 58 min, 35−90% B in 10 min, and 90% B for 5 min at a flow rate of 200 nL/min. Data-dependent analysis was employed in MS analysis: the 15 most abundant ions in each MS scan were automatically selected and fragmented in CID mode. For the SCII method, only ions with unrecognized charge states were excluded, and the known background ions were entered in the Reject Mass List to avoid MS/MS analysis.29 For long analyses, the LC gradient was changed as follows: 5−30% B in 190 min, 35−90% B in 10 min, and 90% B for 10 min at a flow rate of 150 nL/min. All experiments were carried out in triplicate.

Data Analysis. The raw data were analyzed by Proteome Discoverer (version 1.4, Thermo Fisher Scientific) using an in-house Mascot server (version 2.3, Matrix Science, London, U.K.).30 The E. coli protein database (20140509, 4 303 sequences) was downloaded from UniProt. Data were searched using the following parameters: Arg-C as the enzyme; up to two missed cleavage sites allowed; 10 ppm mass tolerance for MS and 0.5 Da for MS/MS fragment ions; propionamidation on cysteine, protein N-acetylation, acetylation on lysine, ethanolamine protection on aspartate and glutamate, and dimethylation on the peptide N-terminus as fixed modifications; ethanolamine protection on the protein C-terminus and oxidation on methionine as variable modifications. For the analysis of protein neo-C-termini, semiArg-C instead of Arg-C was chosen as the enzyme, and ethanolamine protection on the peptide C-terminus instead of the protein C-terminus was adopted in the variable modifications. The Target Decoy PSM Validator incorporated in Proteome Discoverer and the Mascot expectation value were used to validate the search results, and only the hits with FDR ≤ 0.01 and MASCOT p ≤ 0.01 were accepted for discussion. The introduction of MASCOT p ≤ 0.01 reduced the hit number by around 33% on average and increased the minimum Mascot score from 13 to 26. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the data set identifier PXD002409.31,32

Figure 2. Favored charge state varies by different treatments: (A) peptide AMAWEEHDE; (B) peptide VAALIKEVNKAA, the ε-NH2 of Lys was protected by dimethylation; (C) same peptide as in part B, but the ε-NH2 of Lys was protected by acetylation. The red dashed lines in part A indicate the theoretical signals of triply charged AMAWEEHDE. The intensities of the base peaks are marked with open rectangles. It should be noted that for all three peptides the carboxyl groups were protected by amidation with ethanolamine.
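The charge states compared in Figure 2 appear at different m/z positions for the same peptide. As a minimal sketch, assuming only the standard relation between the singly protonated mass and higher charge states, the expected positions can be computed from the [M + H]+ values quoted in the text; the function name is hypothetical.

```python
PROTON = 1.00728  # mass of a proton in Da


def mz_at_charge(mh_plus: float, charge: int) -> float:
    """Convert a singly protonated m/z ([M + H]+) to the m/z at another charge."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    neutral_mass = mh_plus - PROTON
    return (neutral_mass + charge * PROTON) / charge


# [M + H]+ values quoted in the text for the two example peptides.
peptides = {
    "AMAWEEHDE": 1376.66,
    "VAALIKEVNKAA (Lys dimethylated)": 1396.92,
}
for name, mh in peptides.items():
    positions = ", ".join(f"{z}+: {mz_at_charge(mh, z):.2f}" for z in (1, 2, 3))
    print(f"{name}: {positions}")
```

This only locates where each charge state would appear; which state dominates depends on the number of basic sites, which is the point of the comparison in Figure 2.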

RESULTS AND DISCUSSION

Optimization of the Carboxyl Protection Reaction. Esterification and amidation are among the widely used carboxyl protection methods. Since the ester product is very labile and easily hydrolyzed even under mild conditions, Schilling chose amidation and observed an acceptable protection rate of around 95%. In our initial work, we tested the same amidation protocol using BSA to evaluate its efficiency. The amidated BSA sample was submitted to trypsin digestion and then analyzed by MALDI-TOF MS. Most peptide signals were shifted by 43n Da relative to the original sample (n being the number of Asp and Glu residues in the corresponding peptide), which suggested that the amidation was successful and most Asp and Glu residues were amidated. However, as shown in Figure 1, we also observed a few signals from partially amidated peptides; e.g., for peptide RPCFSALTPDETYVPK (m/z 1895.0), both fully and partially amidated products (m/z 1981.0 and 1938.0) could be observed. The problem was more severe for peptides ECCHGDLLECADDR (m/z 1791.9) and DAFLGSFLYEYSR (m/z 1567.8), and for the former even a series of partially amidated products could be observed (m/z 1920.9 and m/z 1963.9 correspond to the products with two and one unamidated amino acid residues, respectively). Incomplete amidation was often observed for peptides containing multiple acidic amino acids. Therefore, optimizing the reaction conditions would facilitate the application of the method by increasing the efficiency within a shorter reaction time.

Different optimized amidation conditions have been established.9,33−37 All the studies suggested that the EDC concentration, pH, and reaction time are the key factors for amidation and proposed different conditions for different substrates. It is well accepted that no universal reaction conditions fulfill all requirements and that the optimal conditions vary case by case. We carried out a series of optimizations of the EDC concentration, the ratio of EDC to NHS, the pH, and the concentrations of ethanolamine and guanidine hydrochloride. As a result, the optimal buffer was determined as 4 M guanidine hydrochloride, 2 M ethanolamine, 200 mM MES, 100 mM EDC, and 50 mM NHS, pH 4.5. As shown in Figure 1, the partial amidation problem was dramatically relieved using the optimized method. We also tested the effects of reaction time (2, 4, 6, and 14 h) and temperature (25 and 37 °C). No significant improvement was observed for reaction times beyond 4 h. The temperature does not seem to be a crucial factor since no significant difference was observed. Taken together, the optimized reaction conditions were determined as follows: (1) buffer, 4 M guanidine hydrochloride, 2 M ethanolamine, 200 mM MES, 100 mM EDC, 50 mM NHS, pH 4.5; (2) time, 4 h; and (3) temperature, 25 °C. It should be mentioned that another freshly prepared 100 mM EDC was added after 2 h of reaction to compensate for the EDC loss due to hydrolysis.

Removal of Monomer from Polyallylamine. We first employed BSA to investigate the enrichment efficiency of the C-TAILS method, and the resulting MALDI-MS spectra are shown in Supplementary Figure 1 (Supporting Information). Surprisingly, the C-terminal peptide reported by Schilling was not observed in our spectra; in contrast, many new signals appeared after the polymer enrichment. The comparison of spectra acquired before and after enrichment (top and middle spectra) revealed that a 39 Da increase occurred during polymer enrichment. The +39 Da modification most likely resulted from amidation with residual allylamine, C3H7N (Mw = 57.06 Da), the main reagent for polyallylamine synthesis. After amidation with allylamine, the mass increase would be 57.06 − 18.01 = 39.05 Da, corresponding to the addition of allylamine and the loss of water. Subsequently, we introduced an ultrafiltration purification step to remove the possible low-mass oligomers from polyallylamine. The spectrum acquired using purified polyallylamine is shown as the bottom spectrum of Supplementary Figure 1 (Supporting Information). All +39 Da peaks completely disappeared, and only some previously suppressed minor peaks remained. However, the removal of monomer did not lead to the detection of the C-terminal peptide reported by Schilling. We further analyzed the BSA sequence and found that the theoretical C-terminal peptide possesses 100 amino acids with a mass higher than 11 000 Da and therefore cannot pass through the 10 kDa Microcon filter used in this study. The C-terminal peptide reported by Schilling actually resulted from a chymotryptic cleavage,9 an inherent shortcoming of regular trypsin.38 The trypsin used in this study was derivatized by methylation and is claimed to have reduced autolysis and no chymotrypsin activity, which would be a reasonable explanation. The remaining peaks detected in the bottom spectrum might come from impurities in buffers or reagents.

Acetylation of Lys Promotes the Identification Efficiency. It is well accepted that the detection of C-terminal peptides is often hampered by their low ionization efficiency, and thus we decided to test some charge-enhancing reactions to alleviate the problem. N,N-Dimethylethylenediamine was tested as the carboxyl protection reagent for this purpose since it would introduce an additional basic group after amidation. However, the application of N,N-dimethylethylenediamine did not increase the identification number of C-termini; in contrast, the identification number decreased by about 50% compared to ethanolamine. A manual check of the spectra revealed that many more highly charged signals were observed after N,N-dimethylethylenediamine treatment. The excess charges would be the main reason for the low identification number because doubly charged ions are usually favored for MS/MS analysis.39,40 We assumed that some C-terminal peptides acquired from C-TAILS could be affected in the same way if they contain multiple dimethylated Lys residues. We checked the charge state of identified C-terminal peptides containing different numbers of Lys, and it was evident that, for peptides in a similar mass range, a higher charge state was always observed for those with more Lys residues. As shown in Figure 2A,B, 3+ is the main state for the double-Lys-containing peptide VAALIKdimEVNKdimAA ([M + H]+ = 1396.92), whereas 2+ is the main state for the non-Lys-containing peptide AMAWEEHDE ([M + H]+ = 1376.66). To investigate whether the high charge state caused by internal Lys might hinder MS identification, we systematically compared the identified C-terminal peptides with the theoretical ones (on the basis of the protein data set identified by an independent shotgun analysis) in terms of Lys number. As demonstrated in Figure 3, a clear bias toward non- and mono-Lys-containing peptides was observed, which confirmed our assumption that peptides containing multiple Lys have a lower chance of being identified.

Figure 3. Proportion of C-terminal peptides containing different number of lysine residues in three data sets. Dimethylation represents the C-terminal peptides identified from the samples derivatized by dimethylation at the protein level; Theoretical represents the theoretical C-terminal peptides calculated from the protein data set identified by an independent shotgun analysis; and Acetylation represents the C-terminal peptides identified from the samples derivatized by acetylation at the protein level.
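The comparison in Figure 3 amounts to binning peptides by their number of Lys residues and reporting the proportion in each bin for every data set. A minimal sketch of that tally is given below; the binning into 0, 1, 2, and three or more Lys is an assumption about how the figure is grouped, the function names are hypothetical, and the example list is illustrative rather than taken from the supplemental tables.

```python
from collections import Counter

BINS = ("0", "1", "2", ">=3")


def lys_bin(peptide: str) -> str:
    """Assign a peptide to a bin according to its number of Lys (K) residues."""
    n = peptide.count("K")
    return str(n) if n < 3 else ">=3"


def lys_proportions(peptides: list[str]) -> dict[str, float]:
    """Fraction of peptides falling into each Lys-number bin."""
    if not peptides:
        raise ValueError("peptide list is empty")
    counts = Counter(lys_bin(p) for p in peptides)
    return {b: counts.get(b, 0) / len(peptides) for b in BINS}


# Illustrative input: the two peptides discussed in the text.
example = ["AMAWEEHDE", "VAALIKEVNKAA"]
print(lys_proportions(example))
```

Running the same function over the dimethylation, acetylation, and theoretical peptide lists would give the three distributions that the figure compares.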

To alleviate the overcharging of peptides containing multiple Lys, we employed acetylation instead of dimethylation at the protein level to neutralize the ε-amines. As shown in Figure 2B,C, acetylation greatly reduces the charge state of peptides containing multiple Lys. For VAALIKEVNKAA ([M + H]+ = 1396.92), the 3+ form is predominant after dimethylation, whereas it accounts for less than 1% after acetylation. The intensity of the 2+ form after acetylation is more than 4 times that after dimethylation. As a result, we identified 212 ± 33 C-terminal peptides using the enriched fraction from 10 μg of E. coli protein, 30.9% more than with dimethylation treatment (162 ± 15). The identification ratio increased as the number of internal Lys increased and even doubled for the C-terminal peptides containing three or more Lys. Regarding the bias against Lys number, as shown in Figure 3, the distribution of identification results after acetylation treatment was nearly identical to that of the theoretical data set; thus the bias was successfully eliminated by acetylation treatment. The detailed identification results can be found in Supplemental Tables 1 and 2 (Supporting Information). As discussed above, the charge reduction caused by acetylation improves not only the identification number but also the unbiased representation of the real C-terminome. Therefore, acetylation was utilized to protect the ε-amine of Lys, and the optimized workflow is illustrated in Figure 4.

Figure 4. Workflow of the optimized C-terminomics approach. In order to enrich C-terminal peptides for subsequent LC−MS/MS analysis, amines of protein N-termini and Lys are first protected by acetylation and subsequently carboxyl groups of protein C-termini and acidic amino acids (Asp and Glu) are protected by amidation using ethanolamine. After digestion by trypsin, the nascent amines of peptides are then protected by dimethylation. N-terminal and internal peptides are coupled to polyallylamine via their nascent carboxyl groups. Consequently, the protein C-terminal peptides are purified by ultrafiltration.

Increased ACN Content Facilitates the Peptide Recovery. In addition to the charge state, acetylation alters another important property: hydrophobicity. The increased hydrophobicity could cause sample loss due to lower solubility in aqueous solution and more nonspecific binding to tubes and the filter membrane. To alleviate this shortcoming, we tested different ACN concentrations at the purification step using polyallylamine. Three ACN concentrations, 10%, 20%, and 30%, were tested, and the detailed results can be found in Supplemental Tables 2−4 (Supporting Information). As summarized in Figure 5, a slight increase in identification number was observed as the ACN concentration increased. Application of 30% ACN resulted in the highest identification number, 232 ± 3 C-terminal peptides corresponding to 182 ± 4 C-termini.

SCII Method Facilitates the Identification of Low-Mass C-Terminal Peptides. Tryptic C-terminal peptides usually possess one fewer basic group than regular peptides, which contain a basic group at each terminus. Although trypsin works as Arg-C in this study, the resulting C-terminal peptides have the same property since the internal Lys is neutralized by acetylation. A relatively low charge state had been observed in our previous Nα-acetylation study, in which we developed the SCII method to facilitate the identification of singly charged peptides.41 Therefore, we assumed that the SCII method would also facilitate C-terminal peptide identification, especially for peptides in the low mass range. The application of the SCII method resulted in the identification of 285 ± 23 C-terminal peptides corresponding to 226 ± 17 C-termini, which were 53 and 44 more, respectively, than with the conventional method. Further analysis of the results revealed that most of the newly identified C-terminal peptides possess low masses (range of m/z 650−1500, m/z 1087 on average). The detailed identification results and the charge/mass distribution are presented in Supplemental Table 5 and Supplemental Figure 2 (Supporting Information), respectively.

Other Advantages of the Optimized Method. We also observed some other advantages of the optimized method, including better Mascot scoring and higher reproducibility. The predominant doubly charged state benefits the scoring system since it produces mostly singly charged b-ion and y-ion series. As shown in Figure 6, the fold change clearly increased for higher-confidence results. The reproducibility of the two triplicate experiments is demonstrated in Supplemental Figure 3 (Supporting Information). A total of 41.9% were identified in all three experiments and 66.2% were identified in at least two experiments; the corresponding percentages for the dimethylation experiments are 33.6% and 56.3%, respectively. Another advantage comes from the FASP method, which is adopted throughout the purification procedure.27 The FASP method allows efficient buffer displacement and is superior to precipitation methods, which cause considerable sample loss, especially for samples of low amount. Benefiting from the FASP method, we successfully identified more than 200 annotated C-termini using only 10 μg of E. coli tryptic digests.

Application to E. coli Sample Using a High Amount of Sample. Next we applied the optimized method to explore the protein C-termini using a higher amount of sample and a longer

DOI: 10.1021/acs.analchem.5b02451 Anal. Chem. 2015, 87, 10354−10361

Article

Analytical Chemistry

Figure 5. Summarized identification results of different experiments: Dim_10%ACN means that the samples were derivatized by dimethylation at the protein level and the enrichment was performed in 10% ACN. Ace_xx%ACN means that the samples were derivatized by acetylation at the protein level and the enrichment was performed in xx% ACN. Ace_30%ACN_SCII represents the same samples as Ace_30%ACN sample but analyzed using the SCII method.

high confidence of our results. Data analysis revealed that the predominant occurrence of Glu at the position next to the neoC-termini (42/58), which suggested unusual cleavage activity at the peptide bond N-terminal to Glu in E. coli. In regard to the other 16 new C-termini, we observed nine Arg, two Gly, and only one Ala, Ile, Asp, Pro, and Phe as their C-termini. The relatively high number of Arg might be ascribed to their tryptic aspect, which ensured the improved ionization efficiency and superior identification rate.24 The detailed identification results can be found in Supplemental Tables 6 and 7 (Supporting Information).



CONCLUSIONS In conclusion we developed an optimized approach on the basis of the conventional C-TAILS. The optimization has been achieved by the following procedures: (1) fine-tuning the carboxyl reaction conditions, (2) replacing the dimethylation with acetylation for Lys protection, (3) adjusting the ACN content to increase the peptide yield, and (4) introducing SCII method to facilitate the identification of C-terminal peptides in low mass range. The application of the optimized method allowed one to identify 75% more C-terminal peptides and 57% more C-termini using E. coli sample compared to the conventional C-TAILS method. Moreover our optimized method is unbiased, more reproducible, and with higher MASCOT scores. Our optimized method could be useful for confirmation of gene annotation and determination of neo-Ctermini and therefore could help discover the role of C-termini in protein functions and biological processes.

Figure 6. Comparison of the Mascot score distributions for two different treatments. Dim_10%ACN means that the samples were derivatized by dimethylation at the protein level and the enrichment was performed in 10% ACN; Ace_30%ACN_SCII means that the samples were derivatized by acetylation at the protein level, the enrichment was performed in 30% ACN, and the analysis used the SCII method. The red line represents the fold change in the number of identified C-terminal peptides for the Ace_30%ACN_SCII method over the Dim_10%ACN method.

LC gradient. An enriched fraction from 80 μg of E. coli cell lysate was analyzed by LC−MS/MS using a 4 h LC gradient. As a result, we identified 481 C-terminal peptides corresponding to 369 high-confidence protein C-termini. We also performed a search using semiArg-C and discovered 58 protein neo-C-termini corresponding to 34 proteins. Among the 34 proteins, 10 were annotated as ribosomal proteins and, strikingly, 30S ribosomal protein S6 (P02358) was found with 15 alternative C-termini, which might indicate an unusual maturation or proteolysis process for ribosomal proteins. It should be emphasized that all the new C-terminal peptides fulfill the Arg-C cleavage rule at their N-termini, which indicated the



ASSOCIATED CONTENT

Supporting Information

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.analchem.5b02451.

DOI: 10.1021/acs.analchem.5b02451 Anal. Chem. 2015, 87, 10354−10361




(25) Staes, A.; Impens, F.; Van Damme, P.; Ruttens, B.; Goethals, M.; Demol, H.; Timmerman, E.; Vandekerckhove, J.; Gevaert, K. Nat. Protoc. 2011, 6, 1130−1141.
(26) Gevaert, K.; Goethals, M.; Martens, L.; Van Damme, J.; Staes, A.; Thomas, G. R.; Vandekerckhove, J. Nat. Biotechnol. 2003, 21, 566−569.
(27) Wisniewski, J. R.; Zougman, A.; Nagaraj, N.; Mann, M. Nat. Methods 2009, 6, 359−362.
(28) Gobom, J.; Nordhoff, E.; Mirgorodskaya, E.; Ekman, R.; Roepstorff, P. J. Mass Spectrom. 1999, 34, 105−116.
(29) Olsen, J. V.; de Godoy, L. M.; Li, G.; Macek, B.; Mortensen, P.; Pesch, R.; Makarov, A.; Lange, O.; Horning, S.; Mann, M. Mol. Cell. Proteomics 2005, 4, 2010−2021.
(30) Perkins, D. N.; Pappin, D. J. C.; Creasy, D. M.; Cottrell, J. S. Electrophoresis 1999, 20, 3551−3567.
(31) Vizcaino, J. A.; Cote, R. G.; Csordas, A.; Dianes, J. A.; Fabregat, A.; Foster, J. M.; Griss, J.; Alpi, E.; Birim, M.; Contell, J.; O’Kelly, G.; Schoenegger, A.; Ovelleiro, D.; Perez-Riverol, Y.; Reisinger, F.; Rios, D.; Wang, R.; Hermjakob, H. Nucleic Acids Res. 2013, 41, D1063−D1069.
(32) Vizcaino, J. A.; Deutsch, E. W.; Wang, R.; Csordas, A.; Reisinger, F.; Rios, D.; Dianes, J. A.; Sun, Z.; Farrah, T.; Bandeira, N.; Binz, P. A.; Xenarios, I.; Eisenacher, M.; Mayer, G.; Gatto, L.; Campos, A.; Chalkley, R. J.; Kraus, H. J.; Albar, J. P.; Martinez-Bartolome, S.; Apweiler, R.; Omenn, G. S.; Martens, L.; Jones, A. R.; Hermjakob, H. Nat. Biotechnol. 2014, 32, 223−226.
(33) Sehgal, D.; Vijay, I. K. Anal. Biochem. 1994, 218, 87−91.
(34) Staros, J. V.; Wright, R. W.; Swingle, D. M. Anal. Biochem. 1986, 156, 220−222.
(35) D’Este, M.; Eglin, D.; Alini, M. Carbohydr. Polym. 2014, 108, 239−246.
(36) Park, C.; Vo, C. L.; Kang, T.; Oh, E.; Lee, B. J. Eur. J. Pharm. Biopharm. 2015, 89, 365−373.
(37) Nakajima, N.; Ikada, Y. Bioconjugate Chem. 1995, 6, 123−130.
(38) Keil-Dlouha, V. V.; Zylber, N.; Imhoff, J.; Tong, N.; Keil, B. FEBS Lett. 1971, 16, 291−295.
(39) Tabb, D. L.; Smith, L. L.; Breci, L. A.; Wysocki, V. H.; Lin, D.; Yates, J. R., 3rd. Anal. Chem. 2003, 75, 1155−1163.
(40) Cheng, X. L.; Wei, F.; Chen, J.; Li, M. H.; Zhang, L.; Zhao, Y. Y.; Xiao, X. Y.; Ma, S. C.; Lin, R. C. J. Anal. Methods Chem. 2014, 2014, 764397.
(41) Zhang, X.; Ye, J.; Engholm-Keller, K.; Hojrup, P. Proteomics 2011, 11, 81−93.

Mass spectra of purification of polyallylamine, charge and mass distribution of peptide mass comparison for the SCII and conventional methods, and Venn diagrams of the identified C-terminal peptides (PDF)
Tables of detailed identification results (XLSX)

AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected] or zhangxumin@hotmail.com. Phone: +86 21 51630575.

Notes

The authors declare no competing financial interest.



ACKNOWLEDGMENTS

This work was supported by the National Natural Science Foundation of China (Grants 31470806 and 31300264), start-up funding for Xumin Zhang from Fudan University, and the Research Fund of the State Key Laboratory of Genetic Engineering, Fudan University.



REFERENCES

(1) Chung, J. J.; Shikano, S.; Hanyu, Y.; Li, M. Trends Cell Biol. 2002, 12, 146−150.
(2) Selkoe, D. J. Trends Cell Biol. 1998, 8, 447−453.
(3) Mundt, M.; Hupp, T.; Fritsche, M.; Merkle, C.; Hansen, S.; Lane, D.; Groner, B. Oncogene 1997, 15, 237−244.
(4) Jacob, E.; Unger, R. Bioinformatics 2007, 23, e225−230.
(5) Capitani, M.; Sallese, M. FEBS Lett. 2009, 583, 3863−3871.
(6) Ye, F.; Zhang, M. Biochem. J. 2013, 455, 1−14.
(7) Stricker, N. L.; Christopherson, K. S.; Yi, B. A.; Schatz, P. J.; Raab, R. W.; Dawes, G.; Bassett, D. E., Jr.; Bredt, D. S.; Li, M. Nat. Biotechnol. 1997, 15, 336−342.
(8) Moerman, P. P.; Sergeant, K.; Debyser, G.; Devreese, B.; Samyn, B. J. Proteomics 2010, 73, 1454−1460.
(9) Schilling, O.; Barre, O.; Huesgen, P. F.; Overall, C. M. Nat. Methods 2010, 7, 508−511.
(10) Olsen, J. V.; Ong, S. E.; Mann, M. Mol. Cell. Proteomics 2004, 3, 608−614.
(11) Burkhart, J. M.; Schumbrutzki, C.; Wortelkamp, S.; Sickmann, A.; Zahedi, R. P. J. Proteomics 2012, 75, 1454−1462.
(12) Chait, B. T.; Wang, R.; Beavis, R. C.; Kent, S. B. Science 1993, 262, 89−92.
(13) Zhong, H.; Zhang, Y.; Wen, Z.; Li, L. Nat. Biotechnol. 2004, 22, 1291−1296.
(14) Kishimoto, T.; Kondo, J.; Takai-Igarashi, T.; Tanaka, H. Proteomics 2011, 11, 485−489.
(15) Xu, G.; Shin, S. B.; Jaffrey, S. R. ACS Chem. Biol. 2011, 6, 1015−1020.
(16) Kosaka, T.; Takazawa, T.; Nakamura, T. Anal. Chem. 2000, 72, 1179−1185.
(17) McDonald, L.; Robertson, D. H.; Hurst, J. L.; Beynon, R. J. Nat. Methods 2005, 2, 955−957.
(18) McDonald, L.; Beynon, R. J. Nat. Protoc. 2006, 1, 1790−1798.
(19) Impens, F.; Colaert, N.; Helsens, K.; Plasman, K.; Van Damme, P.; Vandekerckhove, J.; Gevaert, K. Proteomics 2010, 10, 1284−1296.
(20) Kuyama, H.; Shima, K.; Sonomura, K.; Yamaguchi, M.; Ando, E.; Nishimura, O.; Tsunasawa, S. Proteomics 2008, 8, 1539−1550.
(21) Sechi, S.; Chait, B. T. Anal. Chem. 2000, 72, 3374−3378.
(22) Van Damme, P.; Staes, A.; Bronsoms, S.; Helsens, K.; Colaert, N.; Timmerman, E.; Aviles, F. X.; Vandekerckhove, J.; Gevaert, K. Nat. Methods 2010, 7, 512−515.
(23) Schilling, O.; Huesgen, P. F.; Barre, O.; Overall, C. M. Methods Mol. Biol. 2011, 781, 59−69.
(24) Huesgen, P. F.; Lange, P. F.; Rogers, L. D.; Solis, N.; Eckhard, U.; Kleifeld, O.; Goulas, T.; Gomis-Ruth, F. X.; Overall, C. M. Nat. Methods 2015, 12, 55−58.
