Approach for Identification and Quantification of C-Terminal Peptides

Article pubs.acs.org/ac

Approach for Identification and Quantification of C‑Terminal Peptides: Incorporation of Isotopic Arginine Labeling Based on Oxazolone Chemistry

Minbo Liu,† Lijuan Zhang,‡ Lei Zhang,‡ Jun Yao,‡ Pengyuan Yang,†,‡ and Haojie Lu*,†,‡

† Shanghai Cancer Centre and Department of Chemistry, Fudan University, Shanghai 200032, P. R. China
‡ Institutes of Biomedical Sciences, Fudan University, Shanghai 200433, P. R. China

Supporting Information

ABSTRACT: C-termini of proteins often play an important role in various biological processes. The determination of the protein C-terminus is crucial because it provides not only distinct functional annotation but also a way to monitor proteolysis-modified proteins. In this study, an isotopic labeling approach based on oxazolone chemistry was developed to achieve the identification and quantification of C-termini. Aminolysis reagents such as arginine selectively react with the α-carboxyl group at the peptide C-terminus via an oxazolone-like intermediate. Side chain carboxyl groups do not participate in this reaction. When an isotopic mixture consisting of 50% arginine (0Arg) and 50% 13C6-arginine (6Arg) was reacted with the C-terminus of a protein, followed by proteolysis, the C-terminal peptide could be directly recognized in the mass spectrum by its unique pair of isotopic peaks, and the sequence could be interpreted in MS2. In addition, the incorporation of an additional basic amino acid in the C-terminal peptide greatly enhanced the signal intensity for C-termini detection. Moreover, the isotopic arginine labeling strategy could be applied for relative C-termini quantitation. Our method showed an excellent correlation of the measured ratios to theoretical ratios and high reproducibility within 2 orders of magnitude of the dynamic range. The correlation coefficients (R²) were higher than 0.99, with the coefficients of variation (CVs) ranging from 1.16 to 10.91%. Finally, the approach was used to analyze the C-termini from Thermoanaerobacter tengcongensis cultured at different temperatures. As a result, 68 C-termini were identified, and 53 of them were quantified using our strategy. In addition, 24 neo-C-terminal peptides were discovered.

Received: June 3, 2013. Accepted: October 22, 2013. Published: October 22, 2013.
dx.doi.org/10.1021/ac401647m | Anal. Chem. 2013, 85, 10745−10753
© 2013 American Chemical Society

Distinction and identification of the protein C-terminus is necessary for comprehensively understanding a protein and describing its biological functions because the C-terminus contains valuable information on various biological processes and post-translational modifications.1,2 For example, the aberrant C-terminal sequence of the human amyloid β-protein played a key role in the pathogenesis of Alzheimer’s disease.3 A conserved tripeptide from the C-terminus of firefly luciferase proteins serves as a peroxisomal targeting signal.4 C-terminal sequencing can also help reveal ragged ends, which are generated by proteolytic processes.5 In addition, characterization of the C-terminus can be crucial for estimating the purity of proteins for therapeutic product registration.6 However, the identification of the C-terminus is quite challenging. Whereas Edman degradation is the benchmark technique for N-terminus sequencing,7 the lack of an efficient approach for C-terminus identification is a major bottleneck in this field. Alkylation followed by truncation of the C-terminus through a cleavage reaction with isothiocyanate was available as a complement to Edman degradation, but the so-called alkylation chemistry was not as efficient as the Edman reaction due to its modest sensitivity (20−100 pmol) and low repetitive yields (three to five amino acids worth of information).8,9

In the past decades, matrix-assisted laser desorption/ionization mass spectrometry (MALDI-MS) and electrospray ionization mass spectrometry (ESI-MS) combined with tandem mass spectrometry (MS/MS) have been routinely used for peptide identification in proteomics.10 Several approaches based on mass spectrometry have been proposed for C-terminal sequence analysis of proteins and peptides. To characterize the C-termini of intact proteins, the “top-down” approach presents attractive advantages in that a total characterization of a target protein, including its amino acid sequence and post-translational modifications, can be obtained.11 However, it is limited by low efficiency for large proteins and by low-throughput profiling. The majority of current approaches for C-terminal sequencing are based on the bottom-up strategy, including carboxypeptidase ladder sequencing,12,13 anhydrotrypsin isolation,14 a combination of Lys-C digestion and amine capture,15 enzymatic labeling of protein C-termini,16 and comparison of Lys-C and Lys-N digestion.17 However, these approaches are not suitable for C-terminus profiling from complex samples. Recently, Overall and colleagues developed a method named C-terminal amine-based isotope labeling of substrates (C-TAILS), which can be used to identify new C-termini of proteolysis-modified proteins (neo-C-termini) and protease substrates.5 The efficiency of C-termini detection is still limited by the completeness of the derivatization of both amino and carboxyl groups on proteins. Moreover, in the C-TAILS approach, the study of neo-C-terminal peptides depended on the comparison between sample mixtures with or without derivatization. When the samples were labeled in parallel, the individual differences from the sample handling process should be minimized.

One reason for the difficulty in selectively labeling the protein C-terminus is that the carboxyl groups of aspartic and glutamic acids exhibit similar reactivity, and these residues occur very frequently in proteins. Oxazolone-based chemistry is regarded as one of the few effective approaches to discriminate between the carboxyl groups at the C-terminus and the side chains.18 The reactions consist of dehydration with formic acid and acetic anhydride to form an oxazolone, followed by aminolysis with an amine reagent. Various functional groups were derivatized to enhance the response of C-terminal peptides for sequencing using mass spectrometry.18−21

Nowadays, isotope coding-based strategies have been developed and widely used for quantitative proteomics.22,23 Several samples can be labeled in parallel with chemically identical labeling reagents that differ in stable isotope content. The relative quantification can then be realized by comparing the signals arising from the same protein or peptides carrying the heavy or light label. The isotope strategy can also be used for identifying the functional process of proteins, which present unique patterns in mass spectra.24,25 Several approaches based on isotope labeling were applied to detect the C-terminal peptide of proteins because the mass spectrometric pattern of the C-terminal peptide is different from that of the internal peptides after isotopic coding.6,26,27 For example, when the proteins were digested in a buffer solution containing 50% 18O-labeled water, the proteolytic peptides except the C-terminal peptide would be labeled with either 16O or 18O, resulting in a mixture of 16O/18O labeled peptides. The single peak of the C-terminal peptide could then be easily recognized from the mass spectrum.26

Here, we present a novel isotopic labeling strategy for the identification and relative quantification of C-termini from complex biological samples. This strategy combines the oxazolone-based chemistry and dual-isotopic arginine labeling. As a result, arginine is specifically incorporated into the C-termini of proteins. After protease cleavage, the C-terminal peptide derivatized with isotopic arginine presents a pattern in the mass spectrum that discriminates it from the internal peptides. The pattern, which consists of an isotopic pair of peaks with a fixed mass difference, can be detected directly to recognize the C-terminal peptide of proteins, and the sequence can be interpreted in MS2. Moreover, the dual-isotope strategy is proposed for relative quantitation of C-terminal peptides. When the samples isotopically labeled in parallel are combined and analyzed by MS, the ratios of signal intensities can be measured to determine the relative content of the C-terminal peptides. The results showed that our approach provides good linearity with high reproducibility within 2 orders of magnitude of the dynamic range. The correlation coefficients (R²) were higher than 0.99, with the coefficients of variation (CVs) ranging from 1.16 to 10.91%. This method was then applied to quantify the C-termini expression changes from Thermoanaerobacter tengcongensis cultured at different temperatures. A total of 68 C-terminal peptides were identified, and 53 of them were quantified. In addition, 24 internal peptides were identified as neo-C-terminal peptides, which were probably generated during biological processes.
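The paired-peak signature described above lends itself to a simple automated screen. The following Python sketch is illustrative only and is not part of the published workflow: it scans a centroided peak list for two peaks that are separated by the 13C6 mass difference divided by the charge state and that have comparable intensities. The function name, the tolerances, and the example peak list (built around the light-labeled myoglobin peptide at m/z 962.52, with a heavy partner assumed about 6.02 higher) are assumptions chosen for illustration.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)

# Mass difference between 13C6-arginine and unlabeled arginine:
# six 13C-for-12C substitutions, about 6.0201 Da (standard isotope masses).
DELTA_13C6 = 6 * 1.003355


def find_isotopic_pairs(
    peaks: List[Peak],
    charge: int = 1,
    mz_tol: float = 0.02,    # assumed m/z tolerance for the pair spacing
    max_fold: float = 1.5,   # assumed limit on intensity imbalance for a 1:1 label mix
) -> List[Tuple[Peak, Peak]]:
    """Return (light, heavy) peak pairs spaced by DELTA_13C6 / charge."""
    spacing = DELTA_13C6 / charge
    ordered = sorted(peaks)
    pairs = []
    for i, (mz_l, int_l) in enumerate(ordered):
        for mz_h, int_h in ordered[i + 1:]:
            gap = mz_h - mz_l
            if gap > spacing + mz_tol:
                break  # peaks are sorted, so later peaks are even farther away
            if abs(gap - spacing) > mz_tol or min(int_l, int_h) <= 0:
                continue
            if max(int_l, int_h) / min(int_l, int_h) <= max_fold:
                pairs.append(((mz_l, int_l), (mz_h, int_h)))
    return pairs


# Hypothetical peak list: only the first two peaks form a 6 Da pair.
example_peaks = [(962.52, 1700.0), (968.54, 1650.0), (1271.66, 900.0), (1502.67, 400.0)]
pairs = find_isotopic_pairs(example_peaks)
print(pairs)
```

The intensity filter only makes sense for identification with a 1:1 light/heavy mixture; for relative quantitation between two samples the ratio itself is the quantity of interest, so the filter would be relaxed or removed.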



EXPERIMENTAL SECTION

Materials and Chemicals. Myoglobin (horse), cytochrome C (horse), bovine serum albumin (BSA), dithiothreitol (DTT), iodoacetamide (IAA), pentafluorophenol (PfpOH), arginine (0Arg), α-chymotrypsin, ammonium bicarbonate, trifluoroacetic acid (TFA), and α-cyano-4-hydroxycinnamic acid (CHCA) were purchased from Sigma (St. Louis, MO). Formic acid and triethylamine were obtained from Tedia (Fairfield, OH). Acetic anhydride was purchased from Sinopharm Chemical Reagent Co., Ltd. (Shanghai, China). All synthetic peptides (95%) were obtained from Chinese Peptides Co., Ltd. (Hangzhou, China). 13C6 L-arginine:HCl (6Arg) was provided by Cambridge Isotope Laboratories (Andover, MA). All the reagents were used without further purification.

Protein Extraction from Thermoanaerobacter tengcongensis. Thermoanaerobacter tengcongensis (TTE) was cultured in the media as previously described at two different temperatures, 55 and 75 °C.28 The confluent cells were washed twice with phosphate-buffered saline (PBS) after centrifugation at 4000g. Three standard proteins, myoglobin, cytochrome C, and BSA, were chosen as internal standards and spiked into the cell pellet in order to monitor degradation during sample processing. Then the cell pellet was resuspended in 200 μL of lysis buffer (150 mM Tris-HCl, pH 7.8) and sonicated on ice to disrupt the cells. The proteins were reduced with 10 mM DTT at 56 °C for 30 min and then alkylated by addition of 55 mM IAA for 30 min at room temperature in the dark. For protein precipitation, 600 μL of ice-cold acetone was added to the solution, followed by incubation at −20 °C overnight. The proteins were pelleted by centrifugation at 12 000g for 30 min at 4 °C and stored at −20 °C for further use.

C-Terminal Arginine Labeling of Peptide and Protein. Each standard peptide (2 μg) or protein (10 μg) was dissolved in a mixture of acetic anhydride (100 μL) and formic acid (100 μL) together with pentafluorophenol (PfpOH; 100 μmol) and then incubated at 60 °C for 30 min. After the solvents were removed by vacuum centrifugation, this procedure was repeated twice. Then, 20 μL of aqueous labeling reagent, either arginine or an isotope mixture consisting of 50% 0Arg and 50% 6Arg (50 mM), with 0.5 μL of triethylamine was added to dissolve the activated sample. The reaction was allowed to proceed for 2 h at room temperature before the mixture was evaporated to terminate the derivatization reaction. Proteins were then subjected to enzymatic digestion. The arginine-labeled protein was redissolved in 100 μL of ammonium bicarbonate buffer (25 mM, pH 8.0) and digested by proteolytic enzyme at 37 °C with a substrate/enzyme ratio of 40:1 (w/w). Both the peptides and the protein digests were desalted on C18 ZipTip pipette tips (Millipore, Billerica, MA) before mass spectrometry analysis.

MALDI-TOF Mass Spectrometry Analysis. Standard samples were analyzed by an Applied Biosystems 5800 Proteomics Analyzer. The sample solution (0.5 μL) was spotted onto a MALDI target and air-dried, followed by addition of 0.5 μL of matrix solution (0.6 mg/mL of CHCA in 50% CH3CN/0.1% TFA). Spectra were acquired in positive ion reflector mode, and the spectrum of each spot was obtained by accumulation of 2000 laser shots. The acquired mass spectra were interpreted manually using Data Explorer V4.5 (Applied Biosystems).

HPLC-ESI-MS/MS Analysis. The derivatized TTE digest was dissolved in 0.1% formic acid solution and analyzed by nanoLC-ESI-MS/MS. The experiments were performed on an HPLC system composed of two LC-20AD nanoflow LC pumps and one LC-20AB microflow LC pump (all from Shimadzu Corporation, Tokyo, Japan) connected to an LTQ-Orbitrap mass spectrometer (Thermo Electron Corporation, San Jose, CA). The sample was injected via an SIL-20 AC autosampler (Shimadzu Corporation, Tokyo, Japan) and loaded onto a CAPTRAP column (0.5 × 2 mm, MICHROM Bioresources, Inc., Auburn, CA) for 5 min at a flow rate of 60 μL/min. The sample was subsequently separated by a PICOFRIT C18 reverse-phase column (0.1 × 150 mm, New Objective, Inc., Woburn, MA) at a flow rate of 300 nL/min. The mobile phases consisted of 2% acetonitrile with 0.1% formic acid (phase A) and 95% acetonitrile with 0.1% formic acid (phase B). To achieve proper separation, a 90 min linear gradient from 5 to 45% phase B was employed. The separated sample was introduced into the mass spectrometer via a 15 μm silica tip (New Objective, Inc., Woburn, MA) adapted to a DYNAMIC nanoelectrospray source (Thermo Electron Corporation, San Jose, CA). The spray voltage was set at 1.8 kV and the heated capillary at 210 °C. The mass spectrometer was operated in data-dependent mode, and each duty cycle consisted of one full-MS survey scan over the range m/z 350−1800 with a resolving power of 60 000 using the Orbitrap section, followed by MS2 experiments for the 10 strongest peaks using the LTQ section. The AGC target values for full-MS and MS/MS were 500 000 and 10 000, respectively. Peptides were fragmented in the LTQ section using collision-induced dissociation with helium, the normalized collision energy was set at 35%, and previously fragmented peptides were excluded for 30 s.

Data Analysis. The raw data were initially converted into MGF format with MM File conversion software (Version 3.9). For identification and quantification, all MS/MS spectra were searched with the Mascot 2.3 database search engine (Matrix Science, Boston, MA) against the Thermoanaerobacter tengcongensis database from NCBI. Carbamidomethylation on cysteine, formylation on lysine, and light-labeled arginine on protein C-termini were set as fixed modifications, whereas oxidation on methionine and heavy-labeled arginine (6 Da) on protein C-termini were set as variable modifications. Mass tolerance was set to 20 ppm for the precursor and 1 Da for the fragment ions. Only first-rank peptides with expectation values below 0.05 were regarded as positively identified. For quantification, the fold change of each C-terminal peptide was calculated as the average ratio of its isotope pairs.
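The search modifications listed above correspond to fixed mass shifts, and the arithmetic can be sketched as follows. This Python snippet is an illustration under stated assumptions, not the authors' Mascot configuration: the monoisotopic constants are standard values rather than numbers taken from the paper, and the helper names `labeled_mz` and `within_ppm` are hypothetical. The example input (m/z 778.41 with one formylated lysine, observed at m/z 962.52 after light labeling) is the myoglobin C-terminal peptide reported in the Results; those values come from MALDI spectra quoted to two decimals, so the ppm comparison only demonstrates the calculation.

```python
# Standard monoisotopic values (assumed here, not quoted from the paper).
PROTON = 1.007276                      # mass of a proton, Da
FORMYL = 27.994915                     # CO added to an amine (formylation)
ARG_LIGHT = 156.101111                 # arginine residue C6H12N4O (Arg minus H2O)
ARG_HEAVY = ARG_LIGHT + 6 * 1.003355   # 13C6-arginine residue


def labeled_mz(unlabeled_mh: float, n_formyl: int,
               heavy: bool = False, charge: int = 1) -> float:
    """m/z of the labeled peptide, given the [M+H]+ m/z of the unlabeled one."""
    arg = ARG_HEAVY if heavy else ARG_LIGHT
    neutral = unlabeled_mh - PROTON + n_formyl * FORMYL + arg
    return (neutral + charge * PROTON) / charge


def within_ppm(observed: float, theoretical: float, tol_ppm: float = 20.0) -> bool:
    """True if the observed m/z lies within tol_ppm of the theoretical m/z."""
    return abs(observed - theoretical) / theoretical * 1e6 <= tol_ppm


light = labeled_mz(778.41, n_formyl=1)
heavy = labeled_mz(778.41, n_formyl=1, heavy=True)
print(f"light {light:.4f}, heavy {heavy:.4f}, spacing {heavy - light:.4f}")
print("observed 962.52 within 20 ppm:", within_ppm(962.52, light))
```

For a doubly charged ion the same light/heavy pair is spaced by about 3.01 in m/z, which is why the charge state has to be taken into account when pairs are matched.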

Scheme 1. Schematic of Arginine Labeling on the Carboxyl Groups of C-Terminus Based on Oxazolone Chemistry

intermediate readily reacts with PfpOH to convert to a relatively stable ester, which could be replaced by an aminolysis reagent such as arginine. To evaluate the labeling efficiency and specificity of C-terminal arginine labeling, a standard peptide with sequence of VVLQSKELLNSIGFS, which contained both α-carboxyl group and a side chain carboxyl group, was selected as the model peptide. Meanwhile, the peptide also possesses other potential active sites such as an α-amino group at the N-terminus, ε-amino, and hydroxyl groups at the side chain to track the potential side reactions. MALDI-TOF MS was used for the analysis of the arginine-derivatized peptide. As shown in Figure 1a, the ion peak of the original peptide was observed at m/z 1633.92. After labeling, a predominant peak with the mass shift of 212.22 was observed (Figure 1b), representing the derivatized peptide with two formylation sites (27.99 Da) and one arginine labeling (156.10 Da). The ε-amino group from the lysine residue and the α-amino group from the N-terminus could be formylated in the presence of formic acid and acetic anhydride.29 A slight dehydration peak accompanied by derivation was also detected. Neither unmodified peptide nor multisite modification of the peptide was observed after the reaction. The modification site was further confirmed by MS/MS (shown in Figure 1c). A series of continuous b and y ions pattern indicated that the internal glutamic acid was not derivatized. The strong fragment ions of m/z 175.14 which represents [Arg + H]+ was detected, illustrating that one single arginine was added to the C-terminus. Furthermore, a mass shift of 156 Da was observed for y ions rather than b ions, which was consistent with the fact that arginine specifically reacted with the α-carboxyl group at the Cterminus instead of that on the side chain. Another two peptides, SASLHLPK and LSPIYNLVPVK, were also synthesized to investigate the labeling efficiency (data not shown). 
In the presence of the amino group at the C-terminus, we found that the derivatization with arginine was complete. These results demonstrated the high specificity for arginine addition to the free α-carboxyl group and suggested that the modification can be driven to completion. Recognition of the C-Terminus of Proteins by Combination of Isotopic Arginine Labeling. The oxazolone-based labeling of arginine was also performed at the intact protein level. To identify the C-terminal peptide, a combination of 50% normal arginine (0Arg) and 50% 13C6-arginine (6Arg) was used as the derivatization reagent. After being labeled with dual-isotopic arginine, the protein was digested by protease and then analyzed by MALDI-TOF mass spectrometry. A pair of



RESULTS AND DISCUSSION Specific Labeling of C-terminal α-Carboxyl Group with Arginine. To our knowledge, the oxazolone-based chemistry is one of the few reactive intermediates that derive solely from the C-terminal α-carboxyl group of protein.18 As shown in Scheme 1, in the presence of formic acid and acetic anhydride, an oxazolone ring is formed with an α-carboxyl group from the protein Cterminus instead of a carboxyl from the side chain of acidic amino acids (aspartic acid or glutamic acid). The oxazolone-related 10747

dx.doi.org/10.1021/ac401647m | Anal. Chem. 2013, 85, 10745−10753

Analytical Chemistry

Article

Figure 1. MALDI mass spectra of standard peptide VVLQSKELLNSIGFS: (a) underivatized peptide, (b) arginine-labeled peptide, and (c) MS/MS analysis of derivatized peptide. Asterisks represent the formylated amino group. Pound represents the fragment ion of [Arg + H]+.

illustrating that the peptide was reacted with the 1:1 0Arg/6Arg isotope reagent. This kind of isotope peak pattern can be easily distinguished, as only the C-terminal peptides could react with the arginine isotope using the oxazolone-based strategy. The sequences of derivatized C-terminal peptides were further identified by tandem mass spectrometry. For myoglobin, the light labeled peptide with m/z 962.52 was subjected to MS/MS analysis (Figure 2b) followed by searching against a database. The amino acid sequence was identified as Y.KELGFQG.-, which corresponds with the theoretical C-terminal peptide of myoglobin. The mass shift of 184.11 Da between theoretical peptide mass (m/z 778.41) and observed mass (m/z 962.52) indicated the formylated lysine residue and an arginine-labeled carboxyl group. A series of sequential b and y ions confirmed that the α-carboxyl group from glycine instead of the side chain of glutamic acid was specifically derivatized. A similar result for cytochrome C was shown in Figure 2e. The sequence of the lightlabeled peptide (m/z 1015.60) was determined as Y.LKKATNE.- by MS/MS analysis, indicating that the

peaks with mass difference of 6 Da could be detected in the mass spectrum, which indicated that the C-terminal peptides have been successfully labeled with the light and heavy arginine isotope. The feasibility and reliability of our dual-isotope strategy for Cterminus identification was investigated. Myoglobin and cytochrome C were used in this study as the model samples. According to our method, lysine would be completely formylated in the derivatization step, thus preventing the cleavage by trypsin. In this case, chymotrypsin was chosen as the enzyme for protein digestion in order to generate the C-terminal peptides with appropriate length for MS analysis. The chymotryptic digestions were desalted with Ziptip C18 before MALDI-TOF-MS analysis. The mass spectra of the digestion of myoglobin (a) and cytochrome C (d) are shown in Figure 2, and a summary of the peptide mass fingerprinting of myoglobin and cytochrome C was provided in Table S1. A pair of peaks with mass difference of 6 Da was detected in each mass spectrum, and the inset figures show an enlarged part of the dual peaks with equal signal intensity, 10748

dx.doi.org/10.1021/ac401647m | Anal. Chem. 2013, 85, 10745−10753

Analytical Chemistry

Article

Figure 2. MALDI mass spectra of standard proteins myoglobin (a−c) and cytochrome C (d−f). (a,d) Chymotryptic digestion of proteins after dualisotopic arginine labeling. Insets represent the enlarged pattern of paired peaks. (b,e) MS/MS spectra of light-labeled peptide. (c,f) Chymotryptic digestion of underivatized proteins. Insets represent the original C-terminal peptides. Asterisks represent the formylated amino groups. Pound represents the fragment ion of [Arg + H]+.

significantly improve the ionization efficiencies of C-terminal peptides because of the newly incorporating basic residue. It is known that the C-terminal peptides are difficult to detect sometimes, due to the lack of a basic amino acid, especially when being compared to those internal peptides derived from tryptic digestion. As shown in Figure 2, after chymotryptic digestion, the nonderivatized C-terminus of myoglobin (Figure 2c, Y.KELGFQG.-, m/z 778.41) and cytochrome C (Figure 2f, Y.LKKATNE.-, m/z 803.46) were almost invisible. The absolute intensity of each peak was only 183 (Y.KELGFQG.-) and 801 (Y.LKKATNE.-), which was insufficient for MS/MS sequencing. In contrast, a great enhancement of signal intensity for each Cterminal peptide has been observed after arginine labeling, as the absolute intensity was 1728 for Y.KELGFQG.- and 1357 for Y.LKKATNE.-. Third, our method is compatible with different proteases. Several approaches of C-terminal identification need a specific enzyme for protein digestion. For example, in the carboxypeptidase ladder sequencing approach, the internal

arginine-derivatized C-terminus of cytochrome C with two formylated lysines was identified. Several unique advantages of dual-isotopic arginine labeling for C-terminal peptide detection and sequencing are described. First, compared to other isotope coding strategies, our method represents a more clear and distinct pattern to determine the Cterminal peptide.26,27 Take the 50% 18O-labeling approach as an example; although the procedure may be shortened when using the 50% 18O-labeled digestion method, all the internal proteolytic peptides would be accompanied by their 18O-labeled counterparts, thus doubling the complexity of mass spectrum, which means that much more time is needed to interpret the spectra. Our method depicts simple and unique dual peaks for one protein to facilitate the recognition of its C-terminus. Moreover, the mass difference generated by isotopic arginine incorporation is 6 Da, which is a minor overlap for deconvolution, allowing accurate measurement of peptide identification. Second, the arginine-derivatized approach can 10749

dx.doi.org/10.1021/ac401647m | Anal. Chem. 2013, 85, 10745−10753

Analytical Chemistry

Article

Figure 3. Linearity of arginine-labeled quantitation for three standard peptides (a−c) and two standard proteins (d,e). Ratios are plotted by their theoretical values on the x-axis and their measured values on the y-axis. (a) SASLHLPK, (b) VVLQSKELLNSIGFS, (c) LSPIYNLVPVK, (d) KELGFQG from myoglobin, and (e) LKKATNE from cytochrome C.

reproducibility, which are the prerequisites for accurate relative C-terminus quantification. Identification and Quantification of C-Termini from Thermoanaerobacter tengcongensis Using Dual-Isotopic Arginine Labeling. Thermoanaerobacter tengcongensis (TTE), a thermophilic bacterium which was first discovered in China, has its own temperature-dependent protein expression characteristics.30 We examined the fold changes of C-termini between the TTE, which were cultured under different temperatures, 55 and 75 °C, respectively, to evaluate the practicality of dual-isotopic arginine labeling strategy in complex biological samples. TTE55 was labeled with the light arginine isotope at the protein level, while TTE75 was treated with the heavy isotope. Two types of samples were mixed at an equal ratio and digested with trypsin for MS analysis. As lysines have been blocked by formylation completely, Arg-C cleavage was set up for database searching, and one missed cleavage was allowed. C-terminal peptides could be identified by searching against the TTE database from NCBI and validated by the pattern of isotopic pair of peaks in MS spectra manually. The relative quantitation of C-termini was realized by the average ratio of the isotope pairs. A total of 68 C-terminal peptides were identified confidently in three replicate analyses. Fifty-three of them were detected from both TTE55 and TTE75, while 12 C-termini were found in TTE55 and 3 C-termini in TTE75, respectively (Table S2 in Supporting Information). By comparison, TTE was identified using the traditional strategy. The protein extraction was mixed by TTE55 and TTE75 in a ratio of 1:1, then reduced, alkylated, and digested by trypsin overnight. After the samples were desalted using the C18 Ziptip, the peptide digestion was subjected to LC-ESI-MS/MS analysis. As a result, only 36 Cterminal peptides have been identified. A sharp increase of Ctermini identification was achieved by introducing arginine labeling. 
The newly introduced basic residues largely enhanced the intensity of the response signal, which aided the identification of C-terminal peptides. Moreover, the oxazolone-based strategy seems universal and unbiased and was compatible with almost all amino acids on protein C-termini. Among the 68 C-terminal peptides, 17 different types of amino acid were identified as C-

fragments ending with homoserine lactone, which were restricted from carboxypeptidase degradation, should be generated by cyanogens bromide (CNBr).12 The strategy of Lys-C digestion and amine capture relies on Lys-C to provide lysine-containing internal peptides.15 Comparisons between LysC and Lys-N digestion also require a combination of Lys-C and Lys-N proteolysis.17 In our method, a proper protease could be chosen to obtain C-terminal peptide with an appropriate length for sequencing. Relative C-Terminus Quantification by Dual-Isotopic Arginine Labeling. The relative quantification focused on Cterminal peptides of proteins was also addressed by using our strategy, thus the expression level of proteins in different situations could be reflected. In this case, the same proteins from different samples were first labeled with a light or heavy arginine isotope. After terminating the reaction by evaporation, samples were mixed together and then digested by protease. The Cterminal peptides could be distinguished in the MS analysis, and the relative quantification could be achieved by comparing their ion abundance. The linearity and reproducibility of dual-isotopic labeling for relative quantitation of C-terminal peptides were evaluated using three standard peptides (SASLHLPK, VVLQSKELLNSIGFS, and LSPIYNLVPVK) and two proteins (myoglobin and cytochrome C) as the model samples. The 0Arg- and 6Arglabeled analytes were combined at various proportions of 1:10, 1:5, 1:4, 1:2, 1:1, 2:1, 4:1, 5:1, and 10:1 and then analyzed by MS with nine replicates for each ratio. For MALDI-MS analysis, the relative ratios were calculated on the basis of the signal intensities of monoisotopic peaks. As shown in Figure 3, the measured ratios of the light- and heavy-isotopic pairs were consistent with the expected values. 
The dual logarithmic plots between measured ratio and theoretical ratio represented a good linear correspondence in 2 orders of magnitude in the dynamic range, with the correlation coefficients (R2) higher than 0.99 and the coefficients of variation (CVs) ranging from 1.16 to 10.91%. These results demonstrated that the dual isotopic arginine labeling approach could provide a linear response and high 10750

dx.doi.org/10.1021/ac401647m | Anal. Chem. 2013, 85, 10745−10753

Analytical Chemistry

Article

Table 1. Changes of Expression Level of C-Terminal Peptides between TTE55 and TTE75 peptide sequence

charge state

m/z light

m/z heavy

ratio light/heavy

protein accession

ESDILAIIE DSVKSK KEDKKED KAPQFSKR SKKLKDFLD KEGIINN KLEGKF DYFMTAEEAKTYGIIDDILVRHKK

2 2 2 2 2 2 2 3

579.8287 438.2394 566.2656 587.3333 667.366 486.2699 467.2607 1032.873

582.8318 441.2499 569.2738 590.3317 670.3751 489.2698 470.261 1034.877

0.178 0.239 0.338 0.381 0.430 0.486 0.495 2.747

gi|20807075 gi|20808660 gi|20808302 gi|20808625 gi|20808174 gi|20807908 gi|20808162 gi|20807119

DFKQALDKV

2

638.3411

641.3619

2.795

gi|20808003

KKVLMELQNLLQ KSCVSR KSSLPSD QIALLPYTVE VKTTLPID KTVAKKKK NLAKSVTVE KLRD

2 2 2 2 2 2 2 2

834.9722 460.7548 459.2387 651.875 535.8134 613.8644 572.8233 358.2144

837.9786 463.7562 462.2474 654.885 538.8074 616.8747 575.8434 361.2234

2.889 2.996 3.649 4.177 4.814 4.877 4.928 6.355

gi|20808474 gi|20808925 gi|20807409 gi|20809108 gi|20808328 gi|20808635 gi|20808572 gi|20808885

KLGYAIDK CENIDLKSFDEVVDVGE EKVYATKG

2 2 3

560.3041 1076.513 369.8666

563.3214 1079.5386 371.8837

8.960 9.661 9.984

gi|20807454 gi|20808530 gi|20807010

protein description co-chaperonin GroES (HSP10) 50S ribosomal protein L3 hypothetical protein TTE1894 ribosomal protein S9 RNA polymerase sigma factor RpoD 30S ribosomal protein S16 GAF domain-containing protein ATP-dependent Clp protease proteolytic subunit cystathionine beta-lyase/ystathionine gammasynthase septum formation inhibitor-activating ATPase transcriptional regulator phosphatidylglycerophosphatase A 30S ribosomal protein S18 amino acid transporters 30S ribosomal protein S13 D-fructose-6-phosphate amidotransferase ABC-type multidrug transport system, ATPase component GTPase histidyl-tRNA synthetase flagellar protein

termini of protein were highly active and might play a crucial role in biological processes.17,32 Figure 4 shows a typical mass spectrum of the neo-C-terminal peptide. LILKKI was identified as a neo-C-terminal peptide of acetyltransferase (gi|20807976) from TTE. The peptide containing 2+ charges presented a pair of peaks with mass difference of 3 Da in MS spectrum (Shown in Figure 4a). Lysines were formylated, and the C-terminal amino acid isoleucine was labeled with light and heavy arginine at an equal ratio. The peptide sequence and derivatization was further determined by MS/MS analysis. Of the light- (Figure 4b) and heavy-labeled peptide (Figure 4c), both peptides had the same b type ions, and all the sequential y type ions presented a mass difference of 6 Da, illustrating that the arginine derivatization occurred at the C-terminus of peptide.

terminal amino acids. The absence of the three remaining amino acids, histidine, threonine, and tryptophan, was most likely due to their extremely low frequency of occurrence at the C-terminal position in the TTE proteome (1.04% for histidine, 1.55% for threonine, and 0.77% for tryptophan).

Differentially regulated C-termini between TTE55 and TTE75 were distinguished using a stringent filter cutoff. According to the previous literature,31 C-terminal peptides with a ratio change of at least 2-fold were considered significantly changed. Using this criterion, of the 53 quantified C-terminal peptides, 16 C-termini were detected as up-regulated in TTE55, whereas 7 C-termini were determined as up-regulated in TTE75 (shown in Table 1). Among these differentially expressed C-termini, the majority of the proteins were related to proteases or ribosomal proteins. This suggests that temperature is one of the factors that mediate the biological activity of TTE.

Neo-C-termini, which are generated from mature proteins by protease truncation, can be distinguished and validated using our approach. Once a protein is degraded endogenously, a neo-C-terminus is generated. The neo-C-terminus has a free α-carboxyl group, so it can react with the isotope labeling reagent. Because only α-carboxyl groups that are free at the labeling step carry the paired label, the dual-isotopic labeling indicates that a neo-C-terminal peptide is an endogenous proteolytic product of the sample rather than the result of cleavage during sample pretreatment. The presence of the paired peaks reduces the risk of false positive identification of neo-C-terminal peptides. To find the neo-C-termini in TTE, semi-Arg-C cleavage was specified for database searching, while both light-labeled and heavy-labeled arginine on peptide C-termini were set as variable modifications. As a result, 24 neo-C-terminal peptides were identified confidently (Table S3 in Supporting Information).
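The 2-fold cutoff used for the TTE55/TTE75 comparison can be sketched as a simple classifier. This is an illustrative sketch only: the text does not state which sample is the numerator of the reported ratio, so the labels below are deliberately generic, and the function name `classify_ratio` and the example values 1.3 and 0.4 are hypothetical (2.889 and 9.984 are ratios listed in Table 1).

```python
def classify_ratio(ratio, fold_cutoff=2.0):
    """Classify a light/heavy ratio against a symmetric fold-change cutoff."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    if ratio >= fold_cutoff:
        return "up in numerator sample"
    if ratio <= 1.0 / fold_cutoff:
        return "up in denominator sample"
    return "unchanged"


if __name__ == "__main__":
    # 2.889 and 9.984 come from Table 1; 1.3 and 0.4 are hypothetical values.
    for r in (2.889, 9.984, 1.3, 0.4):
        print(f"ratio {r}: {classify_ratio(r)}")
```

A symmetric cutoff (2 and 1/2) treats up-regulation in either sample equally, which is one common way to apply a fold-change filter; the exact convention used by the authors is not given in this section.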
No neo-C-terminal peptides from the three standard proteins were identified, indicating that no degradation occurred during protein extraction. Of these 24 neo-C-terminal peptides, most were localized at the N-terminal or C-terminal region of the full-length proteins, indicating that both



CONCLUSIONS

A novel isotope labeling strategy has been presented for detecting and quantifying the C-termini of proteins. Using oxazolone-based chemistry, a dual-isotopic arginine label was incorporated at the C-terminus. An isotopic pair of peaks with a mass difference of 6 Da allowed us to recognize the C-terminal peptide, while the ratio between light and heavy labeling could be used to quantify the C-termini from different samples. The main advantage of our method is that arginine is specifically derivatized on the α-carboxyl group of the C-terminus, which gives a simple and straightforward pattern in the MS spectrum for interpreting the C-terminal peptide. In addition, the reaction specificity was guaranteed. As a result, our approach has the capability to detect potential neo-C-terminal peptides, which may be generated by unknown proteolytic truncations. The paired peaks would guide us to determine neo-C-termini that do not exist in the theoretical search database. An appropriate amino reagent would also enhance the ionization efficiency and improve the de novo sequencing of the C-terminus. Moreover, for short C-terminal peptides whose lengths are not appropriate for MS detection, the arginine labeling could increase the molecular weight of the parent ions to some degree. In general, our method showed its potential in large-scale proteomic research.

dx.doi.org/10.1021/ac401647m | Anal. Chem. 2013, 85, 10745−10753

Figure 4. Mass spectra of a neo-C-terminus identified from Thermoanaerobacter tengcongensis (TTE). The neo-C-terminal peptide was labeled with the light and heavy isotopes of arginine and presented a pair of peaks separated by 3 m/z units (a 6 Da mass difference at charge 2+) in the MS spectrum (a). MS/MS spectra of the light- (b) and heavy-labeled peptides (c) indicated that the derivatization occurred at the C-terminus of the peptide. Asterisks represent the formylated amino groups. The pound sign represents the fragment ion of [Arg + H]+.

Notes



The authors declare no competing financial interest.

ASSOCIATED CONTENT

Supporting Information

Additional information as noted in the text. This material is available free of charge via the Internet at http://pubs.acs.org.

AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected]. Fax: (086) 021-54237961. Tel.: (086) 021-54237618.

ACKNOWLEDGMENTS

The work was supported by the National Science and Technology Key Project of China (2012CB910602, 2012AA020203, and 2010CB912700), the National Science Foundation of China (21025519, 21335002, and 31070732), and Shanghai Projects (Eastern Scholar and B109).

REFERENCES

(1) Zhang, C. X.; Weber, B. V.; Thammavong, J.; Grover, T. A.; Wells, D. S. Anal. Chem. 2006, 78, 1636−1643.


(2) Johnson, K. A.; Kari, P. F.; Tangarone, B. S.; Porter, T. J.; Rouse, J. C. Anal. Biochem. 2007, 360, 75−83.
(3) Selkoe, D. J. Trends Cell Biol. 1998, 8, 447−453.
(4) Gould, S. J.; Keller, G. A.; Hosken, N.; Wilkinson, J.; Subramani, S. J. Cell Biol. 1989, 108, 1657−1664.
(5) Schilling, O.; Barre, O.; Huesgen, P. F.; Overall, C. M. Nat. Methods 2010, 7, 508−511.
(6) Murphy, C. M.; Fenselau, C. Anal. Chem. 1995, 67, 1644−1645.
(7) Edman, P.; Begg, G. Eur. J. Biochem. 1967, 1, 80−91.
(8) Boyd, V. L.; Bozzini, M.; Zon, G.; Noble, R. L.; Mattaliano, R. J. Anal. Biochem. 1992, 206, 344−352.
(9) Samyn, B.; Hardemen, K.; Van der Eychen, J.; Van Beeumen, J. Anal. Chem. 2000, 72, 1389−1399.
(10) Aebersold, R.; Mann, M. Nature 2003, 422, 198−207.
(11) Reid, G. E.; McLuckey, S. A. J. Mass Spectrom. 2002, 37, 663−675.
(12) Samyn, B.; Sergeant, K.; Chtanheira, P.; Faro, C.; Van Beeumen, J. Nat. Methods 2005, 2, 193−200.
(13) Hamberg, A.; Kempka, M.; Sjodahl, J.; Roeraade, J.; Hult, K. Anal. Biochem. 2006, 357, 167−172.
(14) Sechi, S.; Chait, B. T. Anal. Chem. 2000, 72, 3374−3378.
(15) Kuyama, H.; Shima, K.; Sonomura, K.; Yamaguchi, M.; Ando, E.; Nishimura, O.; Tsunasawa, S. Proteomics 2008, 8, 1539−1550.
(16) Xu, G.; Shin, S. B. Y.; Jaffrey, S. R. ACS Chem. Biol. 2011, 6, 1015−1020.
(17) Kishimoto, T.; Kondo, J.; Igarashi, T. T.; Tanaka, H. Proteomics 2011, 11, 485−489.
(18) Yamaguchi, M.; Oka, M.; Nishida, K.; Ishida, M.; Hamazaki, A.; Kuyama, H.; Ando, E.; Okamura, T.; Ueyama, N.; Norioka, S.; Nishimura, O.; Tsunasawa, S.; Nakazawa, T. Anal. Chem. 2006, 78, 7861−7869.
(19) Nakazawa, T.; Yamaguchi, M.; Nishida, K.; Kuyama, H.; Obama, T.; Ando, E.; Okamura, T.; Ueyama, N.; Tanaka, K.; Norioka, S. Rapid Commun. Mass Spectrom. 2004, 18, 799−807.
(20) Nakajima, C.; Kuyama, H.; Nakazawa, T.; Nishimura, O. Anal. Bioanal. Chem. 2012, 404, 125−132.
(21) Kim, J. S.; Shin, M.; Song, J. S.; An, S.; Kim, H. J. Anal. Biochem. 2011, 419, 211−216.
(22) Gygi, S. P.; Rist, B.; Gerber, S. A.; Turecek, F.; Gelb, M. H.; Aebersold, R. Nat. Biotechnol. 1999, 17, 994−999.
(23) Yao, X.; Freas, A.; Ramirez, J.; Demirev, P. A.; Fenselau, C. Anal. Chem. 2001, 73, 2836−2842.
(24) Kleifeld, O.; Doucet, A.; Keller, U.; Prudova, A.; Schilling, O.; Kainthan, R. K.; Starr, A. E.; Foster, L. J.; Kizhakkedathu, J. N.; Overall, C. M. Nat. Biotechnol. 2010, 28, 281−288.
(25) Li, X.; Foley, E. A.; Molloy, K. R.; Li, Y.; Chait, B. T.; Kapoor, T. M. J. Am. Chem. Soc. 2012, 134, 1982−1985.
(26) Kosaka, T.; Takazawa, T.; Nakamura, T. Anal. Chem. 2000, 72, 1179−1185.
(27) Julka, S.; Dielman, D.; Young, S. A. J. Chromatogr., B 2008, 874, 101−110.
(28) Wang, J.; Xue, Y.; Feng, X.; Li, X.; Wang, H.; Li, W.; Zhao, C.; Cheng, X.; Ma, Y.; Zhou, P.; Yin, J.; Bhatnagar, A.; Wang, R.; Liu, S. Proteomics 2004, 4, 136−150.
(29) Sheehan, J. C.; Yang, D.-D. H. J. Am. Chem. Soc. 1958, 80, 1154−1158.
(30) Chen, Z.; Wang, Q.; Lin, L.; Tang, Q.; Edwards, J. L.; Li, S.; Liu, S. Anal. Chem. 2012, 84, 2908−2915.
(31) Song, C.; Wang, F.; Ye, M.; Cheng, K.; Chen, R.; Zhu, J.; Tan, Y.; Wang, H.; Figeys, D.; Zou, H. Anal. Chem. 2011, 83, 7755−7762.
(32) Kleifeld, O.; Doucet, A.; Keller, U.; Prudova, A.; Schilling, O.; Kainthan, R. K.; Starr, A. E.; Foster, L. J.; Kizhakkedathu, J. N.; Overall, C. M. Nat. Biotechnol. 2010, 28, 281−288.
