
Anal. Chem. 2008, 80, 9135–9140

Assignment of Disulfide-Linked Peptides Using Automatic a1 Ion Recognition

Sheng Yu Huang,*,† Chien Hsien Wen,‡ Ding Tzai Li,† Jue Liang Hsu,† Chinpan Chen,§ Fong Ku Shi,† and Yueh Yi Lin‡

Life Science Business Unit and Computer Integrated Manufacturing Business Unit, C Sun MFG. LTD., 7F.-9, No.79, Sec. 1, Sintai Fifth Road, Sijhih City, Taipei County 221, Taiwan, and Institute of Biomedical Science, Academia Sinica, No.128, Sec. 2, Academia Road, Nangang District, Taipei City 115, Taiwan

We present a novel approach for the assignment of peptides containing disulfide linkages. Dimethyl labeling is introduced to generate labeled peptides that exhibit enhanced a1 ion signals during MS/MS fragmentation. For disulfide-linked peptides, multiple a1 ions can be observed because of the multiple N-termini. This distinct feature allows the disulfide-linked peptides to be sieved out and, at the same time, the N-terminal amino acids to be identified. With such information, the number of possible peptide combinations involved in a disulfide bond narrows down dramatically. Furthermore, we developed a computational algorithm that performs target a1 ion screening followed by molecular weight matching of cysteine-containing peptides with specific amino acids at the N-termini. Once the protein sequence and the peak list from an LC-MS/MS survey scan of labeled peptides are imported, the identities of disulfide-linked peptides can be readily obtained. The presented approach is simple and straightforward, offering a valuable tool for protein structural characterization.

Disulfide bond formation plays a critical role in stabilizing protein tertiary structure, thus maintaining biological activity.1 It influences protein function and reflects the quality of recombinant protein production.2,3 Conventional methods such as NMR4,5 or Edman degradation6,7 are used to analyze disulfide bridges.
However, both methods require relatively large samples, and Edman degradation additionally requires well-purified peptides after proteolytic digestion. Modern mass spectrometry coupled with separation techniques has emerged as an informative tool for disulfide bond analysis.8 Several approaches based on comparing the MS spectra of the protein digest before and after reduction have been reported.9-11 Possible linkages are proposed from the calculated molecular weights of cysteinyl peptides, allowing for a reduction of 2 Da per disulfide bond, followed by Edman sequencing or MS/MS interpretation to confirm the peptide sequences. Characterization of disulfide linkages is especially difficult for proteins containing multiple cysteines, which give rise to numerous possible disulfide bond arrangements. In addition to the accurate molecular weight, manual inspection of the MS/MS fragmentation patterns of disulfide-linked peptides is still required for most approaches.

Several strategies have been presented for disulfide bond analysis through manipulating reduction/alkylation reactions.12-15 For instance, Qi et al. introduced the concept of a “negative signature mass algorithm” using partial reduction and cyanylation to construct one perfect disulfide structure by ruling out enough linkages.13,16 Yen et al. proposed partial reduction followed by NEM (N-ethylmaleimide) alkylation, full reduction, and then IAM (iodoacetamide) alkylation to discriminate closely spaced cysteines involved in disulfide linkages.14 However, ambiguous results may be obtained when disulfide bonds have similar reduction rates.12 The additional species produced in the analyte by two-step alkylation may also complicate the mass spectra. The incorporation of the stable isotope ¹⁸O into the C-termini of peptides was demonstrated for the analysis of peptides linked by interchain disulfide bonds.8,17,18 A distinguishable isotopic profile is generated for such peptides by using 50% H₂¹⁸O in H₂¹⁶O during enzymatic digestion. The method utilizes the feature that multiple chains connected by disulfide bonds contain more than one C-terminus. One drawback is that the isotope distribution may complicate the interpretation of the mass spectra.

Other methods rely on different mass spectrometry detection strategies, such as the use of a reductive matrix for MALDI detection,11 the use of negative mode detection in which the disulfide bond can be fragmented in a specific manner,19,20 and the use of ESI/FTMS for intact protein detection.21 On the software side, a program named “SearchXLinks” was developed to recognize peptides linked by interchain disulfide bonds by analyzing their MALDI in-source decay spectra, followed by interpretation of their TOF-TOF spectra to confirm the peptide sequences.22,23 More recently, the algorithm “MassMatrix”, based on tandem MS data interpretation for the assignment of disulfide bonds, was reported.24 The algorithm is reinforced by a probability scoring model. However, limited information may be obtained for certain types of disulfide bonds because only the product ions from one-bond cleavage are taken into account.

* Corresponding author. E-mail: [email protected]. Phone: +886-286981117. Fax: +886-2-77089110.
† Life Science Business Unit, C Sun MFG. LTD.
‡ Computer Integrated Manufacturing Business Unit, C Sun MFG. LTD.
§ Academia Sinica.
(1) Yano, H.; Kuroda, S.; Buchanan, B. B. Proteomics 2002, 2, 1090–1096.
(2) Srebalus Barnes, C. A.; Lim, A. Mass Spectrom. Rev. 2007, 26, 370–388.
(3) Baneyx, F.; Mujacic, M. Nat. Biotechnol. 2004, 22, 1399–1408.
(4) von Ossowski, L.; Tossavainen, H.; von Ossowski, I.; Cai, C.; Aitio, O.; Fredriksson, K.; Permi, P.; Annila, A.; Keinanen, K. Biochemistry 2006, 45, 5567–5575.
(5) Sharma, D.; Rajarathnam, K. J. Biomol. NMR 2000, 18, 165–171.
(6) John, H.; Forssmann, W. G. Rapid Commun. Mass Spectrom. 2001, 15, 1222–1228.
(7) Haniu, M.; Acklin, C.; Kenney, W. C.; Rohde, M. F. Int. J. Pept. Protein Res. 1994, 43, 81–86.
(8) Gorman, J. J.; Wallis, T. P.; Pitt, J. J. Mass Spectrom. Rev. 2002, 21, 183–216.
(9) Zhang, W.; Marzilli, L. A.; Rouse, J. C.; Czupryn, M. J. Anal. Biochem. 2002, 311, 1–9.
(10) Mhatre, R.; Woodard, J.; Zeng, C. Rapid Commun. Mass Spectrom. 1999, 13, 2503–2510.
(11) Fukuyama, Y.; Iwamoto, S.; Tanaka, K. J. Mass Spectrom. 2006, 41, 191–201.
(12) Jones, M. D.; Hunt, J.; Liu, J. L.; Patterson, S. D.; Kohno, T.; Lu, H. S. Biochemistry 1997, 36, 14914–14923.
(13) Qi, J.; Wu, W.; Borges, C. R.; Hang, D.; Rupp, M.; Torng, E.; Watson, J. T. J. Am. Soc. Mass Spectrom. 2003, 14, 1032–1038.
(14) Yen, T. Y.; Yan, H.; Macher, B. A. J. Mass Spectrom. 2002, 37, 15–30.
(15) Schnaible, V.; Wefing, S.; Bucker, A.; Wolf-Kummeth, S.; Hoffmann, D. Anal. Chem. 2002, 74, 2386–2393.
(16) Wu, W.; Huang, W.; Qi, J.; Chou, Y. T.; Torng, E.; Watson, J. T. J. Proteome Res. 2004, 3, 770–777.
(17) Wallis, T. P.; Pitt, J. J.; Gorman, J. J. Protein Sci. 2001, 10, 2251–2271.
(18) Rose, K.; Savoy, L. A.; Simona, M. G.; Offord, R. E.; Wingfield, P. Biochem. J. 1988, 250, 253–259.
10.1021/ac8013725 CCC: $40.75 © 2008 American Chemical Society. Published on Web 11/07/2008.
Analytical Chemistry, Vol. 80, No. 23, December 1, 2008

In this study, we demonstrate a mass spectrometry based strategy coupled with dimethyl labeling25 for the analysis of disulfide bridges. The peptide N-terminus and Lys residues are globally labeled with formaldehyde via reductive amination, which adds 28 Da (32 Da with formaldehyde-D2) at each derivatized site, except for proline, which gains 14 Da (16 Da with formaldehyde-D2). The chemical reaction, originally aimed at protein quantitation, was reported to be fast, complete, and cheap.25 Moreover, dimethyl-labeled peptides exhibit abundant a1 ions during MS/MS fragmentation, which allow clear recognition of the amino acids at the peptide N-termini.26 Therefore, we applied dimethyl labeling to the analysis of disulfide-linked peptides produced by enzymatic digestion under nonreduced conditions.
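
The labeling arithmetic described above can be illustrated with a short sketch. It is only an illustration: the function name `dimethyl_shift` and the table names are hypothetical, the decimal values are standard monoisotopic masses corresponding to the nominal 28/32 and 14/16 Da quoted in the text, and the example sequences are the hPAP peptides discussed in the Results.

```python
# Mass added per labeled amine (Da). The text quotes the nominal values
# 28/32 (dimethyl) and 14/16 (proline); the decimals below are standard
# monoisotopic masses and are an assumption of this sketch.
DIMETHYL = {"H2": 28.0313, "D2": 32.0564}
MONOMETHYL = {"H2": 14.0157, "D2": 16.0282}


def dimethyl_shift(peptide, reagent="H2"):
    """Total mass shift for labeling the N-terminus and every Lys side chain."""
    # An N-terminal proline takes up a single methyl group.
    nterm = MONOMETHYL[reagent] if peptide[0] == "P" else DIMETHYL[reagent]
    return nterm + DIMETHYL[reagent] * peptide.count("K")


for pep in ("LPYVCK", "SWTDADLACQK", "CPK"):
    print(pep, round(dimethyl_shift(pep), 4), round(dimethyl_shift(pep, "D2"), 4))
```

Each of these tryptic peptides carries two labeling sites (the N-terminus and the C-terminal Lys), so the nominal shift is 56 Da with formaldehyde-H2 and 64 Da with formaldehyde-D2.
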
For disulfide-linked peptides, which usually contain multiple N-termini, MS/MS spectra with multiple a1 ions were observed. We further developed a computational program named RADAR (rapid assignment of disulfide linkage via a1 ion recognition) to sieve out the spectra that contain multiple/specific a1 ions from the peak lists of an LC-MS/MS survey scan. According to the a1 identities, the software searches for a molecular weight combination match against the cysteine-containing peptide list generated from the protein sequence. This approach dramatically reduces the number of MS/MS spectra that need to be examined, facilitating rapid and automatic disulfide bond assignment.

Figure 1. The use of the a1 tags for the assignment of disulfide-linked peptides. The enhanced a1 ions indicate the N-terminal amino acid of each chain.

(19) Zhang, M.; Kaltashov, I. A. Anal. Chem. 2006, 78, 4820–4829.
(20) Chelius, D.; Huff Wimer, M. E.; Bondarenko, P. V. J. Am. Soc. Mass Spectrom. 2006, 17, 1590–1598.
(21) Narayan, M.; Welker, E.; Zhai, H.; Han, X.; Xu, G.; McLafferty, F. W.; Scheraga, H. A. Nat. Biotechnol. 2008, 26, 427–429.
(22) Schnaible, V.; Wefing, S.; Resemann, A.; Suckau, D.; Bucker, A.; Wolf-Kummeth, S.; Hoffmann, D. Anal. Chem. 2002, 74, 4980–4988.
(23) Wefing, S.; Schnaible, V.; Hoffmann, D. Anal. Chem. 2006, 78, 1235–1241.
(24) Xu, H.; Zhang, L.; Freitas, M. A. J. Proteome Res. 2008, 7, 138–144.
(25) Hsu, J. L.; Huang, S. Y.; Chow, N. H.; Chen, S. H. Anal. Chem. 2003, 75, 6843–6852.
(26) Hsu, J. L.; Huang, S. Y.; Shiea, J. T.; Huang, W. Y.; Chen, S. H. J. Proteome Res. 2005, 4, 101–108.

EXPERIMENTAL SECTION

Enzymatic Digestion and Dimethyl Labeling. Recombinant human pancreatitis-associated protein (hPAP) was prepared as previously described.27 Other standard proteins mentioned in the text were purchased from Sigma (St. Louis, MO). NEM at 5 mM (Sigma) in 100 mM sodium acetate (J. T. Baker, Phillipsburg, NJ), pH 6, was used to block free cysteines at room temperature for 30 min. Enzymatic digestion was performed directly in sodium acetate at 37 °C overnight with 1:50 trypsin (Promega, Madison, WI). The protein digest was diluted three times with 100 mM sodium acetate (pH 5) before dimethyl labeling. To perform dimethyl labeling, 2.5 µL of 4% (w/v) formaldehyde-H2 (J. T. Baker) or 2.5 µL of 4% (w/v) formaldehyde-D2 (Aldrich) was added to 50 µL of protein digest, followed by 2.5 µL of 600 mM sodium cyanoborohydride (Sigma), and the reaction was carried out at pH 5-6 for 30 min. Special caution, including the use of surgical gloves and a fume hood, was taken when formaldehyde was handled.

Mass Spectrometry. An ESI Q-TOF instrument equipped with a CapLC system (Waters, Milford, MA) and a capillary column (75 µm i.d., 10 cm in length, Csun, Taiwan) was used to perform the survey scan (MS, m/z 400-1600; MS/MS, m/z 50-2000). The alkylated and dimethyl-labeled protein digest was subjected to LC-MS/MS analysis with a linear gradient from 5% to 50% acetonitrile containing 0.1% formic acid over 45 min.

Data Analysis. MassLynx 4.0 was used to produce peak lists from raw data (subtract 30%, smooth 3/2 Savitzky Golay and center three channels 80% centroid). A relatively high subtraction can be applied here to eliminate background noise. Because true a1 ions usually appear as major peaks, they are retained in the peak list.

RESULTS AND DISCUSSION

MS/MS Spectra of Dimethyl-Labeled, Disulfide-Linked Peptides.
Dimethyl-labeled peptides were found to have enhanced a1 ion signals during MS/MS fragmentation, so the N-terminal amino acids can be explicitly identified.26 We take advantage of this for the analysis of peptides linked by interchain disulfide bonds, which contain more than one N-terminus. As indicated in Figure 1, two peptide chains linked by a disulfide bond exhibit enhanced a1 ion signals corresponding to the two N-terminal amino acids during MS/MS fragmentation. Three distinct a1 ions are generated if three peptide chains are linked by disulfide bonds, and so forth. Peptides containing intrachain disulfide bonds, or peptide chains with the same N-terminal amino acid, for which only one specific a1 ion can be observed, are discussed later in the text. Human pancreatitis-associated protein (hPAP) has been identified in pathognomonic lesions of Alzheimer's disease, and its structure, with three disulfide bonds formed by six cysteines, was solved by NMR analysis.27 Recombinant hPAP was used as a simple model to demonstrate the concept of our strategy.

(27) Ho, M. R.; Lou, Y. C.; Lin, W. C.; Lyu, P. C.; Huang, W. N.; Chen, C. J. Biol. Chem. 2006, 281, 33566–33576.

In our experiment, hPAP was first treated with NEM to block free cysteines. To prevent disulfide bond scrambling, trypsin digestion was performed directly in sodium acetate buffer (pH 6), which is compatible with the subsequent dimethyl labeling. After overnight digestion, sodium acetate (pH 5) was used to dilute the sample, followed by the addition of the labeling reagents. The labeled peptide mixture was analyzed by LC-MS/MS. Figure 2 shows the CID spectrum of m/z 517.9 (4+), the peptide pair “LPYVCK, SWTDADLACQK” linked by an interchain disulfide bond. In addition to the molecular weight match, the two distinct a1 ions, 88.07 and 114.13, indicate Ser and Leu at the two N-termini, respectively. The remaining b and y ions further confirmed the assignment of the peptide sequences.

Figure 2. The MS/MS spectrum of formaldehyde-H2 labeled, disulfide-linked peptide m/z 517.9 (4+) from hPAP (*LPYVC*K, *SWTDADLACQ*K). The asterisks denote labeling sites. Capital letters A/B/Y denote the fragments of the upper peptide. Ynyn indicates that the two fragments are linked by a disulfide bond. Signals in the mass range above 700 are enhanced by 6 times to make a clear annotation. Two distinct a1 ions can be found.

The exact masses of the a1 ions for the 20 amino acids labeled with formaldehyde-H2 or formaldehyde-D2 are listed in Table S-1 in the Supporting Information. It is noted that cysteines with a free thiol group generate the a1 ion 104.05 in the CID spectrum, while cysteines involved in disulfide linkages generate a significantly different a1 ion of 102.04 (distinguishable from the a1 ion of threonine, 102.09) because the free thiol groups transform to a thiolaldehyde moiety. As shown in Figure 3, the CID spectrum of m/z 671.6 (3+) exhibits two distinct a1 ions, 72.08 and 102.04, which suggest that Ala and oxidized cysteine are at the two N-termini. Similarly, three peptide chains linked by two disulfide bonds exhibit three enhanced a1 ions, as illustrated with a BSA tryptic peptide labeled with formaldehyde-D2 and shown in Figure S-1 in the Supporting Information. The three a1 ions 92.10, 134.12, and 168.13 indicate that the three N-terminal amino acids are Ser, Glu, and Tyr, respectively. However, further analysis is required to distinguish between the two adjacent cysteines.
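
The a1 values quoted above can be approximated from residue masses. The sketch below is not the authors' Table S-1 or code; it assumes the usual definition of a singly charged a1 ion (residue mass minus CO plus a proton) with the dimethyl label added, and the function name `a1_mz` and the `disulfide_cys` flag are hypothetical. The 2 Da adjustment for disulfide-linked cysteine simply mirrors the 104.05 versus 102.04 values reported in the text.

```python
# Standard monoisotopic residue masses (Da).
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
CO = 27.99491        # carbon monoxide lost from b1 to give a1
PROTON = 1.00728
TWO_H = 2.01565      # difference between free and disulfide-linked Cys a1
SHIFT = {"H2": 28.03130, "D2": 32.05641}  # dimethyl label per amine


def a1_mz(residue, reagent="H2", disulfide_cys=False):
    """Approximate m/z of the a1 ion of a dimethyl-labeled N-terminal residue."""
    label = SHIFT[reagent] / 2 if residue == "P" else SHIFT[reagent]
    mz = RESIDUE[residue] - CO + PROTON + label
    if residue == "K":
        mz += SHIFT[reagent]      # the Lys side-chain amine is labeled as well
    if residue == "C" and disulfide_cys:
        mz -= TWO_H               # matches the 102.04 value reported in the text
    return mz


for aa in "SLAT":
    print(aa, round(a1_mz(aa), 4))
print("C free", round(a1_mz("C"), 4), "C linked", round(a1_mz("C", disulfide_cys=True), 4))
```

Under these assumptions the computed values agree with the quoted ones to within about 0.01 Da, and Leu and Ile necessarily give the same a1 ion.
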
The reducing agent used in the reductive amination for dimethyl labeling was sodium cyanoborohydride, which provides milder reducing power; therefore, no detectable disulfide bond disruption was observed under this mild condition (data not shown).

Figure 3. The MS/MS spectrum of formaldehyde-H2 labeled, disulfide-linked peptide m/z 671.6 (3+) from hPAP (*AYGSHCYALFLSP*K, *CP*K). The asterisks denote labeling sites. Capital letters A/B/Y denote the fragments of the upper peptide. Ynyn indicates that the two fragments are linked by a disulfide bond. Two distinct a1 ions can be found.

Automatic Assignment of Disulfide-Linked Peptides. On the basis of the proposed concept, we developed the program named RADAR to automatically perform a1 ion screening as well as molecular weight (MW) matching in order to efficiently map disulfide-linked peptides. Figure 4 depicts the working principle of RADAR. Protein sequences are imported and digested according to the selected amino acids to generate the peptide list. Peptides without cysteines are discarded, whereas 28 or 32 Da is added to the remaining peptides at each derivatized site (Lys and N-terminus) to account for dimethyl labeling. Their corresponding MWs are listed as the database for the subsequent search. The N-terminal amino acids of these peptides provide the “target a1 ions”. Target a1 ion screening is performed on the peak list generated from the LC-MS/MS survey scan of labeled peptides. A precursor ion is retained if any of the target a1 ions is observed in its CID spectrum; otherwise it is discarded. The m/z values of the remaining precursor ions are searched against the peptide list for an MW combination match. A reduction of 2 Da for two cysteines, 4 Da for four cysteines, and so on is taken into account in the calculation.

Figure 4. The work principle of RADAR.

Figure 5 shows the layout of RADAR as well as the analysis result for hPAP. Trypsin digestion of hPAP generates six cysteine-containing peptides with six different N-terminal amino acids, as indicated in the right table. In target a1 ion screening, the precursor ions in the peak list were filtered with the six a1 ions, followed by an MW combination match. For example, two a1 ions for Ser and Leu were observed in the MS/MS spectrum of m/z 690.3 (3+). RADAR automatically calculates the MW combinations of all peptides with Ser or Leu at the N-terminus. Once a match is obtained, the result is indicated in the table underneath the column “peptide pair”. The disulfide-linked peptide pair m/z 690.3 (3+) was designated as “SWTDADLACQK, LPYVCK” on the basis of the a1 match plus the MW match. Without prior knowledge of the connections between these cysteines, three pairs of peptides for hPAP were assigned by RADAR (five spectra due to different charge states), implying one perfect disulfide structure of hPAP. The result was consistent with the previous NMR study of the hPAP structure.27

Figure 5. The layout of RADAR to demonstrate the disulfide bond analysis of hPAP.

Enzymatic digestion of nonreduced proteins with disulfide linkages is occasionally accompanied by missed cleavages. The number of possible disulfide bond arrangements increases exponentially as the number of cysteinyl peptides increases. Automation makes the task easier: if one missed cleavage is taken into account, one more peptide pair with the N-terminal amino acids “Asn (N), Trp (W)” can readily be found, as shown in Figure S-2 in the Supporting Information, suggesting that the disulfide bond is between Cys108 and Cys125, which agrees with the above result obtained with no missed cleavages. Moreover, dimethyl labeling coupled with RADAR analysis was also applied to several other proteins such as lysozyme. All three peptide pairs, including four disulfide bonds, can be assigned by the software with an a1 match as well as an MW match. The result is shown in Figure S-3 in the Supporting Information.

The RADAR algorithm was not designed to consider only precursors with “multiple a1 ions”. RADAR also evaluates spectra with a single a1 ion that matches one of the N-terminal amino acids of the listed cysteinyl peptides, in order to analyze the following three kinds of disulfide-linked peptides: (1) Peptides containing intrachain disulfide bonds. For such peptides, the algorithm can search the peptide list for those containing the specific N-terminal amino acid and two cysteines. With the explicit a1 information provided by our strategy, the peptide “TCVADESHAGCEK” from BSA can be easily detected by its MW, which is reduced by 2 Da owing to the intrachain disulfide bond. In contrast, this may be difficult using ¹⁸O labeling or ISD detection22 because no distinct isotopic profile or ISD fragments can be observed. (2) Two peptide chains with the same N-terminal amino acid linked by disulfide bonds. This case can be solved by the algorithm because all possible arrangements are considered as long as the two peptides are on the cysteinyl peptide list. (3) Two identical peptides linked by a disulfide bond. This occurs when a protein contains intermolecular disulfide bonds, such as in the hinge region of immunoglobulin. RADAR was designed to take this case into consideration as well.

Optimization of a1 Recognition. During raw data processing, a relatively high subtraction rate (>30%) can theoretically eliminate noise signals in the low m/z range and yield efficient a1 recognition, since the a1 ions are remarkably enhanced. The mass tolerance of a1 recognition can also be adjusted in RADAR. With a tolerance of ±0.02 Da or lower, a fair result can be obtained after appropriate calibration of the Q-TOF instrument used here. Except for Ile and Leu, ±0.02 Da can distinguish the a1 ions of all 20 amino acids without overlap when formaldehyde-D2 is used. A small overlap occurs between Lys (157.1705) and Arg (157.1453) when formaldehyde-H2 is used; thus, a smaller window (e.g., ±0.01 Da) is suggested. This is also not a problem for trypsin digestion, which generates peptides with Lys/Arg at the C-termini. As for Ile and Leu, both possibilities are considered by RADAR if cysteinyl peptides with each of the two N-termini exist. Another parameter is the signal threshold. Precursor ion selection may sometimes include other ions because of limited resolution; therefore, a threshold has to be set to eliminate a1 ions from other peptides that were not removed by subtraction. We define the default setting as follows: if a specific a1 signal is 5 times lower than the strongest a1 signal, that a1 ion is removed. For complicated samples that contain multiple disulfide bonds within more than two peptides, a higher value (lower threshold) can be chosen to avoid false negatives. The factors mentioned above are designed to restrict the a1 ion screening and ensure that limited and accurate a1 ions are selected instead of redundant and false positive ones. They can be relaxed to extend the search if no peptide pairs are found. For the molecular weight match tolerance, we used 0.2 Da for the hPAP analysis. However, in some cases a higher tolerance can be used to include ¹³C peaks, which are sometimes chosen for MS/MS rather than the ¹²C peaks, especially for peptides with higher molecular weights. Furthermore, the upper limit on the number of peptide chains in the MW combination test can be adjusted to obtain an efficient search. It depends on the sample complexity and should be increased when multiple cysteines are found in a single enzymatic peptide.

CONCLUSION

The experimental design proposed here is simple and straightforward. Nonreduced proteins can be alkylated, digested, and labeled in one pot without any buffer exchange or desalting step. Sodium acetate buffer was used throughout the whole process to control the pH and prevent disulfide bond shuffling. The condition is also compatible with dimethyl labeling, which is known to be interfered with by amine-containing buffers. Compared to the use of stepwise alkylation or ¹⁸O isotope labeling, the products from dimethyl labeling are simpler because the reaction is nearly complete.25
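
As a rough illustration of the workflow described under Automatic Assignment of Disulfide-Linked Peptides and Optimization of a1 Recognition, the sketch below strings together in-silico tryptic digestion, target a1 screening (±0.02 Da, 5-fold intensity threshold), and MW matching for two-chain pairs (0.2 Da tolerance, 2 Da reduction per disulfide bond). It is not the RADAR source code. All function names are hypothetical, the toy sequence is assembled from peptides quoted in the text rather than taken from the real hPAP sequence, the peak intensities are invented, and only formaldehyde-H2 labeling with no missed cleavages is handled.

```python
from itertools import combinations_with_replacement

# Standard monoisotopic residue masses (Da).
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER, PROTON, CO, TWO_H = 18.01056, 1.00728, 27.99491, 2.01565
LABEL = 28.03130  # formaldehyde-H2 dimethyl label per amine


def tryptic_peptides(seq):
    """Cleave after K/R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, aa in enumerate(seq):
        if aa in "KR" and seq[i + 1:i + 2] != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])
    return peptides


def nterm_shift(pep):
    """An N-terminal proline takes one methyl group, other residues two."""
    return LABEL / 2 if pep[0] == "P" else LABEL


def labeled_mass(pep):
    """Neutral mass with labels on the N-terminus and on every Lys."""
    return sum(RESIDUE[a] for a in pep) + WATER + nterm_shift(pep) + LABEL * pep.count("K")


def target_a1(pep):
    """Expected a1 m/z for the labeled N-terminal residue."""
    mz = RESIDUE[pep[0]] - CO + PROTON + nterm_shift(pep)
    if pep[0] == "K":
        mz += LABEL   # Lys side chain is labeled too
    if pep[0] == "C":
        mz -= TWO_H   # disulfide-linked Cys is reported at 102.04, not 104.05
    return mz


def screen_a1(peaks, targets, tol=0.02, ratio=5.0):
    """Residues whose target a1 ion is present and not `ratio` times weaker than the strongest."""
    hits = {}
    for aa, mz in targets.items():
        intensities = [inten for m, inten in peaks if abs(m - mz) <= tol]
        if intensities:
            hits[aa] = max(intensities)
    if not hits:
        return set()
    strongest = max(hits.values())
    return {aa for aa, inten in hits.items() if inten * ratio >= strongest}


def match_pairs(mz, charge, peptides, found, tol=0.2):
    """Two-chain combinations whose summed mass minus 2 Da matches the precursor."""
    neutral = (mz - PROTON) * charge
    candidates = [p for p in peptides if p[0] in found]
    return [(p, q) for p, q in combinations_with_replacement(candidates, 2)
            if abs(labeled_mass(p) + labeled_mass(q) - TWO_H - neutral) <= tol]


toy = "LPYVCKGGAAKSWTDADLACQKAYGSHCYALFLSPKCPK"  # hypothetical toy sequence
cys_peptides = [p for p in tryptic_peptides(toy) if "C" in p]
targets = {p[0]: target_a1(p) for p in cys_peptides}
peaks = [(88.07, 900.0), (114.13, 1000.0), (72.08, 50.0), (300.20, 400.0)]
found = screen_a1(peaks, targets)
pairs = match_pairs(690.3, 3, cys_peptides, found)
print(sorted(found), pairs)
```

Using `combinations_with_replacement` lets the same peptide pair with itself, which corresponds to case (3) above; chains with identical N-terminal residues, case (2), are covered because every candidate combination is tested. Extending the search to three or more chains would mean raising the combination size and subtracting 2 Da per additional disulfide bond.
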


The unambiguous information on N-terminal amino acids provided by the a1 ions facilitates sieving out the peptides of interest, since the spectra with unwanted a1 ions are discarded. It also increases the confidence of MS/MS interpretation. Moreover, the detection of a1 ions is achieved in the low m/z region and is related only to the N-terminal residues. Therefore, it is independent of the length of the peptide chains, unlike several present approaches that require the positive detection of each peptide chain from disulfide bond breakage. As the algorithm is designed to try all possibilities based on the designated a1 ions (one or more) for each precursor m/z, it can cover a wide range of disulfide-linked peptide types. The proposed strategy aims at the automatic, rapid assignment of disulfide bonds and serves as a useful tool for routine protein structural analysis. The software RADAR for automatic disulfide bond assignment will be open at http://www.psti.com.tw for free trial in the near future.

SUPPORTING INFORMATION AVAILABLE

Additional information as noted in text. This material is available free of charge via the Internet at http://pubs.acs.org.

Received for review July 3, 2008. Accepted October 7, 2008.
AC8013725