Anal. Chem. 2006, 78, 2145-2149

Pro-CrossLink. Software Tool for Protein Cross-Linking and Mass Spectrometry

Qiuxia Gao,†,‡ Song Xue,‡,§ Catalin E. Doneanu,† Scott A. Shaffer,† David R. Goodlett,† and Sidney D. Nelson*,†

Department of Medicinal Chemistry, University of Washington, Box 357610, Seattle, Washington 98195, and Microsoft Corporation, One Microsoft Way, Redmond, Washington 98052

To facilitate structural analysis of proteins and protein-protein interactions, we developed Pro-CrossLink, a suite of software tools consisting of three programs (Figure 1): DetectShift, IdentifyXLink, and AssignXLink. DetectShift was developed to detect ions of cross-linked peptide pairs in a mixture of 18O-labeled peptides obtained from protein proteolytic digests. The selected candidate ions of cross-linked peptide pairs subsequently undergo tandem mass spectrometric (MS/MS) analysis for sequence determination. Based on the masses of candidate ions as well as y- and b-type ions in the tandem mass spectra, IdentifyXLink assigns the candidate ions to cross-linked peptide pairs. For an identified cross-linked peptide pair, AssignXLink generates an extensive fragment ion list, including a-, b-, c-type, x-, y-, z-type, internal, and immonium ions with associated common losses of H2O, NH3, CO, and CO2, and facilitates a precise location of the cross-linked residues. Pro-CrossLink is automated, highly configurable by the user, and applicable to many studies that map low-resolution protein structures and molecular interfaces in protein complexes.

* To whom correspondence should be addressed. Telephone: (206) 543-1419. Fax: (206) 685-3252. E-mail: [email protected].
† University of Washington.
‡ These authors contributed equally to this work.
§ Microsoft Corp.

10.1021/ac051339c CCC: $33.50 © 2006 American Chemical Society. Published on Web 03/08/2006

(1) Tang, X.; Munske, G. R.; Siems, W. F.; Bruce, J. E. Anal. Chem. 2005, 77, 311-8.
(2) Sinz, A. J. Mass Spectrom. 2003, 38, 1225-37.
(3) Young, M. M.; Tang, N.; Hempel, J. C.; Oshiro, C. M.; Taylor, E. W.; Kuntz, I. D.; Gibson, B. W.; Dollinger, G. Proc. Natl. Acad. Sci. U.S.A. 2000, 97, 5802-6.
(4) Back, J. W.; de Jong, L.; Muijsers, A. O.; de Koster, C. G. J. Mol. Biol. 2003, 331, 303-13.
(5) Schnolzer, M.; Jedrzejewski, P.; Lehmann, W. D. Electrophoresis 1996, 17, 945-53.

Chemical cross-linking in combination with mass spectrometry has developed into a powerful method for mapping low-resolution, three-dimensional protein structures and for investigating molecular interfaces in protein complexes. Despite the excellence of mass spectrometry as an analytical tool,1-4 the identification of protein cross-linked products is significantly hampered by the inherent complexity of cross-linking reaction mixtures. Among all the proposed strategies2 to identify cross-linked peptide pairs in a complex mixture of mainly non-cross-linked peptides, 18O-labeling5-10 is an attractive one because it is suitable for all cross-linking reactions and, once optimized for maximum incorporation, is easy to conduct. However, manual analysis of the large set of mass spectrometric data generated in 18O-labeling studies is not feasible, particularly because the number of 18O atoms incorporated into each peptide has to be calculated. If MS data are acquired from an electrospray ionization (ESI)-based mass spectrometer, candidate selection is more complex: not only do peptide ions exist in multiple charge states, but a given monoisotopic peak shift also represents a different peptide mass increase depending on the charge state of the ion. A further complication to data interpretation occurs when a flow splitter is used prior to a chromatographic column to achieve a nano-LC flow rate. A split-flow system does not always provide a constant flow rate to the nano-LC column because back-pressure differences between the nano-LC column and the splitter change as the organic gradient is formed. Under such experimental conditions, shifts in retention time may be observed between chromatographic runs. Therefore, a precursor ion in the 16O-digest may not appear in the same MS scan as the corresponding one in the 18O-digest even though the chromatographic properties of 16O- and 18O-labeled peptides are identical.7 As a result, it can be difficult to manually locate signals and calculate mass shifts for a given peptide between the 16O- and 18O-digests.

To address these issues and to speed up data analysis, we developed the program DetectShift to select candidate ions of cross-linked peptide pairs incorporating more than two 18O atoms. The candidate ions of cross-linked peptide pairs selected by DetectShift subsequently underwent tandem mass analysis for sequence determination. We developed the program IdentifyXLink to instantly assign the targeted tandem mass spectra to cross-linked peptide pairs based on y- and b-type ions in the spectra.
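The charge-state dependence of the 18O peak shift described above can be made concrete with a short calculation. The following Python sketch is illustrative only and is not taken from DetectShift; the function names are hypothetical, and the constants are standard monoisotopic masses rather than values stated in this paper.

```python
# Illustrative sketch only; function names are hypothetical, not from Pro-CrossLink.
# Standard monoisotopic masses (Da), not values quoted in the paper.
DELTA_18O = 17.9991610 - 15.9949146  # mass gained per 16O -> 18O exchange, about 2.004246
PROTON = 1.007276


def mz_shift(n_18o: int, charge: int) -> float:
    """m/z shift of the monoisotopic peak after incorporating n_18o atoms of 18O."""
    return n_18o * DELTA_18O / charge


def count_18o(mz_16o: float, mz_18o: float, charge: int) -> int:
    """Nearest whole number of 18O atoms implied by an observed m/z shift."""
    return round((mz_18o - mz_16o) * charge / DELTA_18O)


def neutral_mass(mz: float, charge: int) -> float:
    """Neutral monoisotopic mass of an [M + zH]z+ ion."""
    return (mz - PROTON) * charge


if __name__ == "__main__":
    # The same three 18O atoms give very different m/z shifts at z = 1 and z = 4.
    for z in (1, 2, 3, 4):
        print(z, round(mz_shift(3, z), 4))
```

For a quadruply charged ion, three incorporated 18O atoms move the monoisotopic peak by only about 1.50 m/z units, whereas the same incorporation shifts a singly charged ion by about 6.01. The shift must therefore be multiplied by the charge before it can be read as a count of 18O atoms.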
For an identified cross-linked peptide pair, its entire set of fragment ions is assigned and cross-linked sites are located by AssignXLink, the third program in the software package. Here we demonstrate the use and discuss the properties of Pro-CrossLink by going through the analysis of a set of mass spectrometric data that led to the identification of a cross-linked peptide pair in a human cytochrome P450 2E1-human cytochrome b5 (CYP2E1-b5) complex that is important in cytochrome P450 structural and functional studies.

(6) Yao, X.; Freas, A.; Ramirez, J.; Demirev, P. A.; Fenselau, C. Anal. Chem. 2001, 73, 2836-42.
(7) Yao, X.; Afonso, C.; Fenselau, C. J. Proteome Res. 2003, 2, 147-52.
(8) Back, J. W.; Notenboom, V.; de Koning, L. J.; Muijsers, A. O.; Sixma, T. K.; de Koster, C. G.; de Jong, L. Anal. Chem. 2002, 74, 4417-22.
(9) Collins, C. J.; Schilling, B.; Young, M.; Dollinger, G.; Guy, R. K. Bioorg. Med. Chem. Lett. 2003, 13, 4023-6.
(10) Huang, B. X.; Kim, H. Y.; Dass, C. J. Am. Soc. Mass Spectrom. 2004, 15, 1237-47.

Analytical Chemistry, Vol. 78, No. 7, April 1, 2006

EXPERIMENTAL SECTION

Materials. Cross-linking reagent 1-ethyl-3-(3-dimethylaminopropyl)carbodiimide hydrochloride (EDC) was purchased from Pierce Biotechnology, Inc. (Rockford, IL). Sequencing grade modified trypsin was from Roche Applied Science (Indianapolis, IN). 18O-Labeled water (99+% atom 18O) was purchased from Isotec (Miamisburg, OH). Coomassie brilliant blue R, TFA, dithiothreitol (DTT), and iodoacetamide were purchased from Sigma-Aldrich (St. Louis, MO). HPLC solvents were of the highest grade commercially available and were used as received. All other reagents were analytical grade.

Cross-Linking Reaction and Proteolytic Digestion. All enzymes used for cross-linking reactions were dialyzed against dialysis buffer A (50 mM KPi, pH 7.4, containing 20% glycerol). CYP2E1 and b5 were mixed by gentle stirring for 10 min at room temperature, and then the solution was held at room temperature for 2 h. EDC was added to 8 mM final concentration from a 100 mM stock solution. The reaction was allowed to proceed at room temperature for 2 h. The cross-linking reaction was quenched by removal of EDC through dialysis against dialysis buffer A. Glycerol was removed by a second dialysis against dialysis buffer B (50 mM potassium phosphate buffer, pH 7.4). The sample was dried completely in a Speed Vac and resuspended in 6 M urea, 100 mM Tris-base, pH 8.0, and the protein concentration was adjusted to ∼2 mg/mL. A 50-µL aliquot of the protein sample was transferred to another microcentrifuge tube and reduced by adding 10 µL of 100 mM Tris-base, pH 8.0, containing 100 mM DTT. The reaction was carried out at room temperature for 1 h. The subsequent alkylation reaction was initiated by adding 30 µL of 100 mM Tris-base, pH 8.0, containing 500 mM iodoacetamide.
The reaction was allowed to proceed in the dark at room temperature for 1 h. The sample was subsequently diluted in 50 mM ammonium bicarbonate and centrifuged using an Ultrafree-4 Centrifugal Filter Unit (Millipore, Billerica, MA). The dilution and centrifugation steps were repeated three times. The sample was then split into two equal aliquots, which were dried completely in the Speed Vac. The two dried samples were reconstituted in two digestion solutions (50 mM ammonium bicarbonate, pH 8.5, with the addition of sequencing grade trypsin at an enzyme/substrate ratio of 1:25 (w/w), prepared with 16O- and 18O-water, respectively). The digestion was allowed to proceed at 37 °C for 24 h, and the reaction was quenched with 0.1% TFA.

Mass Spectrometric Analysis. Peptide digests were analyzed by on-line nano-LC/ESI tandem mass spectrometry (MS/MS) using a quadrupole time-of-flight (QTOF) mass spectrometer equipped with a CapLC system (Waters, Milford, MA) and by ESI in the positive ion mode on a hybrid ion trap-Fourier transform ion cyclotron resonance (IT-FT-ICR) mass spectrometer (Thermo Electron Corp., San Jose, CA). The instrument settings and experimental conditions are described in the Supporting Information.


PROGRAM DEVELOPMENT

Development of Program DetectShift. DetectShift-I and DetectShift-II were developed to analyze multiply charged peptide ions obtained from ESI-MS and singly charged peptide ions from MALDI-MS, respectively. DetectShift selects peptides incorporating a user-specified number of 18O atoms. The original *.raw files acquired with the MassLynx software (Micromass, Cambridge, U.K.) are converted from profile data to centroid data using the MassLynx Accurate Mass Measure function, and all the centroid data are exported into *.txt files using the MassLynx DataBridge function. Two types of signal intensity threshold can be specified: (1) an absolute intensity threshold for all signals in all scans and (2) a percentage value, which is multiplied by the base peak intensity in a scan to yield the intensity threshold for all signals in that scan. Because retention time can shift between chromatographic runs even of the same sample, as noted above, signals of the 16O-digest are compared to those of the 18O-digest within a retention time window, which can be specified in either minutes or MS scans.

Development of Program IdentifyXLink. IdentifyXLink identifies an inter- or intramolecular cross-linked peptide pair based on two criteria: (1) precursor ion mass and (2) tandem mass spectrum of the peptide. IdentifyXLink first builds a cross-linked peptide pair database based on the following: (1) input of known protein sequences, (2) specified cross-linked residue types, and (3) the mass adjustment resulting from the cross-linking reaction. IdentifyXLink reads the monoisotopic peak and charge state of the precursor ion selected by DetectShift and matches the precursor ion to those in the cross-linked peptide pair database within a user-specified error range.
The user can either end the identification process and output all such matches or proceed to further narrow down the results by matching the tandem mass spectrum. IdentifyXLink generates a database of terminal fragment ions (y and b ions) for each cross-linked peptide pair resulting from precursor ion mass matching. The peaks in the tandem mass spectrum are matched to the fragment ion database. If the percentage of matched fragment ions relative to total terminal fragment ions exceeds a user-specified value, the cross-linked peptide pair is output in the result window. Depending on the number of recorded fragment ions in the tandem mass spectrum, the suggested range for the percentage value is between 20 and 50%. If there is more than one result, IdentifyXLink sorts the results in order of decreasing probability score. A peptide pair receives a higher probability score if a higher percentage of its fragment ions is matched.

Development of Program AssignXLink. For a given cross-linked peptide pair, AssignXLink generates an extensive fragment ion list and conducts tandem mass spectral assignments. AssignXLink follows the nomenclature of fragment ions proposed recently for protein cross-linking.12 AssignXLink generates a-, b-, c-type, x-, y-, z-type, internal, and immonium ions with associated common losses of H2O, NH3, CO, and CO2. Both single and double cleavages (fragmentation of a peptide backbone at a single site and two sites, respectively) on “H-shaped” cross-linked peptide pairs are considered.

Figure 1. Flowchart of the software package Pro-CrossLink.

(11) Kaji, H.; Saito, H.; Yamauchi, Y.; Shinkawa, T.; Taoka, M.; Hirabayashi, J.; Kasai, K.; Takahashi, N.; Isobe, T. Nat. Biotechnol. 2003, 21, 667-72.
(12) Schilling, B.; Row, R. H.; Gibson, B. W.; Guo, X.; Young, M. M. J. Am. Soc. Mass Spectrom. 2003, 14, 834-50.

RESULTS AND DISCUSSION

Deciphering Cross-Links of CYP2E1 and b5. The protein molecular masses of CYP2E1 and b5 are approximately 55 and 16 kDa, respectively. Judged by Coomassie Blue staining, treatment of an equimolar mixture of CYP2E1 and b5 with EDC resulted in the formation of one major product, a CYP2E1-b5 complex, with a molecular mass of ∼70 kDa (lane 2 of Figure S-1, Supporting Information). This complex was absent in the control sample containing the protein mixture without EDC (lane 3 of Figure S-1). The presence of both CYP2E1 and b5 polypeptide chains in the complex was verified by in-gel digestion, LC-MS/MS analysis, and database searching with MASCOT (Matrix Science, London, U.K.).

Selection of Cross-Linked Peptide Pair Candidates by DetectShift. The CYP2E1-b5 complex digested with trypsin was analyzed by nano-LC-MS on an ESI-QTOF mass spectrometer. Peptides eluted from 20 to 40 min (Figure S-2, Supporting Information) with ∼1200 MS1 mass spectra, each containing 15-40 precursor ions. Due to the use of a split-flow system, the retention time of ions in the 16O-digest lagged behind that of the corresponding ions in the 18O-digest by ∼0.5-1 min. Manual comparison of corresponding ions in both samples within a retention time shift window across the entire chromatographic time would have taken more than 1 month. However, DetectShift narrowed the number of cross-linked peptide pair candidates from ∼33 000 precursor ions to 29 within minutes using the parameters provided (Figure 2).

Identification and Characterization of Cross-Linked Peptide Pairs by IdentifyXLink and AssignXLink. The candidates of cross-linked peptide pairs selected by DetectShift subsequently underwent LC-MS/MS analysis on a hybrid IT-FT-ICR MS using an inclusion list that contained the retention time and m/z values of the candidate ions. As an example, Figure S-3 (Supporting Information) shows a quadruply charged ion at m/z 907.43 selected by DetectShift. The incorporation of three 18O atoms identified it as a cross-linked peptide pair candidate. A tandem mass spectrum obtained for this ion is shown in Figure 3. With the parameters noted in Figure 4 and with the observed precursor ion mass and the tandem mass spectrum of the precursor ion as input, IdentifyXLink assigned this precursor ion to an intermolecular cross-linked peptide pair, b5(E48-R68)-CYP2E1(Y423-K434). While DetectShift, the first program in our software package, is specifically for 18O-labeling experiments, IdentifyXLink applies more generally to the characterization of cross-linked peptide pairs. Unlike other programs (such as ASAP,3 FindLink,8 X-Link,13 NIH-XL,14 and CLPM15) designed to select candidates of cross-linked peptide pairs by matching peptide masses, IdentifyXLink is designed to characterize a cross-linked peptide pair by using both the peptide mass and the associated tandem mass spectrum.
Compared to a program designed for the identification of cross-linked peptide pairs at the proteomic level,16 IdentifyXLink narrows down the number of investigated proteins and conducts an exhaustive search of the protein sequence space regardless of the protease used. Therefore, IdentifyXLink tolerates expected and unexpected amide bond cleavages that occur during proteolysis and allows residue modifications that arise in different experimental designs.
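The precursor-matching step that IdentifyXLink performs can be sketched roughly as follows. This Python example is a minimal illustration under stated assumptions, not the authors' implementation: the function names and the peptide labels and masses are hypothetical placeholders, and the mass adjustment of one water loss is our assumption for a zero-length carbodiimide (EDC) cross-link that forms an amide bond.

```python
# Minimal sketch, not the IdentifyXLink source; names and masses are hypothetical.
from itertools import product

WATER = 18.010565   # monoisotopic H2O (Da)
PROTON = 1.007276   # proton mass (Da)

# Assumption: an EDC cross-link forms an amide bond with loss of one water.
EDC_ADJUSTMENT = -WATER


def neutral_mass(mz: float, charge: int) -> float:
    """Neutral monoisotopic mass of an [M + zH]z+ precursor ion."""
    return (mz - PROTON) * charge


def build_pair_db(peptides_a: dict, peptides_b: dict, adjustment: float) -> dict:
    """Map every (peptide_a, peptide_b) combination to its cross-linked mass."""
    return {
        (name_a, name_b): mass_a + mass_b + adjustment
        for (name_a, mass_a), (name_b, mass_b)
        in product(peptides_a.items(), peptides_b.items())
    }


def match_precursor(mz: float, charge: int, pair_db: dict, tol_ppm: float) -> list:
    """Return (pair, error_ppm) for database entries within tol_ppm, best first."""
    observed = neutral_mass(mz, charge)
    hits = []
    for pair, mass in pair_db.items():
        error_ppm = (observed - mass) / mass * 1e6
        if abs(error_ppm) <= tol_ppm:
            hits.append((pair, error_ppm))
    return sorted(hits, key=lambda hit: abs(hit[1]))


if __name__ == "__main__":
    # Placeholder peptide masses, chosen only so that one pair matches.
    protein_a = {"pepA1": 1500.0, "pepA2": 2400.0}
    protein_b = {"pepB1": 1200.0, "pepB2": 1243.7015}
    db = build_pair_db(protein_a, protein_b, EDC_ADJUSTMENT)
    print(match_precursor(907.43, 4, db, tol_ppm=10.0))
```

Passing the same peptide dictionary for both arguments would cover intramolecular pairs. An exhaustive, protease-independent search of the kind described above would populate the dictionaries with every subsequence of each protein rather than with tryptic peptides only.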

Figure 2. Interface of DetectShift and the parameter setting for the selection of cross-linked peptide pair candidates.
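The two filters that DetectShift exposes, an intensity threshold and a retention time window for pairing 16O and 18O signals, can be illustrated with a short Python sketch. It is not the DetectShift code: the names `Peak`, `apply_threshold`, and `find_shifted` are hypothetical, the charge state is assumed to be known already, and applying the stricter of the two thresholds is our assumption about how they combine.

```python
# Illustrative sketch of DetectShift-style filtering; all names are hypothetical.
from dataclasses import dataclass

DELTA_18O = 17.9991610 - 15.9949146  # Da gained per 16O -> 18O exchange


@dataclass(frozen=True)
class Peak:
    rt_min: float     # retention time of the scan, in minutes
    mz: float         # centroided monoisotopic m/z
    intensity: float  # signal intensity


def apply_threshold(scan, absolute=0.0, relative=0.0):
    """Keep peaks of one scan at or above the cutoff.

    relative is a fraction of the scan's base peak intensity. Assumption:
    when both thresholds are given, the stricter one is used.
    """
    if not scan:
        return []
    base_peak = max(p.intensity for p in scan)
    cutoff = max(absolute, relative * base_peak)
    return [p for p in scan if p.intensity >= cutoff]


def find_shifted(peaks_16o, peaks_18o, charge, n_18o, rt_window_min, mz_tol):
    """Pair each 16O peak with 18O peaks shifted by n_18o labels at this charge."""
    expected = n_18o * DELTA_18O / charge
    pairs = []
    for p16 in peaks_16o:
        for p18 in peaks_18o:
            close_in_time = abs(p18.rt_min - p16.rt_min) <= rt_window_min
            shift_matches = abs((p18.mz - p16.mz) - expected) <= mz_tol
            if close_in_time and shift_matches:
                pairs.append((p16, p18))
    return pairs


if __name__ == "__main__":
    light = [Peak(27.3, 907.43, 100.0)]
    heavy = [Peak(26.6, 908.9332, 80.0)]  # made-up 18O counterpart
    print(find_shifted(light, heavy, charge=4, n_18o=3,
                       rt_window_min=1.0, mz_tol=0.01))
```

A retention time window of about 1 min would accommodate the ∼0.5-1 min lag reported here for the split-flow system. A practical tool would also need to infer charge states from isotope spacing and test several values of `n_18o`, which this sketch omits.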


Figure 3. IT-FT-ICR tandem mass spectrum of the precursor ion [M + 4H]4+ = 907.43. The fragment ions generated exclusively by amide bond cleavages are labeled in red for single-cleavage ions and in blue for double-cleavage ions. (See Table S-2 for the tandem mass spectrum assignments.)

Figure 4. Interface of IdentifyXLink and the parameter setting for the identification of the cross-linked peptide pair ion [M + 4H]4+ = 907.43 at 27.3 min.
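The fragment-matching criterion used by IdentifyXLink, the percentage of theoretical b and y ions found in the tandem mass spectrum, can be illustrated with the Python sketch below. It is a simplification under stated assumptions: it handles a single linear, unmodified peptide with singly charged fragments rather than an "H-shaped" cross-linked pair, and the function names are hypothetical. The residue masses are standard monoisotopic values.

```python
# Simplified sketch: linear, unmodified peptide; singly charged b and y ions only.
# Function names are hypothetical, not from IdentifyXLink.
WATER = 18.010565
PROTON = 1.007276

# Standard monoisotopic residue masses (Da); no modifications are included.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def by_ions(sequence: str) -> list:
    """Return m/z values of b1..b(n-1) followed by y1..y(n-1), all 1+."""
    masses = [RESIDUE_MASS[res] for res in sequence]
    b_ions, y_ions = [], []
    running = 0.0
    for mass in masses[:-1]:            # N-terminal fragments
        running += mass
        b_ions.append(running + PROTON)
    running = 0.0
    for mass in reversed(masses[1:]):   # C-terminal fragments
        running += mass
        y_ions.append(running + WATER + PROTON)
    return b_ions + y_ions


def matched_fraction(theoretical: list, observed: list, tol: float) -> float:
    """Fraction of theoretical ions with an observed peak within tol (m/z)."""
    if not theoretical:
        return 0.0
    matched = sum(
        1 for t in theoretical if any(abs(t - o) <= tol for o in observed)
    )
    return matched / len(theoretical)


if __name__ == "__main__":
    ions = by_ions("GAK")
    spectrum = [58.03, 147.11, 300.0]   # made-up peak list
    score = matched_fraction(ions, spectrum, tol=0.01)
    print(f"{score:.0%} of terminal fragment ions matched")
```

A candidate would be reported when this fraction exceeds the user-specified percentage, for which the text suggests 20-50%, and candidates would be ranked by decreasing fraction. Extending the sketch to cross-linked pairs would mean adding the mass of the partner peptide and the cross-link adjustment to every fragment that contains the linked residue.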

In the identified intermolecular cross-linked peptide pair, b5(E48-R68)-CYP2E1(Y423-K434), there is more than one pair of residues that can possibly be cross-linked, so the location of the cross-linked sites depends on a complete assignment of the fragment ions in the tandem mass spectrum. Program AssignXLink generates an extensive library of fragment ions for the peptide pair and conducts tandem mass spectrum assignments (Table S-1, Supporting Information). Among the 151 assigned fragment ions, those generated exclusively by amide bond cleavages are labeled (Figure 3; Table S-2, Supporting Information) with the single-cleavage ions in red and the double-cleavage ions in blue. AssignXLink facilitates the unambiguous identification of the cross-linked residue pair, D53 in b5 and K428 in CYP2E1.

CONCLUSION

We have developed Pro-CrossLink, a suite of software programs including DetectShift, IdentifyXLink, and AssignXLink, for the rapid and automated identification of cross-linked peptide pairs. To access the software Pro-CrossLink and obtain more information, please visit http://depts.washington.edu/medchem/faculty/NelsonS.html or http://goodlab.mchem.washington.edu/.

(13) Taverner, T.; Hall, N. E.; O’Hair, R. A.; Simpson, R. J. J. Biol. Chem. 2002, 277, 46487-92.
(14) Sinz, A.; Wang, K. Biochemistry 2001, 40, 7903-13.
(15) Tang, Y.; Chen, Y.; Lichti, C. F.; Hall, R. A.; Raney, K. D.; Jennings, S. F. BMC Bioinformatics 2005, 6 (Suppl 2), S9.
(16) Chen, T.; Jaffe, J. D.; Church, G. M. J. Comput. Biol. 2001, 8, 571-83.

ACKNOWLEDGMENT

This work was supported by NIH grant GM32165 (S.D.N.), the UW NIEHS sponsored Center for Ecogenetics and Environmental Health grant P30ES07033, an NCRR high-end instrumentation award 1S10RR17262-01 (D.R.G.), and the WWAMI RCE for biodefense and emerging infectious diseases 1U54AI57141-01 (D.R.G.).

SUPPORTING INFORMATION AVAILABLE

Figures S-1, S-2, and S-3, Tables S-1 and S-2, and Experimental Procedures. This material is available free of charge via the Internet at http://pubs.acs.org.

Received for review July 27, 2005. Accepted February 3, 2006. AC051339C
