
Chem. Res. Toxicol. 2003, 16, 598-608

An Integrated Approach To Identifying Chemically Induced Posttranslational Modifications Using Comparative MALDI-MS and Targeted HPLC-ESI-MS/MS

Maria D. Person, Terrence J. Monks, and Serrine S. Lau*

Center for Molecular and Cellular Toxicology, Division of Pharmacology & Toxicology, College of Pharmacy, The University of Texas at Austin, Austin, Texas 78712

Received December 4, 2002

Identification of multiple and novel posttranslational modifications remains a major challenge in proteomics. The present approach uses comparative analysis by matrix-assisted laser/desorption ionization (MALDI) MS of proteolytic digests from control and treated proteins to target differences due to modifications, without initial assumption as to type or residue localization. Differences between modified and unmodified digest MS spectra highlight peptides of interest for subsequent tandem mass spectrometry (MS/MS) analysis. Targeted HPLC-electrospray ionization (ESI)-MS/MS is then used to fragment peptides, and manual de novo sequencing is used to determine the amino acid sequence and type of modification. This strategy for identifying posttranslational modifications in an unbiased manner is particularly useful for finding modifications produced by exogenous chemicals. Successful characterization of chemically induced posttranslational modifications and novel chemical adducts is given as an example of the use of this strategy. Histone H4 from butyrate-treated LLC-PK1 cells is separated on a gel into bands representing different overall charge states. Bands are analyzed by comparative MALDI-MS and LC-MS/MS to identify the sites of methylation and acetylation. Previous attempts to identify chemically adducted proteins in vivo have been unsuccessful in part due to a lack of understanding of the final adduct form. Cytochrome c is adducted in vitro with benzoquinone, an electrophilic metabolite of benzene capable of interacting with nucleophilic sites within proteins. De novo sequencing identifies a novel cyclized diquinone adduct species as the major reaction product, targeting Lys and His residues at two specific locations on the protein surface. This unpredicted reaction product is characterized using our unbiased methods for detection and demonstrates the important influence of protein structure on chemical adduction.

* To whom correspondence should be addressed. Tel: (512)471-5190. Fax: (512)471-5002. E-mail: [email protected].

¹ Abbreviations: MALDI, matrix-assisted laser/desorption ionization; ESI, electrospray ionization; MS/MS, tandem mass spectrometry; PTM, posttranslational modification; TOF, time-of-flight; PSD, postsource decay; QqTOF, quadrupole time-of-flight; SALSA, scoring algorithm for spectral analysis; ACN, acetonitrile; PAWS, protein analysis work sheet.

Introduction

The full characterization of chemically induced PTMs¹ remains a major challenge in the field of toxicology (1). Selective detection techniques such as radiolabeling and western analysis have been used to identify specifically modified proteins, and MS is used increasingly to determine the sites and types of modification. However, the relative abundance of modifications may be low, and in mass spectrometric analysis, modified peptides may not be detectable among a large background of unmodified peptides. In other instances, the modifications may be reasonably abundant but appear at unexpected masses, such as multiple modifications of the same peptides or novel modifications produced by chemical treatment.

Various strategies have been used to increase the success of mass spectrometric detection of PTMs, and one such strategy involves enrichment and/or selective detection of suspected or predicted modifications. This approach has been employed for the detection of phosphorylated proteins and peptides by use of immunoprecipitation or metal affinity chromatography, followed by the selective mass spectrometric detection of the phosphopeptides (2-5). Selective detection of acetylated Lys has been achieved by monitoring a Lys specific immonium ion (6). Several studies have used MALDI-TOF to compare and identify differences between samples, either at the protein level or following proteolysis (7-13). However, these analyses often do not identify the specific site of the modification. Enzymatic treatments to either add or remove specific modifications have also proved to be a particularly useful strategy. For example, the use of protein phosphatases to dephosphorylate peptides permits the application of MALDI alone to identify the phosphorylation sites (11).

A major drawback of the MALDI-TOF approach when applied to the analysis of PTMs is the difficulty of sequencing peptides with MALDI PSD fragmentation. While it is possible to obtain excellent PSD spectra, the spectra are frequently incomplete due to the low efficiency of fragmentation and the necessity for collecting multiple spectra at different mirror ratios. If the site and/or type of modification are unknown, then high quality fragmentation spectra are required for de novo sequencing. Thus, while MALDI-TOF is a powerful tool for comparative spectral analysis, when the type or mass of

10.1021/tx020109f CCC: $25.00 © 2003 American Chemical Society Published on Web 04/16/2003

Comparative MS Analysis of Protein Modifications

modification is not known, it may not provide the sequencing data necessary to identify the modification. Alternatively, the high quality sequencing information available via MS/MS fragmentation has been used to characterize modifications first identified by MALDI-TOF. This strategy was used successfully to identify amino acid mutations responsible for transthyretin variants seen in clinical samples (14). Comparative MALDI-TOF in conjunction with phosphatase treatment has been used to select phosphopeptides for LC-MS/MS analysis (15). Histone acetylation and methylation sites have also been identified by combined MALDI PSD and ESI-MS/MS (16, 17). The recently developed MALDI-TOF/TOF instrument (18) and MALDI-QqTOF instruments (19, 20) can also deliver high quality MS/MS spectra and have been successfully used for phosphopeptide analysis with or without metal affinity chromatographic enrichment (5, 21).

An alternative approach to enrichment for a specific type of modification uses search algorithms to mine the collected mass spectrometric data for a variety of modifications. Currently used protein identification database search engines have limited capacity for identifying modifications, due to combinatorial limitations. Software specifically designed to find modifications has thus been developed. The FINDMOD program has been used to automatically identify potential modifications from a set of 22 possible modifications using the list of peptide masses, with MALDI PSD used to confirm the identification (22). A multidimensional LC-MS/MS shotgun approach employs software algorithms able to detect four kinds of modifications (phosphorylation, acetylation, methylation, and oxidation) in complex samples (23). Another promising approach employed by the SALSA program uses characteristic prominent fragment or loss ions or sequence tags to rank the spectra (24-26). The approach can successfully identify glycosylation, phosphorylation, and GSH-conjugated sites.
While the exact mass of the modification does not need to be known, the type of modification and its characteristic fragmentation ions or the amino acid sequence targeted are required. The strategies developed to date are based on finding the site of a predicted type of modification, with a known modification mass. No software-based technique is able to simultaneously search the data for the over 200 known protein modifications (27). New modifications may be discovered in the future, and known modifications may be found on unexpected residues.

Besides natural modifications, chemically induced protein modifications can also occur and may play a critical role in the toxicological consequences following chemical exposure. The effects of toxic chemicals on proteins are largely unknown. The chemical may induce changes in patterns of normal PTMs, or the chemical itself (or a metabolite thereof) may become adducted to target proteins. The form of the protein adduct may vary and may undergo subsequent reactions into additional new reaction products. To identify such changes, the methods used must be as unbiased as possible. Even with naturally occurring PTMs, multiple modifications may exist on a single peptide, and peptides seen postdigestion may represent nonspecific cleavage products, where a mass alone is not sufficient for characterization. To analyze these types of samples, we have developed an unbiased strategy that combines comparative MALDI-TOF with targeted MS/MS specifically designed for identifying unknown modifications.

Chem. Res. Toxicol., Vol. 16, No. 5, 2003 599

An initial MALDI screen, comparing treated/control samples to identify any differences between the spectra, allows us to rapidly assess whether potentially modified peptides are detectable in the digest mixture, giving a qualitative indication of abundance. Unlike prior applications of comparative MALDI-MS, no assumption is made with respect to the type, number, or location of the modification(s), nor to the specificity of the protease used. Accurate mass measurements can be used to identify the type of modification involved. Multiple sample preparation techniques can be easily compared. LC-MS/MS sequencing then provides information on type and location of modification. Using targeted MS/MS based on the initial MALDI results enables acquisition of high quality spectral information on less abundant species. The high quality MS/MS is particularly useful when identifying multiple and chemically induced modifications and becomes critical in the analysis of novel chemical adducts. This approach is applied to the analysis of histone H4 modifications from butyrate-treated LLC-PK1 cells and benzoquinone adducts on cytochrome c.

Histone protein function is controlled by PTMs (28). Histones undergo modification by acetylation, methylation, phosphorylation, ADP-ribosylation, ubiquitination, and glycosylation with multiple target sites, many localized to the protein N-terminus (29). Thus, modified species will exist in a variety of multiply modified forms. By chemically inhibiting histone deacetylase activity followed by Triton-acid-urea gel electrophoresis, which separates proteins based on relative charge, we obtain samples enriched in native acetylation sites. MS/MS is then used to identify different sites and types of modification.
The toxicological activity of quinones resides not only in their ability to undergo “redox cycling” and thereby create an oxidative stress but also in their electrophilic character, which permits interaction with cellular nucleophiles, including proteins (30). While the toxicological activity of quinones has been demonstrated, the mechanism by which this occurs is not fully understood. The protein adduct(s) produced in vivo have yet to be characterized. It is known that quinones preferentially target Cys residues within small peptides (24). However, whether quinones target the same residues in the intact protein or whether the higher order structure of the protein induces novel chemistries remains unknown. Here, we demonstrate that a novel cyclized diquinone adduct is the dominant product in the reaction of benzoquinone with cytochrome c, a key protein involved in apoptosis (31, 32).

Materials and Methods

Chemicals. HPLC grade solvents were purchased from EM Science (Cincinnati, OH); acetic acid was purchased from Aldrich (Milwaukee, WI); 1,4-benzoquinone was purchased from Fluka (Buchs, Switzerland); sequencing grade chymotrypsin was purchased from Roche (Indianapolis, IN); sequencing grade trypsin was purchased from Promega (Madison, WI); α-cyano-4-hydroxycinnamic acid and sinapinic acid were from the Sequazyme Peptide Mass Standards Kit (PerSeptive Biosystems, Framingham, MA); and other reagents and proteins were from Sigma (St. Louis, MO).

Hyperacetylation and Separation of Histone H4. LLC-PK1 cells (mini pig renal proximal epithelial cells) were pretreated with a histone deacetylase inhibitor, sodium butyrate (5 mM), for 12 h. Histones were extracted from the cells, and


70 µg of protein was loaded and electrophoretically resolved on a Triton-acid-urea gel. Proteins were overloaded on the gel to permit visualization of the acetylated histone subtypes on a Coomassie Blue-stained gel. The gel was dried onto filter paper for storage.

In-Gel Digestion. Separated histone H4 bands with one, three, and five acetyl groups were digested in-gel with chymotrypsin using a modified Rosenfeld procedure (33). Individual gel bands were rehydrated and destained in 50% methanol and 5% acetic acid overnight, and the filter paper backing was removed. The individual bands were cut into 1 mm pieces. Gel pieces were dehydrated with ACN, and residual ACN was evaporated in a SpeedVac. Gel pieces were subjected twice to washing (100 mM NH4HCO3 for 10 min) and dehydration (5 min in ACN). Gels were dried for 2-3 min in a SpeedVac and rehydrated on ice with 20 ng/µL of porcine Sequencing Grade Chymotrypsin in 50 mM NH4HCO3, pH 8, for 10-15 min. Gel pieces were digested overnight at room temperature. After digestion, peptides were extracted twice in 75 µL of 5% formic acid/50% ACN. The samples were then dried to about 10 µL in a SpeedVac and desalted using C18 ZipTips (Millipore, Bedford, MA) according to the manufacturer’s protocol using 5 µL of a 0.1% trifluoroacetic acid, 50% ACN elution buffer. Samples (0.5 µL) were subjected to MALDI-TOF analysis. Remaining sample volumes were dried and resuspended in 0.1% trifluoroacetic acid in deionized distilled water prior to LC-MS/MS analysis.

Cytochrome c Adduction with Benzoquinone. Horse heart cytochrome c was dissolved in deionized distilled water at a concentration of 10 mg/mL. 1,4-Benzoquinone was dissolved in methanol at 20 mg/mL. Cytochrome c was reacted with 1,4-benzoquinone at a molar ratio of 1:10 at room temperature for 15 min. The samples were centrifuged using Microcon 10 kDa centrifugal filter membrane tubes (Millipore) and resuspended in deionized distilled water two times to remove excess benzoquinone. Samples were reduced with 10 mM DTT in the dark for 45 min and then alkylated with 20 mM iodoacetamide in the dark for 1 h. The proteins were centrifuged through Microcon filters again two times and diluted in 60 mM NH4HCO3, pH 6. Each protein was spotted onto the MALDI target, and whole protein spectra were acquired. Finally, the samples were digested overnight at room temperature in the dark with Sequencing Grade Trypsin added at 2% (w/w). The digested proteins were analyzed by MALDI-TOF and LC-MS/MS.

MALDI-TOF. MALDI-TOF spectra were acquired on the delayed extraction 1.3 m linear flight tube Voyager-DE PRO (PerSeptive Biosystems, Framingham, MA) instrument in the positive ion mode. The instrument was equipped with a nitrogen laser operating at 337 nm with a 20 Hz firing rate. For digested samples, the reflectron detector was used with the low mass gate set at 600, and spectra were acquired over the mass range of 600-2500 Da for solution digest samples or 900-2500 Da for in-gel digest samples. The standard instrument method parameter file “angiotensin_reflector” was used for data acquisition, with a resolution of 9000 typically obtained. The matrix used was α-cyano-4-hydroxycinnamic acid, mixed 1:1 with the sample and drop dried on a stainless steel target in a total volume of 1 µL. Histone H4 digest spectra were internally calibrated on 4-6 theoretical histone chymotryptic digest peptide masses that were sufficiently abundant in the spectra and the chymotrypsin autolytic fragment at m/z = 1523.8182. Cytochrome c digest spectra were calibrated on theoretical cytochrome c tryptic peptide masses. For the whole protein, spectra were acquired in linear mode using the “myoglobin_linear” instrument settings file over the mass range 3000-30 000 Da, with single point internal calibration using the cytochrome c peak and typical resolution of 300. Sinapinic acid was used as the matrix for protein spectra.

Theoretical Digests. The protein sequences of histone H4, H4_HUMAN, and cytochrome c, CYC_HORSE, were obtained from the SWISS-PROT database (ca.expasy.org/sprot/) (34). MSDIGEST in the Protein Prospector suite (prospector.ucsf.edu)

loaded on the Proteomics Solution 1 data analysis system (PerSeptive Biosystems) was used to calculate theoretical digest masses based on a chymotryptic or tryptic digest, with up to two missed cleavages and unmodified cysteines (35). The N-terminal acetylation was a fixed modification, and for histone H4, the K acetylation was variable. The PAWS program v. 2002.01.05 (Genomics Solutions, Ann Arbor, MI; http://65.219.84.5/paws.html) was used for identification of peptides based on accurate mass measurements using protein sequences obtained from the SWISS-PROT database. For both proteins, the sequence was modified to include acetylation of the N-terminal residue. The Find Any Peptide search mode was used with 20 ppm search tolerance. Modification masses of +42.01, +84.02, +126.03, or +168.04 were used for histone. Modification masses of +103.99, +106.01, and +108.02 were used for cytochrome c. Monoisotopic masses with MH+ were used for the search.

LC-ESI-MS/MS. For histone experiments, a microspray Magic 2002 HPLC system (Michrom, Auburn, CA) coupled in-line with an LCQ electrospray ion trap mass spectrometer (ThermoFinnigan, San Jose, CA) was used. A resistive splitter (Michrom Magic Variable Splitter) was used to split the HPLC flow from 45 µL/min input to approximately 0.4 µL/min sent through the column. A custom built microspray interface was mounted on the LCQ according to the design of Gatlin et al. (36). The column was a PicoFrit (New Objective, Woburn, MA) Aquasil C18 (5 µm) 75 µm i.d. × 5 cm column. A 60 min linear gradient from 5 to 65% B was used to elute the digest peptides with liquid phases A (0.5% acetic acid, 0.005% TFA in water) and B (0.5% acetic acid, 0.005% TFA, 90% ACN, 10% water). For the cytochrome c digest, the microbore HPLC system was used to separate the tryptic fragments using a 0.5 mm × 5 mm C18 column (Michrom) and a 60 min linear gradient from 5 to 65% B at 20 µL/min, in-line with the standard ESI interface.
The LCQ acquired a single MS over the m/z range 360-2000 followed by three data-dependent MS/MS scans on the most intense ions with dynamic exclusion. Spectra were acquired in the centroid mode. For cytochrome c, targeted MS/MS was acquired on the singly, doubly, and triply charged ions of the peptide at m/z 1869, and the singly and doubly charged ions of the 1109, 1229, and 1478 peptides selected from the differential MALDI spectra. Peptide sequences were identified using the database search software TurboSEQUEST. The search was performed using a database containing only the protein of interest and assuming no enzyme digest. For histone H4, Lys and Ser acetylation (+42) were entered as variable modifications, and for cytochrome c, Lys, His, and Arg were allowed to have variable modification masses of either +104, +106, or +108. Resulting hits with Xcorr scores greater than 1 were verified by manual inspection.

De Novo Peptide Sequencing and Adduct Identification. Manual de novo sequencing of MS/MS spectra was used to identify peptide sequence and adduct mass and location. From the mass differences between three to five adjacent ions in the MS/MS spectra, a short amino acid sequence tag is generated. The sequence may represent either an N to C terminal tag if it is a b ion series or a C to N terminal tag if it is a y ion series. These two possibilities are searched for in the cytochrome c sequence using the PAWS program feature Find Amino Acids, and one provides a unique match. A tryptic peptide is then assigned to the spectrum, and the theoretical mass is calculated using MSISOTOPE in Protein Prospector. This value is compared to the experimentally measured mass derived from the MALDI-MS with a mass accuracy of 20 ppm. The difference between these numbers represents the mass of the adduct.

Molecular Modeling. The 3D model of the cytochrome c structure was generated using the horse cytochrome c sequence obtained from the Swiss-Prot database (34) at http://us.expasy.org/sprot/ by the program Swiss-Model and viewed with the Swiss-PdbViewer (37) available at the Swiss-Prot website http://www.expasy.ch/spdbv/.
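The de novo sequencing step described under De Novo Peptide Sequencing and Adduct Identification can be illustrated with a minimal Python sketch. It is not the PAWS or Protein Prospector implementation. The function names, the 0.3 Da gap tolerance, and the fragment ladder are assumptions made for illustration: the ladder holds calculated b2 to b6 ions of the N-terminally acetylated histone H4 peptide SGRGKGGKGL rather than measured data, and the residue masses are standard monoisotopic values.

```python
import re

# Standard monoisotopic residue masses (Da). Ile is omitted because it is
# isobaric with Leu; "L" stands for either residue in this sketch.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def tag_candidates(ladder, tol=0.3):
    """Return the residues compatible with each gap between adjacent ions.

    tol (Da) is an assumed tolerance; at this width Lys and Gln are both kept.
    """
    gaps = [high - low for low, high in zip(ladder, ladder[1:])]
    return [sorted(aa for aa, mass in RESIDUE_MASS.items() if abs(gap - mass) <= tol)
            for gap in gaps]


def find_tag(candidates, protein):
    """Locate the tag read N to C (b series) and reversed (y series).

    Returns 0-based start positions for both reading directions.
    """
    protein = protein.replace("I", "L")  # Leu and Ile cannot be told apart by mass

    def positions(cands):
        pattern = "".join("[" + "".join(c) + "]" if c else "." for c in cands)
        # A lookahead keeps overlapping matches.
        return [m.start() for m in re.finditer("(?=" + pattern + ")", protein)]

    return {"forward": positions(candidates), "reverse": positions(candidates[::-1])}


def adduct_mass(measured_mh, theoretical_mh):
    """Mass of the modification: measured MH+ minus theoretical MH+."""
    return measured_mh - theoretical_mh


# Hypothetical ladder: calculated b2..b6 ions of Ac-SGRGKGGKGL.
ladder = [187.071, 343.172, 400.194, 528.289, 585.310]
tag = tag_candidates(ladder)
print(tag)
print(find_tag(tag, "SGRGKGGKGL"))
# Measured 1000.561 (Table 1) against the singly acetylated theoretical MH+.
print(round(adduct_mass(1000.561, 958.5429), 3))
```

Only one reading direction yields a match in this example, which mirrors the unique-match criterion in the text. The K/Q ambiguity is left unresolved on purpose, since a gap tolerance of this width cannot separate the two residues.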


Table 1. Peptides with Acetylation/Methylation Modifications in Histone H4 MALDI Spectra

H4 nonacetylated              H4 diacetylated                             H4 tetraacetylated
958.548ᵃ (1-10 + 1 acetyl)ᵇ   958.537, 1000.561 (1-10 + 1 or 2 acetyls)   1042.563 (1-10 + 3 acetyls)
                                                                          1404.853 (11-22 + 2 acetyls + 1 methyl)
                              1418.872 (11-22 + 2 acetyls + 2 methyls)    1418.869 (11-22 + 2 acetyls + 2 methyls)
                              1696.946 (1-17 + 3 acetyls)                 1780.982 (1-17 + 5 acetyls)
                              1834.006 (1-18 + 3 acetyls)                 1918.037 (1-18 + 5 acetyls)ᶜ

ᵃ Experimentally measured masses. ᵇ Sequence and modification assignments are based on agreement with theoretical MALDI-MS masses within a mass tolerance of 20 ppm. ᶜ Assignment verified by MS/MS.
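The 20 ppm criterion behind the assignments in Table 1 can be checked with a short calculation. The sketch below is illustrative only and is not the PAWS search: the function names are hypothetical, the residue and group masses are standard monoisotopic values, and the 1-10 peptide is taken to be SGRGKGGKGL as stated in the Results.

```python
# Monoisotopic residue masses (Da) for the residues in this peptide only.
RESIDUE_MASS = {"S": 87.032028, "G": 57.021464, "R": 156.101111,
                "K": 128.094963, "L": 113.084064}
WATER = 18.010565    # H2O added for the free termini
PROTON = 1.007276    # charge carrier for MH+
ACETYL = 42.010565   # C2H2O added per acetyl group


def mh_plus(sequence, n_acetyl=0):
    """Theoretical monoisotopic MH+ of a peptide carrying n_acetyl acetyl groups."""
    backbone = sum(RESIDUE_MASS[aa] for aa in sequence)
    return backbone + WATER + PROTON + n_acetyl * ACETYL


def ppm_error(measured, theoretical):
    """Signed mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


# (measured m/z from Table 1, total acetyl groups assumed for the assignment)
observations = [(958.548, 1), (958.537, 1), (1000.561, 2), (1042.563, 3)]

for measured, n_acetyl in observations:
    theoretical = mh_plus("SGRGKGGKGL", n_acetyl)
    error = ppm_error(measured, theoretical)
    verdict = "accept" if abs(error) <= 20 else "reject"
    print(f"{measured:.3f}  {n_acetyl} acetyl  {theoretical:.4f}  {error:+.1f} ppm  {verdict}")
```

The acetyl count includes the N-terminal acetyl group, so one acetyl corresponds to the band called nonacetylated by convention.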

Figure 1. MALDI-TOF spectra of histone H4 non-, di-, and tetraacetylated bands digested in-gel with chymotrypsin. Peaks that are unique to a single spectrum are marked with an asterisk. Peaks shifting (indicated by arrows) by the mass of one acetyl group (42 Da) or two acetyl groups (84 Da) are underlined. Spectra are individually normalized to the most intense peak. Chymotrypsin autolysis peaks are shown in italics.
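The peak-shift comparison marked in Figure 1 can be sketched as a pairwise search for spacings equal to a whole number of acetyl groups. This is a simplified illustration, not the procedure used by the authors, who aligned the spectra by inspection. The function name, the limit of four acetyl groups, and the 0.05 Da spacing tolerance are assumptions; the peak lists are the measured masses reported in Table 1.

```python
ACETYL = 42.010565  # monoisotopic mass of one acetyl group (C2H2O), Da


def find_acetyl_shifts(lower_band, upper_band, max_groups=4, tol=0.05):
    """Pair peaks whose spacing matches 1..max_groups acetyl groups.

    tol is an absolute tolerance in Da on the spacing (an assumed value).
    Returns (lower m/z, upper m/z, number of acetyl groups) tuples.
    """
    pairs = []
    for low in lower_band:
        for high in upper_band:
            for n in range(1, max_groups + 1):
                if abs((high - low) - n * ACETYL) <= tol:
                    pairs.append((low, high, n))
    return pairs


# Peak lists taken from Table 1 (diacetylated and tetraacetylated bands).
diacetylated = [958.537, 1000.561, 1418.872, 1696.946, 1834.006]
tetraacetylated = [1042.563, 1404.853, 1418.869, 1780.982, 1918.037]

for low, high, n in find_acetyl_shifts(diacetylated, tetraacetylated):
    print(f"{low:.3f} -> {high:.3f}: +{n} acetyl")
```

A spacing match only flags a candidate pair. Confirming that both peaks derive from the same peptide still requires the accurate mass assignment or MS/MS.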

Results

Histone H4 Acetylation. Mini pig kidney LLC-PK1 cells were pretreated with the histone deacetylase inhibitor sodium butyrate to increase the level of histones acetylated at the biologically relevant sites. Histones were extracted from these cells and separated on a Triton-acid-urea gel (Tikoo and Monks, unpublished results). Five bands are present for histone H4, representing increasing levels of acetylation. Because the N-terminal Ser of histone H4 is constitutively acetylated, the lowest band represents a single acetylation at this site, which is called the nonacetylated form by convention. The higher bands have from one to four variable acetyl groups on histone H4. Bands 1, 3, and 5, representing non-, di-, and tetraacetylated histone H4, were excised and digested in-gel with chymotrypsin. The resulting MALDI-TOF spectra are shown in Figure 1. The three spectra are aligned, and those peaks that differ are highlighted by asterisks. Several of the peaks appear to show a progressive shift either of +42 or +84 when moving between the spectra of non- to di- to tetraacetylated histone H4, as shown by underlining in Figure 1. Such a shift would be expected when additional acetylation is present. For example, the peak at m/z 958 is seen in nonacetylated histone, while 958 and 1000 are seen in diacetylated and 1042 in tetraacetylated histone. The peak at 958 is assigned to the chymotryptic fragment for the protein N-terminus, SGRGKGGKGL, with N-terminal acetylation. There are two additional sites for acetylation in the peptide on K5 and K8, one of which is partially occupied in the diacetylated form and both occupied in the tetraacetylated form. Likewise, a shift from m/z 1834 in the diacetylated band to 1918 in the tetraacetylated band can be accounted for by the addition of two acetyl groups.

It is important to remember that there are a number of factors besides acetylation that can account for differences between the spectra. These include other types of modifications, differences in digestion cleavage patterns between bands, differing levels of chymotrypsin autolysis and keratin contamination, and differences in peak signal-to-noise between spectra. Using internal calibration, the mass accuracy of the MALDI mass measurement is