Use of MEDUSA-Based Data Analysis and Capillary HPLC-Ion-Trap Mass Spectrometry To Examine Complex Immunoaffinity Extracts of RbAp48

Tarikere Gururaja,† Weiqun Li,† Jim Bernstein, Donald G. Payan, and D. C. Anderson*

Rigel, Incorporated, 240 East Grand Avenue, South San Francisco, California 94066

Received February 27, 2002

To examine the Jurkat cell interaction partners of RbAp48, we digested entire immunoaffinity extracts with trypsin and identified potential interacting proteins using one- and two-dimensional microcapillary HPLC-ion-trap mass spectrometry. An Oracle-based automated data analysis system (MEDUSA) was used to compare quadruplicate anti-RbAp48 antibody affinity extracts with two sets of quadruplicate control extracts. The anti-RbAp48 extracts contained over 40 difference 1D gel bands. We identified all known proteins of the NuRD/Mi-2 complex including human p66. Three potential homologues of members of this complex were also found, suggesting that there may be more than one variant of this complex. Eleven proteins associated with RNA binding or pre-mRNA splicing were observed. Four other proteins, including a putative tumor suppressor, were identified, as were 18 ribosomal proteins. There was little overlap with RbAp48-interacting proteins defined by yeast two-hybrid methods. These results demonstrate the analysis of a complex immunoaffinity extract and suggest a more complex cellular role for RbAp48 than previously documented.

Keywords: affinity mass spectrometry • protein interactions • proteomics • RbAp48 • protein interaction networks • capillary LC/MS/MS

Introduction

The determination of interacting proteins or protein complex members for a number of proteins has utilized both interaction screens in lower organisms and isolation of individual complexes. To more broadly examine potential interacting partners before detailed biochemical studies of individual complexes, it may be useful to have methods that do not depend on previous knowledge of interacting proteins or guesses of potential interactors, as required for Western blotting. Methods are also needed for a direct examination of interactors in the context of the exact cells of interest rather than relying on extrapolation of interactions obtained using obligate fusion proteins (baits and library members) in lower organisms that may lack the complex regulatory mechanisms that can control protein interactions.

The nuclear protein RbAp48, which contains six WD repeat domains, is an important subunit in the core of different protein complexes that directly or indirectly control gene expression. RbAp48 is part of the histone deacetylase complex that binds the tumor suppressor and cell cycle regulatory protein Rb,1,2 part of the chromatin remodeling complex chromatin assembly factor-1 where it may bind histones,3 part of several chromatin remodeling/transcriptional repression complexes,2,4-7 and part of a transcriptional activation complex.8 RbAp48 is a constituent of the four-protein core of histone deacetylase complexes, along with RbAp46, histone deacetylase 1 (HDAC1), and HDAC2.6 An HDAC1 complex deacetylates the global tumor suppressor p53, repressing p53-dependent transcriptional activation and modulating p53-dependent cell cycle arrest and apoptosis.9 Complexes containing RbAp48 are thus likely to be important for oncology.

Here, as a first step in the construction of an integrated system to address complex human cell interactions, we have used tandem mass spectrometry-based sequencing of affinity extracted proteins, from quadruplicate affinity extracts, to identify a number of known interacting partners and other proteins reproducibly present in the extracts. We first screened antibodies to RbAp48 for use in affinity extraction from lysed T cell acute lymphoblastic leukemia Jurkat tumor cells. Due to the complexity of the data generated, we developed software to allow efficient data processing. Using the Oracle database MEDUSA to control different programs which digitally subtract control affinity extracts from trypsinized cellular affinity extracts analyzed by one- and two-dimensional capillary high-performance liquid chromatography (HPLC)-mass spectrometry,10-11 we examine the use of a reproducibility filter requiring the presence of a protein in at least half of the quadruplicate affinity extracts. We also compare proteins identified from a nonredundant human protein database with those identified from the 90% complete Ensembl human genome database. Besides detection of nine known RbAp48 interacting proteins, we find 36 additional proteins that can be divided into several functional classes. The data suggest the hypothesis that RbAp48 may be involved in functions additional to those previously assumed, that a single protein may be involved in a number of different cellular interactions or complexes, and that cellular interaction networks derived in this fashion may be quite different from those derived from traditional interaction screens. The use of MEDUSA, tested here, allows the comparison of complex datasets and facilitates a more rapid examination of interacting partners of novel cellular proteins, a first step in defining their mechanism of cellular activity. This may be particularly appropriate for the examination of novel peptide and protein hits derived from functional screens in mammalian cells.

* To whom correspondence should be addressed. E-mail: dcanderson@rigel.com.
† These authors contributed equally to this work.
10.1021/pr0255147 CCC: $22.00 © 2002 American Chemical Society
Journal of Proteome Research 2002, 1, 253-261. Published on Web 05/02/2002

Experimental Section

Cell Lysis, Immunoaffinity Extraction, and Gel Electrophoresis. Human Jurkat acute lymphoblastic leukemia cells (ATCC, Manassas, VA) were cultured in RPMI 1640 medium (Mediatech, Herndon, VA) supplemented with 10% fetal bovine serum, penicillin, and streptomycin. Cells (10^8 for each affinity extraction) were pelleted, washed twice in phosphate-buffered saline, and suspended in 5 mL of 4 °C 2% Triton X-100 containing a cocktail of protease inhibitors (Complete Tablets, Boehringer Mannheim, Germany), 20 mM tris buffer pH 7.2, 0.15 M sodium chloride, and 1 mM ethylenediamine tetraacetic acid (EDTA). The cells were homogenized by repeated uptake into a 1 mL pipetman tip and then sonicated on ice using 20 pulses (1 pulse/s) at maximum recommended power in a Branson (Danbury, CT) probe-tip sonicator. After centrifugation at 14000g for 20 min at 4 °C, the supernatant was cleared of proteins that may bind the beads or protein A/G by end-over-end tumbling with 250 µL of a suspension of agarose beads covalently cross-linked to protein A/G plus (Santa Cruz Biotechnology, Santa Cruz, CA). The protein content was measured using a Micro BCA protein kit (Pierce Chemical, Rockford, IL). The protein concentration was normalized to 5 mg/mL in all the affinity extractions. After centrifugation, 1 mL of the supernatant was added to 250 µL of a slurry of agarose beads covalently attached to protein A/G plus previously crosslinked12 to the anti-RbAp48 antibody 13D10 (Upstate Biotechnology, Lake Placid, NY) using dimethylpimelimidate (Pierce Chemical Co., Rockford, IL). Quadruplicate immunoextractions were carried out overnight at 4 °C. Beads containing RbAp48 and bound proteins were washed three times with 0.05 M Tris-0.15 M sodium chloride-0.1% Triton X-100 followed by two washes with 0.05 M Tris-0.15 M sodium chloride-1 mM EDTA.
Quadruplicate control extracts under identical conditions were obtained using an anti-glucuronidase rabbit polyclonal antibody and separately a rabbit polyclonal anti-GFP antibody (Molecular Probes, Eugene, OR). Both antibodies were independently cross-linked to protein A/G Plus-agarose. The beads containing extracted proteins were washed with 0.05 M Tris-0.15 M sodium chloride with no detergent before addition of trypsin. Washed beads were boiled in SDS-PAGE 2X sample buffer (Novex, San Diego, CA) containing 100 mM dithiothreitol (DTT). SDS-PAGE separation was on a Novex (San Diego, CA) 4-20% gradient Tris-glycine gel. Proteins electroblotted onto PVDF membranes (Novex, San Diego, CA) were probed with appropriate antibodies and developed using an ECL Plus enhanced chemiluminescence reagent kit followed by detection on ECL hyperfilm (Amersham Pharmacia Biotech, Piscataway, NJ). In separate experiments, DNAse I (Roche Diagnostics, Indianapolis, IN) was added to a final concentration of 100 µg/mL for 30 min at 4 °C to cleave DNA; its activity was terminated by the addition of EDTA to 50 mM.

Proteolytic Digestion of Affinity-Extracted Proteins. Agarose beads (50 µL) from affinity extracts were vortexed in 50 µL of 8 M urea, 0.1 M Tris (pH 8.5), 20 mM methylamine, 20 mM calcium chloride, and 2 mM EDTA for 30 min at room temperature. After centrifugation, DTT was added to the supernatant in a separate tube to 10 mM, and the solution was incubated for 2 h at room temperature. Iodoacetamide was added to 30 mM, the solution was incubated for 1 h in the dark at room temperature, and the alkylation was quenched for 30 min with DTT added to 10 mM. The solution was diluted to 4 M urea, and lys-C endoprotease (Worthington Biochemical Corporation, Lakewood, NJ) was added to 2 µg/mL and incubated at 37 °C for 15 h.10 This allows initial digestion in more denaturing conditions than possible with the use of trypsin alone. The solution was further diluted to 2 M urea, and trypsin was added to 5 µg/mL for a 10 h incubation at 37 °C. The reaction was stopped by addition of 1 µL glacial acetic acid, and samples were stored at -76 °C until further use.

Microcapillary HPLC and Mass Spectrometry. Each tryptic digest sample was analyzed on an automatic nano LC-MS/MS system consisting of an LC Packings (San Francisco, CA) capillary HPLC coupled to an LCQ ion-trap mass spectrometer (ThermoFinnigan, San Jose, CA). A fused silica microcapillary HPLC column (15 cm × 75 µm i.d.) was packed with 5 µm, 100 Å pore Nucleosil C18 resin (Macherey-Nagel, Germany) according to Kennedy and Jorgensen.13 The column was connected to a 15 µm PicoTip (New Objective Inc., Woburn, MA) electrospray tip through a stainless steel zero dead volume union to which the electrospray voltage was applied.
An injected sample was first trapped and desalted on a microprecolumn cartridge (LC Packings, San Francisco, CA) at 10 µL/min for 10 min and back-flushed into the capillary column by an acetonitrile gradient at ca. 200 nL/min. Elution consisted of a 20 min gradient from 100% solution A (5% acetonitrile, 0.1% formic acid) to 90% A-10% solution B (80% acetonitrile, 0.1% formic acid), followed by a 60 min gradient from 10 to 30% B, a 10 min gradient from 30 to 50% B, a 10 min gradient from 50 to 80% B, and a hold at 80% B for 10 min. A “2D” experiment involved LC/MS/MS with precursor ions scanned from 350 to 1800 m/z, and three separate runs, with precursor scans over the mass ranges m/z 350-650, 640-810, and 800-1800.11 LC/MS/MS experiments were carried out with data-dependent acquisition (one MS scan followed by six MS/MS scans selecting the six most intense precursor ions not on the mass exclusion list) and dynamic exclusion (1 min for an m/z 2 exclusion window).

SEQUEST Database Search and MEDUSA Analysis. Collected MS/MS data were analyzed using TurboSequest software (ThermoFinnigan, San Jose, CA), which uses the SEQUEST algorithm, against a human protein sequence database derived from the NCBI nonredundant database (http://www4.ncbi.nlm.nih.gov) and, separately, from the Ensembl human gene (confirmed and predicted) 1.1.0 database (http://www.ensembl.org). All human immunodeficiency virus protein sequences were removed from the human protein database before database searches by a database tool in the Xcalibur software (ThermoFinnigan, San Jose, CA). SEQUEST14,15 scores were evaluated both by the criteria of Yates et al.,15 which includes peptides with a ∆Cn of greater than 0.1, and separately by the more stringent criteria of Washburn et al.16 and Gygi et al.17 These criteria include peptides with a ∆Cn of greater than 0.1 and Xcorr values greater than 1.9 for +1 ions, 2.2 for +2 ions, and 3.75 for +3 ions for fully tryptic peptide fragments. Only proteins identified by at least one peptide meeting the more stringent criteria were accepted as identified. All fragment ion spectra were manually examined for matches to peptide sequences after all other data processing was completed. SEQUEST results were summarized and stored in a separate database in MEDUSA.

MEDUSA is a proprietary web-based Oracle 8.0 database that we have designed and coded that stores and evaluates mass spectrometry data and SEQUEST output. It runs independently of SEQUEST. It controls the in-house CGI Perl program MS2filter, which digitally compares all experimental MS/MS spectra in the quadruplicate runs with those found in each of the quadruplicate control runs. The comparison between the most intense peaks (limited to 250 to reduce calculations on noise peaks) in each experimental MS/MS spectrum and all control MS/MS spectra is made using a cross-correlation function18 after searching the control datasets for precursor ions within 1.5 Da of the experimental precursor ion being examined. For identical spectra, the cross-correlation value, representing the ratio of the cross-correlation score to the average autocorrelation score for the two peptides, was used as a cutoff. This value was based on our comparisons of MS/MS spectra known to originate from the same peptide, and from different peptides, and was 0.6. Spectra giving values below this number were classified as different spectra, and the experimental MS/MS spectrum was not removed from the experimental dataset. Spectra with higher values were removed.
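The control-subtraction step described above can be illustrated with a minimal Python sketch. This is not the MS2filter program itself (which is an in-house Perl program whose internals are not given here). The unit-width m/z binning, the zero-lag dot product used as the cross-correlation score, the dictionary layout of a spectrum, and all function names are assumptions made for illustration only; the 250-peak limit, the 1.5 Da precursor window, and the 0.6 cutoff are taken from the text.

```python
from collections import defaultdict

TOP_PEAKS = 250          # from the text: only the most intense peaks are compared
PRECURSOR_TOL_DA = 1.5   # from the text: precursor matching window
SIMILARITY_CUTOFF = 0.6  # from the text: ratio separating same/different spectra


def binned(peaks, bin_width=1.0, top_n=TOP_PEAKS):
    """Keep the top_n most intense (mz, intensity) peaks and sum them into m/z bins.

    bin_width is an assumption of this sketch, not a value from the paper.
    """
    strongest = sorted(peaks, key=lambda p: p[1], reverse=True)[:top_n]
    bins = defaultdict(float)
    for mz, intensity in strongest:
        bins[int(mz // bin_width)] += intensity
    return bins


def dot(a, b):
    """Zero-lag correlation of two binned spectra."""
    return sum(value * b.get(key, 0.0) for key, value in a.items())


def similarity(peaks_a, peaks_b):
    """Ratio of the cross-correlation score to the average autocorrelation score."""
    a, b = binned(peaks_a), binned(peaks_b)
    mean_auto = (dot(a, a) + dot(b, b)) / 2.0
    return dot(a, b) / mean_auto if mean_auto else 0.0


def filter_experimental(experimental, controls):
    """Return experimental spectra that have no matching control spectrum.

    Each spectrum is a dict with 'precursor' and 'peaks' keys (hypothetical layout).
    A spectrum is dropped when a control within the precursor window scores
    above the cutoff; a score exactly at the cutoff is kept in this sketch.
    """
    kept = []
    for spec in experimental:
        matched = any(
            abs(spec["precursor"] - ctrl["precursor"]) <= PRECURSOR_TOL_DA
            and similarity(spec["peaks"], ctrl["peaks"]) > SIMILARITY_CUTOFF
            for ctrl in controls
        )
        if not matched:
            kept.append(spec)
    return kept
```

With this normalization, two identical spectra score exactly 1.0 and spectra with no shared bins score 0.0, so the 0.6 cutoff sits between the two extremes.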
The Xcorr scores of the +2 and +3 ions associated with each precursor ion in the SEQUEST search were compared, and the ion with the lower score was removed; if Xcorr scores were identical, the +2 ion was chosen. The remaining experimental MS/MS spectra were analyzed using defined cutoffs for individual parameters, for example, selecting peptides with a ∆Cn value of 0.1 or greater,15 or using more stringent criteria.16,17 The individual peptides remaining after this analysis were then blasted19 against the same database used for SEQUEST analysis to search for ambiguities in protein identification, as may occur when more than one different protein contains the peptide, or versions of the peptide with leu and ile or gln and lys interchanged. The number of peptides of differing sequence meeting defined criteria and identifying a protein were then counted. A higher quality identification of a protein results from more independent matching peptides. MEDUSA was then used to calculate the reproducibility of each identified protein from quadruplicate experiments. The final identifications were subject to human inspection. Additional informatics websites used for further bioinformatics analysis, which could be selected by the user, included www.ncbi.nlm.nih.gov/entrez and /blast, //www4.ncbi.nlm.nih.gov/PubMed/, //www.isrec.isb-sib.ch, //blocks.fhcrc.org, //www.ensembl.org, //pfam.wustl.edu, and //bioinformatics.weizmann.ac.il. A report for a single identified protein including the original SEQUEST scores and matching peptides, ambiguities in the identified protein, sequence alignments of all matching proteins, and user-selected informatics was written to a text file stored in the database.
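The peptide-level rules in this section (charge-state selection, the stringent score cutoffs, and the leu/ile and gln/lys ambiguity) can be sketched in Python as follows. The dictionary layout of a SEQUEST hit and every function name are hypothetical, and collapsing ambiguous residues before counting peptides is an assumption of this sketch rather than a documented MEDUSA behavior; only the numeric thresholds and the tie-breaking rule come from the text.

```python
# Thresholds quoted in the text for the more stringent criteria (refs 16 and 17).
XCORR_MIN = {1: 1.9, 2: 2.2, 3: 3.75}
DELTA_CN_MIN = 0.1


def passes_stringent(hit):
    """True if a hit meets the stringent criteria.

    hit is a dict with keys charge, xcorr, delta_cn, fully_tryptic
    (a hypothetical layout chosen for this sketch).
    """
    threshold = XCORR_MIN.get(hit["charge"])
    return (
        threshold is not None
        and hit["fully_tryptic"]
        and hit["delta_cn"] > DELTA_CN_MIN
        and hit["xcorr"] > threshold
    )


def resolve_charge(hit_2plus, hit_3plus):
    """Keep the interpretation with the higher Xcorr; ties go to the +2 ion."""
    return hit_3plus if hit_3plus["xcorr"] > hit_2plus["xcorr"] else hit_2plus


def ambiguity_key(peptide):
    """Collapse residues the ion trap cannot distinguish (leu/ile, gln/lys)."""
    return peptide.replace("I", "L").replace("K", "Q")


def count_distinct_peptides(hits):
    """Count accepted peptides of differing sequence per protein."""
    peptides = {}
    for hit in hits:
        if passes_stringent(hit):
            key = ambiguity_key(hit["peptide"])
            peptides.setdefault(hit["protein"], set()).add(key)
    return {protein: len(seqs) for protein, seqs in peptides.items()}
```

Keeping each rule in its own small function mirrors the way the text applies them in sequence, and makes it easy to swap the stringent cutoffs for the looser ∆Cn-only criterion.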

Results and Discussion

Affinity Extraction. Figure 1 shows a silver-stained 1D SDS-PAGE gel of one of the anti-RbAp48 affinity extracts and a control anti-glucuronidase antibody extract. Over 40 bands are present in the affinity extracts that are not present in control extracts, including bands ranging in size from ca. 6 kDa to well over 250 kDa. The difference bands can be divided into one or more groups of more intense bands, mostly ca. 50 kDa and above in mass, and more than one set of less intense bands, mainly below ca. 50 kDa in mass. The differences in intensity suggest that there may be both relatively abundant complexes and less abundant complexes. An anti-RbAp48 Western blot (data not shown) confirmed the extraction of RbAp48 with 13D10, but showed no extraction of RbAp48 with the anti-glucuronidase antibody. The monoclonal antibody used for these affinity extractions, 13D10, does not appear to recognize RbAp46, a close homologue of RbAp48 that is also an important core component of histone deacetylase complexes, since Western blots of Jurkat cell lysates show a single band comigrating with standard RbAp48 and not a doublet including a lower molecular weight band (data not shown). The RbAp46 extracted thus may be part of one or more complexes containing RbAp48. Treatment of the cell lysate with DNAse I before affinity extraction resulted in the loss of most of the affinity extracted protein, suggesting that intact DNA may be necessary for the formation of some of the RbAp48 complexes or interactions observed here (data not shown). The affinity extractions were done in quadruplicate, the agarose beads for each were extracted with 8 M urea, and after reduction-carboxamidomethylation, the proteins were digested with lys-C endoprotease and trypsin. They were then examined by 1D and “2D” microcapillary LC/MS/MS.

Figure 1. 1D SDS-PAGE examination of the anti-RbAp48 antibody-based affinity extraction. A silver-stained gel from an affinity extraction is shown along with a control extraction with an anti-GFP antibody. 1.5 × 10^8 Jurkat cells were lysed and extracted with the anti-RbAp48 antibody 13D10 covalently attached to protein A/G-Sepharose beads. The affinity extraction has at least 40 bands not present in the control extract. Forty-five proteins present in at least two of four affinity extracts but not present in eight control runs were identified. The proteins were identified using tryptic digests of the entire affinity extracts.

MEDUSA Analysis. The functions performed by MEDUSA, an Oracle 8 database that stores data and controls other programs that perform individual steps in the analysis of data from quadruplicate affinity extracts, are outlined in Figure 2. Data collected on computers controlling individual mass spectrometers is transferred to a computer running SEQUEST, a database search algorithm.14,15 The results are then stored in and analyzed under the control of MEDUSA. Figure 2 shows the data flow in the MEDUSA-based analysis of the complex RbAp48 affinity extracts. After mass spectrometry, fragment ion spectra present in either of the two control datasets (anti-glucuronidase and anti-GFP quadruplicate extractions) were digitally removed using the program MS2filter, and SEQUEST was run on the remaining data. Proteins present in each of the four experimental affinity extracts were compared using MEDUSA, by matching accession or gi numbers, with those present in the two different quadruplicate sets of control affinity extracts. This was necessary since the control datasets sometimes contained different peptides from the same protein than the experimental datasets and the control protein would thus not be removed using MS2filter. This comparison also removes proteins present in the controls identified by peptides not removed by MS2filter, due to the cutoff being set to retain all positive peptides. Proteins present in any of the controls were removed from the experimental dataset. The frequency of remaining proteins present in the quadruplicate affinity extracts was then calculated using MEDUSA. Proteins present two or more times were selected for further analysis. Individual peptide sequences were examined for ambiguity in the proteins they identified, either due to multiple proteins containing the peptide or to uncertainties in mass that are not resolved by the ion trap (leu vs ile, gln vs lys), by a basic local alignment search tool (BLAST) search against the nonredundant database. The mass spectra were then manually compared with SEQUEST-predicted spectra for the peptide sequence, and peptides with unexplained major fragment peaks were discarded. Identified proteins were examined for known function. Novel protein sequences were compared to known proteins using BLAST19 or examined for domains and sequence motifs.

Figure 2. Steps involved in MEDUSA-based analysis of the affinity extraction data. MS/MS spectra present in controls are first digitally removed from experimental runs (consisting of sets of precursor ions and their MS/MS spectra). Proteins are then identified using SEQUEST, and peptides that do not fit MS/MS spectra well are removed using the Oracle database MEDUSA. Proteins present in at least two of the four extracts are selected. Proteins with either a gi number or accession number matching the number of a control protein are removed. This deletes proteins that may be identified by different peptides in the experimental and control extracts. MS/MS spectra of the remaining proteins are then examined, and additional information that may be related to the function of the identified proteins, including identification of homologues using BLAST, identification of domains and motifs, or literature information, is collected into the database.

The reduction in data complexity at the different steps of data analysis is shown for a typical 1D LC/MS/MS run in Figure 3. The x-axis shows the HPLC elution time, and the y-axis represents the summed intensity of the MS/MS fragment ions. Each peak represents one or more fragmented peptides. The raw data from one run contains 1386 different MS/MS spectra. After treatment with MS2filter, 1092 MS/MS spectra remain. After additional processing using SEQUEST and MEDUSA and the nonredundant human database, 105 different fragment ion spectra remain, identifying (in this particular run) 18 proteins. Many more proteins that could be mistakenly identified as potential binding partners or complex members were removed by the comparisons with control extracts, and by the requirement that the protein be identified in at least two of the four extracts. In the quadruplicate extracts, 24 different proteins were identified by 1D LC/MS/MS, 42 were identified by 2D LC/MS/MS, 3 were uniquely identified in 1D experiments, and 21 were identified uniquely in “2D” experiments. In total, 45 proteins present in at least two of the four affinity extracts were identified by a combination of both methods. Nineteen of these proteins were uniquely found searching the ENSEMBL database, and five were found only when searching the nonredundant human database. Thus, a combination of 1D and 2D methods, as well as a search of both databases, maximizes the number of MS/MS spectra that can be assigned to proteins. Inclusion of additional HPLC separations, to minimize peptide coelution and to allow more effective “dynamic exclusion” of peptides, may further increase the number of detected interacting partners. A direct isolation of protein complexes will be required to identify the components that might be in individual protein complexes.

Figure 3. Reduction in LC/MS/MS data complexity by filtering with control data. Total fragment ion current chromatograms are shown for a single 1D LC/MS/MS run (top panel). The raw data contains 1386 MS/MS spectra, most of which correspond to peptides. After removal of control MS/MS spectra using the program MS2filter (middle panel), 1092 spectra remain. After identification of cognate proteins using SEQUEST, removal of poorly identified proteins or proteins present in the control datasets, and selection of reproducibly detected proteins, 105 MS/MS spectra remained. When identifications based on the nonredundant human protein and ENSEMBL databases from both 1D and “2D” LC/MS/MS were abstracted from quadruplicate runs, 45 proteins were identified, 21 of which were unique to “2D” and three of which were unique to 1D runs.

Identified Interacting Proteins. Table 1 shows a list of identified proteins that are members of the NuRD/Mi-2 RbAp48-containing complex. The frequency and number of peptides used for identification of each protein are listed in Table 1. Every core member6 of the NuRD/Mi-2 complex was identified: RbAp48, RbAp46, HDAC1, and HDAC2. Other known members of this complex including MTA-1, MTA1-like 1/MTA2, methyl CpG binding domain protein 3, and Mi2-β are also present. p66, detected in Xenopus Mi-2 complexes but not so far in the human Mi-2 complex, was present in four of four affinity extracts. Our affinity-mass spectrometry protocol is thus able to identify all nine members of this known protein complex. Although we used published stringent criteria for acceptable peptides,16-17 there were sometimes many more peptides from the identified proteins that were present using previous criteria.15 We also identified potential homologues of members of this complex. KIAA1150 has 41% sequence identity to human p66 over 578 residues, and both contain GATA zinc finger domains found in transcriptional activators. Four of the NuRD/Mi-2 complex proteins contain zinc fingers: p66, Mi-2β (PHD zinc finger), MTA1 (GATA zinc finger), and MTA1-like 1 (GATA zinc finger). Besides KIAA1150, we identified three additional proteins in the affinity extracts that also contained zinc fingers. These include B cell lymphoma/leukemia 11B protein, which contains six predicted C2H2 DNA-binding zinc fingers, KIAA1762, which contains seven C2H2 zinc fingers and three homeobox domain regions, and Ewing sarcoma breakpoint 1 isoform EWS, which contains a ran-binding protein zinc finger. Some of these may bind DNA or RNA, possibly as part of RbAp48-containing complexes. These results suggest the possibility of additional RbAp48 complexes beyond those already known.

Table 2 lists identified nonribosomal proteins thought to be involved in RNA binding or pre-mRNA splicing. Two are homologues of each other20 with 44% sequence identity: FUS/TLS and Ewing sarcoma breakpoint region 1 isoform EWS. The Ewing sarcoma protein is known to interact with the CREB-binding protein/p300-interacting transcriptional coactivator,21 which could explain its interaction with RbAp48. However, we have not reproducibly detected CREB binding protein or p300 in these affinity extracts. FUS may act as a transcriptional regulator of cytokine receptors,22 and this could provide a connection to RbAp48 function. Both are proto-oncogenes and can transform cells when fused by chromosomal translocation to transcription factors.23-24 Both bind RNA, and FUS is thought to act as part of a pre-mRNA splicing complex,20 interacting with several heterogeneous nuclear ribonucleoproteins (hnRNPs)25 and arg-ser rich splicing factors.26 FUS is also found complexed with RNA polymerase II.27 Nine other RNA binding proteins were also present in these affinity extracts. The first includes a set of four hnRNP-like RNA binding proteins that cannot be distinguished by the detected peptides. These RNA-binding proteins include JKTBP1 and 2, thought to be involved in mRNA biogenesis,28 A+U rich element binding factor, which binds RNAs with AU rich elements,29 and hnRNP-D-like protein, which contains two RNA-recognition elements. Two additional identified RNA binding proteins are arg-ser rich splicing factors 3 and 8, which are thought to be involved in pre-mRNA splicing.30 The fourth protein identified is E1B 55kDa associated protein 5, a nuclear RNA-binding hnRNP family protein that can bind the adenovirus E1B-55kDa oncoprotein and may play a role in nuclear RNA transport.31 Also identified were the RNA-dependent DEAD box helicases p72, 5, and 9, hnRNP A2/B1, and B23.
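The protein-level selection described under MEDUSA Analysis (remove any protein whose accession or gi number appears in a control extract, then keep proteins seen in at least two of the four affinity extracts) amounts to simple set arithmetic. The Python sketch below illustrates that logic under the assumption that each run is represented as a set of accession strings; the function name and data layout are hypothetical and do not describe the MEDUSA database schema. The closing lines check that the 1D and “2D” counts reported in this section are mutually consistent.

```python
from collections import Counter


def reproducible_proteins(experimental_runs, control_runs, min_runs=2):
    """Select proteins absent from all controls and present in >= min_runs extracts.

    Each run is a set of accession (or gi) identifiers; this layout is an
    assumption made for illustration.
    """
    control_ids = set().union(*control_runs)
    counts = Counter(
        accession
        for run in experimental_runs
        for accession in set(run) - control_ids
    )
    return {accession for accession, n in counts.items() if n >= min_runs}


# Consistency check on the counts reported in the text.
total_1d, total_2d = 24, 42   # proteins identified by 1D and by "2D" runs
only_1d, only_2d = 3, 21      # proteins unique to each method
shared = total_1d - only_1d   # proteins found by both methods
assert shared == total_2d - only_2d == 21
assert only_1d + shared + only_2d == 45  # total reported for the combined methods
```

Raising `min_runs` tightens the reproducibility filter at the cost of losing low-abundance or weakly bound proteins, which is the trade-off discussed under Limitations of the Methodology.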


Table 1. Identification of Proteins Affinity-Extracted with Anti-p48 Antibody peptidesb,c

protein

NuRD/Mi-2 Complex RbAp48

representative sequences

comment

Membersa 6

4

RbAp46

7

6

HDAC 1

6

3

HDAC 2

3

2

MTA-1e

3

2

MTA-1 like 1

8

5

methyl CpG binding domain protein 3 Mi-2βd

5

5

6

2

p66

4

3

Potential Homologues of Complex Members KIAA1150 6 3 B cell lymphoma/ leukemia 11Bd

4

1

KIAA1762d

3

1

EAAFDDAVEER HPSKPDPSGECNPDLR YMPQNPHIIATK VHIPNDDAQFDASHCDSDK GVKEEVKLA LHISPSNMTNQNTNEYLEK QRLFENLR QLESLPATHIR SNMSPHGLPAR QFESLPATHIR GHLSRPEAQSLSPYTTSANR NPGVWLNTTQPLCK ACAEDDDEEDEEEEEEEPDPDPEMEHV ERTEEPMETEPKGAADVEK LLRHHYEQQQEDLAR LQNSASATALVSR GTTATSAQANSTPTSVASVVTSAESPASR VIAPNPAQLQGQR TPVVQNAASIVQPSPAHVGQQGLSK IYLEPGPASSSLTPR KPAPLPSPGLNSAAK YLCRQCKMAFDGEAPATAHQR LASLLGLASR

core component of multiple HDAC complexes binds histones; part of core of HDAC complexes deacetylates acetyl-lysines part of core chromatin remodeling complexes overexpression correlates with metastasis part of a complex that deacetylates p53 recruits HDACs to methyl-CpG enriched DNA regions to repress transcription PHD zinc finger DEAD/H box helicase GATA zinc finger subunit of Xenopus Mi-2 complex 41% identical to p66 over 578 residues; GATA zinc finger contains 6 DNA-binding C2H2 zinc fingers; impt. in Hodgkin’s and non-Hodgkin’s lymphoma contains 7 C2H2 zinc fingers

a All proteins are present in four of four affinity extracts except as noted. b Peptides with a ∆C of greater than 0.1.15 n 16 and 17. d Present in three of four affinity extracts. e Present in two of four affinity extracts.

WD repeat proteins are known to be important for mRNA modification.32 RbAp48 is 30% identical in sequence and 52% homologous over 153 residues to yeast prp46, a pre-mRNA splicing factor. The area of homology contains several motifs, including WD repeats. This homology, and the presence of multiple proteins thought to be involved in pre-mRNA splicing suggests that RbAp48 may be part of pre-mRNA splicing or processing complexes, or part of a complex containing both transcriptional and posttranscriptional elements.33 The presence of B23 (which may act as a histone chaperone in nucleosome formation) and two histones involved in nucleosome formation (see below) is consistent with the published function of RbAp48 in nucleosome formation3 and suggests that some of these proteins may be part of a larger complex involved in this function. Table 2 also includes the putative tumor suppressor serine/threonine kinase NKIAMRE.34 We also detected 18 ribosomal proteins present in the p48 affinity extracts but not in control extracts, including 12 60S subunits and 6 40S subunits. Since individual proteins can bind ribosomes,35 it is possible that RbAp48 or a complex containing RbAp48 is interacting with ribosomes or a collection of individual ribosomal proteins. It is possible that we have isolated one or more protein complexes that bind pre-mRNA or that this RNA may have attached ribosomes or ribosomal proteins. The identified proteins discussed above are grouped by complex or by putative function in Figure 4. Known RbAp48 interactors include the nine proteins of the NuRD/Mi-2 complex (also identified here), as well as other proteins in multiple RbAp48 complexes1-9 involved in transcriptional co-repression or co-activation. An additional known RbAp48-interacting protein includes the chromatin assembly factor-1 complex member p150, which was identified in the RbAp48 extract by Western blotting (data not shown). 
The presence of several different classes of proteins in this complex affinity extract, combined with the identification of only a subset of these proteins as the single NuRD/Mi-2 complex, suggests that more complexes than are currently known may be involved in RbAp48 function. The assumption that all members of an affinity extract are involved in a single complex underlies the initial organization of data from large-scale protein complex analyses.36,37 This assumption can be true for individual examples when carefully documented36 but, in view of the literature on RbAp48 and our results, may not apply here.

Table 2. Identification of Proteins Not in Known P48 Complexes (a)

| protein | peptides (b) | peptides (c) | representative sequences | comment |
|---|---|---|---|---|
| Proteins Associated with RNA Binding | | | | |
| Ewing sarcoma breakpoint region 1, isoform EWS | 2 | 2 | RTGQPMIHIYLDK; ETGKPKGDATVSYEDPPTAK | binds RNA, transcriptional co-activators; zinc finger; proto-oncogene |
| FUS/TLS (b) | 7 | 1 | GEATVSFDDPPSAK; APKPDGPGGGPGGSHMGGNYGDDR | potential pre-mRNA splicing factor; proto-oncogene |
| hnRNP-like RNA binding factor | 3 | 2 | YHQIGSGKCEIK; MFIGGLSWDTSKK | one of three pre-mRNA-associated hnRNP family RNA binding proteins |
| arg, ser-rich splicing factor 8 | 1 | 1 | YLALHTDLLEEEAR | regulates pre-mRNA splicing |
| arg, ser-rich splicing factor 3 (d) | 1 | 1 | VRVELSNGEK | pre-mRNA splicing factor |
| E1B 55kDa assoc. protein 5 (d) | 4 | 3 | RTDEEGKDVPDHAVLEMK; KYNILGTNAIMDK | regulates nuclear to cytoplasmic mRNA transport |
| RNA-dependent DEAD-box helicase p72 | 3 | 2 | NFYVEHPEVAR; APILIATDVASR | nuclear RNA-dependent ATPase, ATP-dependent RNA helicase, unwinds and anneals RNA |
| DEAD-box helicase 9 (d) | 1 | 1 | RISAVSVAER | nuclear ATP-dependent RNA helicase, unwinds double-stranded RNA and DNA |
| DEAD-box helicase 5 (e) | 1 | 1 | NFYQEHPDLAR | nuclear ATP-dependent DEAD box RNA helicase stimulated by single-strand RNA |
| hnRNP-A2/B1 (d) | 2 | 1 | SAAGNRAEATESAMEREK | involved in pre-mRNA processing; forms complexes (ribonucleosomes) with other hnRNPs and RNA |
| B23/nucleophosmin | 1 | 1 | GPSSVEDIK | in the nucleolus binds nucleolar ribonucleoprotein structures, single stranded nucleic acids; putative ribosome assembly factor; histone chaperone in nucleosome formation; binds Rb |
| Other Proteins | | | | |
| protein kinase NKIAMRE (d) | 1 | 1 | LLQEAKVNSLIKPK | putative tumor suppressor (leukemia), cdk-2 and cdk-3 related |
| osteosarcoma B1 protein (e) | 1 | 1 | ELNYDELDVEMK | expressed in Saos-2 osteosarcoma cells; no known motifs/domains |
| histone H2A.L | 1 | 1 | AGLQFPVGR | a core histone of nucleosomes |
| histone H1B | 4 | 1 | ERSGVSLAALKK; KALAAAGYDVEK | allows nucleosomes to condense into higher order structures |
| 40S, 60S ribosomal proteins | | | | L10 (3, 2), L13 (1, 1), L14 (1, 1), L26 (1, 1), L31 (4, 4), L34 (2, 1), L39 (1, 1), S14 (4, 1), S19 (1, 1) |
| 40S, 60S ribosomal proteins (d) | | | | L21 (2, 2), L23 (1, 1), L28 (2, 1), L44 (2, 1), S6 (2, 1), S25 (3, 1) |
| 40S, 60S ribosomal proteins (e) | | | | L3 (4, 3), S7 (2, 1), S17 (1, 1) |

(a) All proteins are present in four of four affinity extracts except as noted. (b) Peptides with a ∆Cn of greater than 0.1 (ref 15). (c) Peptides meeting the criteria in refs 16 and 17. (d) Present in three of four affinity extracts. (e) Present in two of four affinity extracts. Numbers for the ribosomal proteins in parentheses represent peptides present meeting criteria in b and in c, respectively.

Limitations of the Methodology. Requiring presence in at least two of four affinity extracts substantially reduces the number of identified proteins, presumably by removing those binding nonspecifically to the resin-protein A/G-antibody-RbAp48 bait. Proteins that are present at a low level, that bind to RbAp48 with lower affinity, or that compete with the 13D10 antibody for binding may be difficult to detect reproducibly with our current limit of detection, ca. 10 fmol for a tryptic peptide in a complex mixture. Examples may include proteins known to be present in RbAp48 complexes, such as SMRT, p300, or CREB-binding protein, each of which we detected in only one of the four affinity extracts. Many of the RbAp48 interacting partners we observed may thus reside in more abundant, relatively stable protein complexes or bind relatively tightly to RbAp48 compared to other documented RbAp48 interactors. Improving our detection sensitivity may allow observation of additional interacting proteins present at lower levels. Use of affinity-selected polyclonal antibodies, or an additional monoclonal antibody with a different epitope, may circumvent the possibility of the antibody competing with some complex members for binding.

Verification of some of the interactions could be straightforward if specific and tight-binding antibodies reactive with the identified proteins existed. Unfortunately, these are not available for many of the proteins identified here, and often commercial antibodies, such as those raised against synthetic peptides, do not react well with the corresponding cellular protein. Verifying interactions by Western blotting or reciprocal affinity extracts thus requires high-affinity, specific antibodies for many more human proteins than are currently available.

The number of proteins identified here is similar to the number of difference gel bands seen for the affinity extracts. The tryptic peptide sequencing identified the antigen for the antibody as one of the most abundant proteins, obvious homologues of known complex members (KIAA1150 and p66), known complex members themselves, and multiple proteins within a class (ribosomal proteins, zinc finger proteins, pre-mRNA splicing proteins). Together, these observations would seem to indicate that the affinity extraction may have isolated binding partners of defined types and not random proteins.
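The reproducibility requirement described under Limitations of the Methodology can be illustrated with a short sketch. It is not the MEDUSA implementation, whose internals are not described here. The function name, the toy protein lists, and the rule that any appearance in a control extract removes a protein are assumptions made only for illustration.

```python
from collections import Counter

def reproducible_hits(affinity_runs, control_runs, min_runs=2):
    """Keep proteins seen in at least `min_runs` affinity extracts
    and in none of the control extracts (assumed subtraction rule)."""
    # Count in how many replicate extracts each protein appears.
    counts = Counter(p for run in affinity_runs for p in set(run))
    # Pool every protein seen in any control extract.
    in_controls = set().union(*control_runs)
    return {p: n for p, n in counts.items()
            if n >= min_runs and p not in in_controls}

# Hypothetical replicate results: four affinity extracts, two controls.
affinity_runs = [
    {"RbAp48", "FUS/TLS", "S7", "keratin", "SMRT"},
    {"RbAp48", "FUS/TLS", "S7", "keratin"},
    {"RbAp48", "FUS/TLS", "keratin"},
    {"RbAp48", "FUS/TLS"},
]
control_runs = [{"keratin"}, {"keratin", "albumin"}]

hits = reproducible_hits(affinity_runs, control_runs)
print(sorted(hits.items()))
```

In this toy input the protein seen in only one extract is dropped, in the same way that SMRT, p300, and CREB-binding protein fall below the two-of-four threshold in the text, and the protein shared with the controls is removed regardless of how often it appears.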

Figure 4. Summary of proteins identified in RbAp48 affinity extracts. Known RbAp48-interacting proteins include those in the Mi-2/NuRD complex (upper left box) as well as proteins in other transcriptional co-activator or co-repressor complexes and in chromatin assembly factor 1 (middle left box). Mi-2/NuRD contains four proteins with predicted zinc fingers (blue text, italics). We have identified a homologue of the NuRD complex member p66, namely KIAA1150. We have also identified three other proteins with predicted zinc fingers (blue text, italics). One set of novel interactors includes a set of nonribosomal RNA-binding proteins (upper right box) including several splicing factors, three RNA helicases, and two homologues, FUS/TLS and Ewing sarcoma breakpoint region 1 isoform EWS. These suggest an additional function or set of functions for RbAp48-containing complexes involving mRNA or pre-mRNA. Eighteen ribosomal proteins were identified (lower right box), as were several other proteins that do not fit into the above categories. The kinase NKIAMRE is a putative tumor suppressor, and five interactors are proto-oncogenes or are overexpressed in different cancers (MTA-1, FUS/TLS, Ewing sarcoma breakpoint region 1, osteosarcoma B1 protein, and B cell lymphoma/leukemia 11B protein). This highlights the potential importance of these interactions or complexes in oncology.

About 30 proteins were identified by peptides that do not contain cysteine; thus, use of biotinylated reagents that alkylate cysteine (isotope coded affinity tags or ICAT38) as secondary affinity reagents in this type of approach would cause us to miss these proteins. Attempts at quantitation of differences in amount between the affinity extracts and controls, or between tumor and normal cells, a next step in the application of the comparative analysis presented here, may require a more universal stable isotope label allowing quantitation using each peptide.39

Comparison of Methods To Identify Protein-Interacting Partners. It may be of interest to compare RbAp48 interactions resulting from our affinity mass spectrometry approach with other approaches to detecting protein-interacting partners. A global analysis of yeast protein interactions using the yeast two-hybrid method40 detected one interacting protein for the RbAp48 yeast homologue (Msi1 or YBR195c): RLF2, the largest subunit of yeast chromatin assembly factor 1 (homologous to the human p150 subunit). A second global analysis41 identified Nem1 (nuclear envelope morphology) and cdc73 (RNA polymerase II accessory protein) but not RLF2. There is thus little overlap between the protein-interacting partners detected by our affinity mass spectrometry approach and the yeast two-hybrid method. For core data representing interactions with at least three hits, the two global two-hybrid interaction analyses have only about 20% shared interactions.41 Interactions from a large-scale analysis of yeast protein complexes36 showed only a 7% overlap with global yeast two-hybrid assays. Extrapolating two-hybrid information obtained in yeast to mammalian cells may thus be difficult, and interaction maps based on direct affinity extractions in mammalian cells may be very different from maps based on two-hybrid analyses.
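One way to put a number on the overlap between two interaction screens is sketched below. The text does not state how the cited 20% and 7% figures were computed, so the definition used here (shared pairs divided by all distinct pairs) is an assumption, as is the function name. The example pairs are the Msi1 partners reported by the two yeast two-hybrid analyses discussed above.

```python
def shared_fraction(screen_a, screen_b):
    """Fraction of all distinct interactions that appear in both screens.
    Pairs are treated as unordered, so (x, y) equals (y, x)."""
    a = {frozenset(pair) for pair in screen_a}
    b = {frozenset(pair) for pair in screen_b}
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Msi1 partners from the first (ref 40) and second (ref 41) analyses.
first_screen = [("Msi1", "RLF2")]
second_screen = [("Msi1", "Nem1"), ("cdc73", "Msi1")]

print(shared_fraction(first_screen, second_screen))
```

For Msi1 the two screens share no partner, so the fraction is zero. Dividing by the size of the smaller screen instead of the union would give a different, more lenient figure, which is one reason overlap percentages from different studies are hard to compare directly.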

Conclusions

The use of quadruplicate affinity extracts and controls, combined with 1D and 2D capillary HPLC/MS/MS-based separation and sequencing of tryptic fragments, and with MEDUSA analysis of the resulting complex datasets, has resulted in a greatly simplified set of potential RbAp48-interacting proteins, including nine known NuRD/Mi-2 complex members. Such a procedure may provide a general approach for handling datasets resulting from analysis of complex protein mixtures such as those derived from some affinity extracts. Given the identification of a number of novel potential interactors for RbAp48, one can hypothesize a more complex cellular role for this protein than previously documented. Putative RbAp48-interacting proteins include a number of ribosomal proteins, proteins thought to be part of nucleosomes (two histones, B23, p150), and proteins involved in pre-mRNA binding or splicing. It is possible that these are part of one or more large complexes. As a number of proteins are either overexpressed in cancer or act as proto-oncogenes (MTA-1, FUS/TLS, Ewing sarcoma breakpoint region 1 isoform EWS, osteosarcoma B1 protein, B cell lymphoma/leukemia 11B protein) or are putative tumor suppressors (the kinase NKIAMRE), some of these complexes will be of interest for further study. For the most interesting of these proteins, the next step will be to physically isolate individual complexes, for example by the retroviral delivery of affinity-tagged proteins of interest. Affinity mass spectrometry-based methods identifying cellular interacting proteins should be particularly appropriate for examination of interactions of novel peptide and protein library member hits active in functional screens of mammalian cells.42-45

Acknowledgment. We thank Drs. Susan Demo and Yasumichi Hitoshi for making the human p66 sequence available.

References

(1) Qian, Y.; Wang, Y.; Hollingsworth, R.; Jones, D.; Ling, N.; Lee, E. Nature 1993, 12, 648-52.
(2) Nicolas, E.; Morales, V.; Magnaghi-Jaulin, L.; Harel-Bellan, A.; Richard-Foy, H.; Trouche, D. J. Biol. Chem. 2000, 275, 9797-9804.
(3) Verreault, A.; Kaufman, P.; Kobayashi, R.; Stillman, B. Cell 1996, 87, 95-104.
(4) Kaufman, P.; Kobayashi, R.; Kessler, N.; Stillman, B. Cell 1995, 81, 1105-14.
(5) Zhang, Y.; Sun, Z.; Iratni, R.; Erdjument-Bromage, H.; Tempst, P.; Hampsey, M.; Reinberg, D. Mol. Cell 1998, 1, 1021-31.
(6) Zhang, Y.; Ng, H.; Erdjument-Bromage, H.; Tempst, P.; Bird, A.; Reinberg, D. Genes Dev. 1999, 13, 1924-35.
(7) Wade, P.; Gegonne, A.; Jones, P.; Ballestar, E.; Aubry, F.; Wolffe, A. Nat. Genet. 1999, 23, 62-6.
(8) Zhang, Q.; Vo, N.; Goodman, R. Mol. Cell Biol. 2000, 20, 4970-8.
(9) Luo, J.; Su, F.; Chen, D.; Shiloh, A.; Gu, W. Nature 2000, 408, 377-81.
(10) Link, A.; Eng, J.; Schieltz, D.; Carmack, E.; Mize, G.; Morris, D.; Garvik, B.; Yates, J., III. Nature Biotechnol. 1999, 17, 676.
(11) Spahr, C.; Davis, M.; McGinley, M.; Robinson, J.; Bures, E.; Beierle, J.; Mort, J.; Courchesne, P.; Chen, K.; Wahl, R.; Yu, W.; Luethy, R.; Patterson, S. Proteomics 2001, 1, 93-107.
(12) Harlow, E.; Lane, D. In Antibodies. A laboratory manual; Cold Spring Harbor Laboratory: New York, 2000; pp 522-523.
(13) Kennedy, R.; Jorgensen, J. Anal. Chem. 1989, 61, 1128.
(14) Eng, J.; McCormack, A.; Yates, J., III. J. Am. Soc. Mass Spectrom. 1994, 5, 976-989.
(15) Yates, J., III; Eng, J.; McCormack, A. Anal. Chem. 1995, 67, 3202-10.
(16) Washburn, M.; Wolters, D.; Yates, J., III. Nature Biotechnol. 2001, 19, 242-247.
(17) Gygi, S.; Rist, B.; Griffin, T.; Eng, J.; Aebersold, R. J. Proteome Res. 2002, 1, 47.
(18) Yates, J., III; Morgan, S.; Gatlin, C.; Griffin, P.; Eng, J. Anal. Chem. 1998, 3557-3565.
(19) Altschul, S.; Gish, W.; Miller, W.; Myers, E.; Lipman, D. J. Mol. Biol. 1990, 215, 403-410.
(20) Lerga, A.; Hallier, M.; Delva, L.; Orvain, C.; Gallais, I.; Marie, J.; Moreau-Gachelin, F. J. Biol. Chem. 2001, 276, 6807-16.
(21) Rossow, K.; Janknecht, R. Cancer Res. 2001, 61, 2690-5.
(22) Perrotti, D.; Bonatti, S.; Trotta, R.; Martinez, R.; Skorski, T.; Salomoni, P.; Grassilli, E.; Lozzo, R.; Cooper, D.; Calabretta, B. EMBO J. 1998, 17, 4442-55.
(23) Perez-Losada, J.; Pintado, B.; Gutierrez-Adan, A.; Flores, T.; Banares-Gonzalez, B.; del Campo, J.; Martin-Martin, J.; Battaner, E.; Sanchez-Garcia, I. Oncogene 2000, 19, 2413-22.
(24) Im, Y.; Kim, H.; Lee, C.; Poulin, D.; Welford, S.; Sorensen, P.; Denny, C.; Kim, S. Cancer Res. 2000, 60, 1536-40.
(25) Calvio, C.; Neubauer, G.; Mann, M.; Lamond, A. RNA 1995, 1, 724-33.
(26) Yang, L.; Embree, L.; Tsai, S.; Hickstein, D. J. Biol. Chem. 1998, 273, 27761-4.
(27) Bertolotti, A.; Lutz, Y.; Heard, D.; Chambon, P.; Tora, L. EMBO J. 1996, 15, 5022-31.
(28) Kamei, D.; Tsuchiya, N.; Yamazaki, M.; Meguro, H.; Yamada, M. Gene 1999, 228, 13-22.
(29) Doi, A.; Shiosaka, T.; Takaoka, Y.; Yanagisawa, K.; Fujita, S. Biochim. Biophys. Acta 1998, 1396, 51-6.
(30) Sarkissian, M.; Winne, A.; Lafyatis, R. J. Biol. Chem. 1996, 271, 31106-14.
(31) Gabler, S.; Schutt, H.; Groitl, P.; Wolf, H.; Shenk, T.; Dobner, T. J. Virol. 1998, 72, 7960-71.
(32) Neer, E.; Schmidt, C.; Nambudripad, R.; Smith, T. Nature 1994, 371, 297-300.
(33) Ladomery, M. Bioessays 1997, 19, 903-9.
(34) Midmer, M.; Haq, R.; Squire, J.; Zanke, B. Cancer Res. 1999, 59, 4069-74.
(35) Uchiumi, T.; Terao, K.; Ogata, K. J. Biochem. Tokyo 1980, 88, 1033-44.
(36) Gavin, A.; Bosche, M.; Krause, R.; Superti-Furga, G.; et al. Nature 2002, 415, 141-147.
(37) Ho, Y.; Gruhler, A.; Heilbut, A.; Tyers, M.; et al. Nature 2002, 415, 180-183.
(38) Gygi, S.; Aebersold, R. Curr. Opin. Chem. Biol. 2000, 4, 489-494.
(39) Yao, X.; Freas, A.; Ramirez, J.; Demirev, P.; Fenselau, C. Anal. Chem. 2001, 73, 2836-42.
(40) Uetz, P.; Giot, L.; Cagney, G.; Mansfield, T.; Judson, R.; et al. Nature 2000, 403, 623-627.
(41) Ito, T.; Chiba, T.; Ozawa, R.; Yoshida, M.; Hattori, M.; Sakaki, Y. Proc. Natl. Acad. Sci. U.S.A. 2001, 98, 4569-4574.
(42) Caponigro, G.; Abedi, M.; Hurlburt, A.; Maxfield, A.; Judd, W.; Kamb, A. Proc. Natl. Acad. Sci. U.S.A. 1998, 95, 7508-13.
(43) Norman, T.; Smith, D.; Sorger, P.; Drees, B.; O’Rourke, S.; Hughes, T.; Roberts, C.; Friend, S.; Fields, S.; Murray, A. Science 1999, 285, 591-5.
(44) Geyer, C.; Colman-Lerner, A.; Brent, R. Proc. Natl. Acad. Sci. U.S.A. 1999, 96, 8567-8572.
(45) Peelle, B.; Lorens, J.; Li, W.; Bogenberger, J.; Payan, D.; Anderson, D. C. Chem. Biol. 2001, 8, 521-534.

PR0255147

Journal of Proteome Research • Vol. 1, No. 3, 2002 261