Proteomic Studies of the Intrinsically Unstructured Mammalian Proteome

Charles A. Galea,† Vishwajeeth R. Pagala,‡ John C. Obenauer,‡ Cheon-Gil Park,† Clive A. Slaughter,‡ and Richard W. Kriwacki*,†,§

Department of Structural Biology, St. Jude Children's Research Hospital, 332 North Lauderdale Street, Memphis, Tennessee 38105, Hartwell Center for Bioinformatics and Biotechnology, St. Jude Children's Research Hospital, 332 North Lauderdale Street, Memphis, Tennessee 38105, and Department of Molecular Sciences, University of Tennessee Health Sciences Center, Memphis, Tennessee 38163

Received July 5, 2006

Intrinsically unstructured proteins (IUPs) represent an important class of proteins primarily involved in cellular signaling and regulation. The aim of this study was to develop methodology for the enrichment and identification of IUPs. We show that heat treatment of NIH3T3 mouse fibroblast cell extracts at 98 °C selects for IUPs. The majority of these IUPs were cytosolic or nuclear proteins involved in cell signaling or regulation. These studies represent the first large-scale experimental investigation of the intrinsically unstructured mammalian proteome.

Keywords: intrinsically unstructured proteins • disordered proteins • proteomics • mammalian proteome • heat denaturation

Introduction

Bioinformatics analyses of whole genomes using disorder predictors indicate that 30-40% of all eukaryotic genes encode proteins that are either wholly disordered or contain lengthy disordered segments (>50 residues).1,2 Although the existence of flexible, bioactive polypeptides (e.g., polypeptide hormones3) has been known for many years, the fact that intrinsically unstructured proteins (IUPs) play broad roles in biology has only recently been recognized.4 These roles include regulation of cell division, transcription and translation, signal transduction, protein phosphorylation, storage of small molecules, chaperone action, and regulation of the self-assembly of large multi-protein complexes such as the ribosome.4-10 Indeed, bioinformatic analyses indicate that the majority of proteins involved in eukaryotic signal transduction are disordered or contain long disordered segments.11 Further, these studies have shown that 79% of human cancer-associated proteins (HCAPs) can be classified as IUPs, compared to 47% of all eukaryotic proteins in the SWISS-PROT database,11 a measure of the importance of IUPs in mechanisms of tumorigenesis.

IUPs exhibit low sequence complexity, low hydrophobic amino acid content, and correspondingly high polar and charged amino acid content.12,13 These features are consistent with their inability to fold into globular structures and have led to the development of computer programs that predict (with up to 80% accuracy) disordered regions of proteins.12,14-18 NMR studies have shown that the degree of disorder for IUPs varies widely. Some IUPs completely lack secondary or tertiary structure19-28 whereas others, such as p27, contain partially populated secondary structure.9,10,29-36 Although unbound IUPs are disordered in solution, they often perform their biological functions by binding to other biomolecules. This binding involves a disorder-to-order transition in which IUPs adopt a highly structured conformation upon binding to their biological partners.32,33,37 In this way, IUPs play diverse roles in regulating the function of other biomolecules and in promoting the assembly of supra-molecular complexes.

Further, because sites within their polypeptide chains are highly accessible, IUPs can undergo extensive posttranslational modifications, such as phosphorylation, acetylation, and/or ubiquitination (also sumoylation, neddylation), allowing for modulation of their biological activity or function. For example, phosphorylation of p27 at Thr187 promotes recognition by the ubiquitination machinery of the SCF (Skp1/cullin/F-box protein), which leads to p27 ubiquitination and degradation by the 26S proteasome.38 Further, Akt-mediated phosphorylation of a residue within the nuclear localization signal of p27 in breast cancer cells prevents interactions with the nuclear import machinery and leads to cytoplasmic localization.39 p27, normally localized in the nucleus, encounters new targets in the cytoplasm and exhibits a gain of oncogenic function. In a further example, phosphorylation of p27 on Ser10 promotes its interaction with the shuttling protein, CRM1, leading to export from the nucleus.40 In addition to functioning through interactions with binding partners, some IUPs perform their functions by playing structural roles as molecular linkers, spacers, bristles, springs, and clocks.5

* To whom correspondence should be addressed. Dr. Richard Kriwacki, Department of Structural Biology, St. Jude Children's Research Hospital, 332 North Lauderdale St., Memphis, TN USA 38105. Tel.: +1 901 495-3290. Fax: +1 901 495-3032. E-mail: [email protected].
† Department of Structural Biology, St. Jude Children's Research Hospital.
‡ Hartwell Center for Bioinformatics and Biotechnology, St. Jude Children's Research Hospital.
§ University of Tennessee Health Sciences Center.

10.1021/pr060328c CCC: $33.50 © 2006 American Chemical Society. Journal of Proteome Research 2006, 5, 2839-2848. Published on Web 09/21/2006

Several previous studies indicated that IUPs are stable to heat denaturation. For example, it has been shown that the solubility and limited secondary structure of the two IUPs, p21 and p27, are virtually unaltered by heating to 90 °C.32,33,41-44 It is likely that resistance to thermal aggregation stems from the low mean hydrophobicity and high net charge characteristic of these proteins. Heat denaturation has been used in several studies for the purification of recombinant IUPs.43,45-47

Two recent studies have described methods for the enrichment and proteomic identification of IUPs from prokaryotic and yeast cell extracts.48,49 In the first study, Cortese and co-workers48 enriched an Escherichia coli cell extract for IUPs using a novel acid treatment strategy. This method was based on the finding that many proteins that fail to precipitate during perchloric acid or trichloroacetic acid treatment were IUPs. In the second study, Csizmok and co-workers49 utilized heat treatment coupled with a novel 2-D gel methodology to identify IUPs in cell extracts from E. coli and Saccharomyces cerevisiae. This method uses a combination of native and 8 M urea electrophoresis of heat-treated proteins where IUPs run on or close to the diagonal of the 2-D gel, whereas folded globular proteins either precipitate upon heat treatment or unfold and run off the diagonal in the second dimension. Although this method works reasonably well for the analysis of IUPs in the relatively small E. coli and S. cerevisiae proteomes, it is not practical for the analysis of large numbers of IUPs that are predicted to be present in mammalian proteomes.

Here, we present the first large-scale analysis of IUPs isolated from the proteome of mammalian cells (NIH3T3 mouse fibroblasts). Proteins were classified as IUPs, intrinsically folded proteins (IFPs), or mixed ordered/disordered character (MPs) using the disorder prediction program PONDR (http://www.pondr.com/).
Proteins having a global average PONDR score > 0.5 were classified as IUPs. In addition, proteins having a global average PONDR score of 0.32-0.5 and possessing a high mean net charge and low mean hydrophobicity were also classified as IUPs. Proteins having a global average PONDR score < 0.32 were classified as IFPs. In addition, proteins having a global average PONDR score of 0.32-0.5 and possessing a low mean net charge and high mean hydrophobicity were also classified as IFPs. Proteins that did not meet the criteria set for IUPs or IFPs were defined as MPs. Using these criteria, we show that heat treatment can be used for the efficient enrichment of IUPs and depletion of proteins containing folded globular domains. We demonstrate that this methodology selectively enriches cytosolic and nuclear IUPs involved in cellular signaling and regulation.

Material and Methods

Cell Culture. Mouse NIH3T3 fibroblast cells were maintained in Dulbecco's modified Eagle's medium (DMEM) supplemented with 10% fetal bovine serum and 2 mM glutamine. Cells were grown at 37 °C in a humidified incubator with a 5% CO2 atmosphere. For large-scale experiments, cells were grown on 20 cm × 20 cm plates that yield approximately 1 × 10⁷ cells at 80% confluence.

Intrinsically Unstructured Protein (IUP) Enrichment. NIH3T3 cells (8 × 10⁷) grown to approximately 80% confluence were washed three times with cold PBS buffer (1.5 mM KH2PO4 and 2.7 mM Na2HPO4, pH 7.2, containing 155 mM NaCl), harvested with a cell scraper, and suspended in 200 mL of cold PBS. The cells were washed a further two times in cold PBS and then resuspended in 1 mL of Buffer A (10 mM sodium phosphate buffer, pH 7.0, 50 mM NaCl, 50 mM DTT, 1× protease inhibitor cocktail (Roche Diagnostics, Indianapolis, IN) and 0.1 mM sodium orthovanadate). The cell suspension was homogenized using a Kontes pellet pestle homogenizer (Fisher Scientific, Hampton, NH) and then centrifuged at 16 000 × g in a benchtop Eppendorf 5415 C centrifuge (Eppendorf, Westbury, NY) at room temperature for 30 min. The supernatant was transferred to a fresh tube, diluted to a protein concentration of approximately 1 mg/mL with Buffer A, and heated at 60 °C for 10 min or 98 °C for 1 h. Following heating, the protein mixture was placed on ice for 15 min and then spun at approximately 16 000 × g (Eppendorf 5415 C) for 15 min at room temperature to pellet aggregated proteins. Soluble proteins in the supernatant were TCA/acetone precipitated, and the pellet was stored at -80 °C for further analysis.

Limited Trypsin Digest of IUP Enriched Cell Extracts. Heat-treated (98 °C for 1 h) NIH3T3 cell extracts were prepared in the absence of DTT and protease inhibitors as described above. To two tubes containing 350 µL of heat-treated (98 °C for 1 h) NIH3T3 cell extract (0.88 mg/mL protein), which had not been TCA/acetone precipitated, was added either 4.0 µL of 0.1 µg/µL trypsin (Promega, Madison, WI) prepared in 50 mM acetic acid or 4.0 µL of acetic acid alone. The mixtures were incubated at room temperature for 1 min, and the reaction was terminated by the addition of 100 µL of a solution containing 3.5 mM PMSF and 8.0× protease inhibitor cocktail. Proteins in the supernatant were TCA/acetone precipitated, and the pellets were stored at -80 °C for further analysis.

Two-Dimensional (2-D) Gel Electrophoresis. The TCA/acetone precipitated pellet containing a total of 156 µg of protein was dissolved in 300 µL of rehydration buffer (8 M urea, 2 M thiourea, 2% CHAPS, 50 mM DTT, 0.2% Bio-lyte ampholytes (Bio-Rad Laboratories, Hercules, CA) and 0.001% bromophenol blue).
Isoelectric focusing was performed with immobilized pH gradient (IPG) strips (pH 3-10, 17 cm linear gradient; Bio-Rad Laboratories). IPG strips were rehydrated by overnight incubation with sample at room temperature. Isoelectric focusing was performed for approximately 7 h until 40 000 V h was achieved. Isoelectric-focused strips were equilibrated in 375 mM Tris-HCl (pH 8.8), 6 M urea, 2% DTT, 2% SDS at room temperature for 15 min and then in the same buffer containing 2.5% iodoacetamide for a further 15 min. The strips were then transferred to a 1-mm thick SDS polyacrylamide gel (10%) and sealed in place using 2% low-melting temperature agarose gel. SDS-PAGE was performed at 25 mA for 4.5 h, and gels were stained with SYPRO Ruby dye (Molecular Probes Inc., Eugene, OR).

Gel Image Analysis. Gel images were analyzed using Progenesis Workstation (Nonlinear Dynamics, Newcastle, England) or the web-based Ludesi 2D interpreter software (Ludesi AB, Lund, Sweden, http://www.ludesi.com). Spot detection was manually edited for noise and missing spots. Spot volumes were determined by measuring the integrated intensity for each spot, followed by background correction and normalization. Normalization removed systematic gel intensity differences, due to varying protein loading, staining, and scanning time of individual gels, by mathematically minimizing the median intensity differences between gel pairs.

Preparation of Enzymatic Digests and Mass Spectrometric Analysis. Proteins of interest were excised with a ProPic Spot Picker (Genomic Solutions, Inc., Ann Arbor, MI), and digested with sequencing-grade modified trypsin, supplied frozen (Promega Corp., Madison, WI), using a ProGest digestion station (Genomic Solutions, Inc., Ann Arbor, MI). Peptides released from gel plugs were then extracted, purified using C18 ZipTips (Millipore Corp., Bedford, MA), and spotted onto targets for mass spectrometry using a ProMS spotting robot (Genomic Solutions, Inc.). Mass spectrometric analysis was performed using a Model 4700 Proteomics Analyzer (Applied Biosystems, Foster City, CA) operated in reflector mode. This instrument employs matrix-assisted laser desorption/ionization (MALDI), in conjunction with tandem time-of-flight (TOF) mass analyzers. α-Cyano-4-hydroxycinnamic acid was used as matrix. Samples were applied in a 5 mg/mL matrix solution containing 2 mM ammonium citrate to suppress signals from matrix clusters. A timed ion selector operating at a resolution setting of 200 was employed for data-dependent selection of precursor ions for collision-induced dissociation. The mass scale was calibrated using both Plate Calibration and Default Calibration routines using a 4700 standard peptide mixture spanning the m/z range 900-3700.

Spectral Search Parameters and Protein Identification Criteria. Database searches were performed with the MASCOT v2.1 search engine (Matrix Science Ltd., London, England).50 Protein assignments were made by combining MS and MS/MS spectral matches. Searches were conducted against all entries in the nonredundant NCBInr database (1st August 2005 to 9th January 2006). A precursor ion m/z accuracy of 150 ppm was selected. Protein assignments with a MASCOT Protein Score of 77 or better and having both MS and MS/MS data were recorded. In addition to the 915 proteins meeting these criteria, a further 42 proteins were included. Thirty-six of these additional proteins were identified on the basis of mass spectral data that failed to meet the Protein Score threshold on some gels but exceeded it on others.
The remaining 6 proteins were identified on the basis of mass spectral data that exceeded the target Protein Score of 77 but did not include MS/MS data. See the Supporting Information for specific details regarding the analysis of mass spectral data.

Bioinformatics Analysis of Protein Disorder and Database Searching. A Perl script was used to retrieve protein sequences from the nonredundant NCBI nr database or a local database using GI number identifiers. Each residue in the sequence was scored for its average level of disorder using the commercially available program PONDR (Molecular Kinetics, Inc., Indianapolis, IN). For each protein, the number of 50-residue segments with average PONDR scores > 0.5 was determined using a 10-residue sliding window. The lengths of sequences meeting this criterion were tabulated. Charge-hydropathy (CH) values were also calculated from protein sequences using PONDR. IUPs usually lie on the left-hand side of a line defined by the equation R = 2.785 × H - 1.151 on a plot of mean net charge (R) versus mean hydrophobicity (H), whereas globular proteins normally reside on the right-hand side of the plot.17 This is because IUPs tend to have higher mean net charge and lower mean hydrophobicity when compared with globular proteins.

Proteins were classified as disordered (IUPs), folded globular proteins (IFPs; intrinsically folded proteins), or mixed ordered/disordered character (MPs) according to the following criteria. Proteins having a global average PONDR score > 0.5 were classified as IUPs. In addition, proteins having a global average PONDR score of 0.32-0.5 and which resided more than 0.0065 units on the H axis to the left of the boundary line on the CH plot were also classified as IUPs. Proteins having a global average PONDR score < 0.32 were classified as IFPs. In addition, proteins having a global average PONDR score of 0.32-0.5 and which resided more than 0.0065 units on the H axis to the right of the boundary line on the CH plot were also classified as IFPs. Proteins which did not meet the criteria set for IUPs or IFPs were defined as MPs.

The results of the automated analysis were stored in a MySQL database and accessed through a web interface written in PHP. The web interface displayed protein identifications and PONDR analysis results. Data could be sorted according to charge, hydropathy, average PONDR score, and other parameters to facilitate manual analysis. The molecular weights and pI values for each protein were calculated based on their primary amino acid sequence using the Expasy web server (http://ca.expasy.org/tools/pi_tool.html). It should be noted that the actual molecular weight and pI values may differ due to post-translational modifications. SwissProt (http://ca.expasy.org/sprot/sprot-top.html) and the National Center for Biotechnology Information (NCBI; http://www.ncbi.nlm.nih.gov/entrez/query.fcgi) protein and literature databases, as well as the mouse protein subcellular localization database LOCATE (http://locate.imb.uq.edu.au/) and the gene database GENATLAS (http://www.dsi.univ-paris5.fr/genatlas), were used to identify known sites of post-translational modification and to classify proteins according to their currently known function and subcellular location. The RCSB Protein Data Bank (http://pdbbeta.rcsb.org/pdb/Welcome.do) was searched for currently available protein three-dimensional structures. The disordered protein database DisProt (http://www.disprot.org/) was searched to identify proteins having experimentally characterized disordered regions.
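The segment counting and the three-way classification described in the bioinformatics section above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' Perl/PHP pipeline: the function names are hypothetical, per-residue PONDR scores and CH values are assumed to be supplied as plain numbers (PONDR itself is not called), the "10-residue sliding window" is assumed to mean that a 50-residue window is advanced in steps of 10 residues, and scores of exactly 0.32 or 0.5 are assumed to fall in the intermediate band.

```python
from statistics import mean

# Boundary line quoted in the text: R = 2.785 * H - 1.151
# (R = mean net charge, H = mean hydrophobicity).
SLOPE = 2.785
INTERCEPT = -1.151
H_MARGIN = 0.0065  # required distance from the boundary along the H axis


def count_disordered_windows(scores, window=50, step=10, threshold=0.5):
    """Count 50-residue windows whose average disorder score exceeds 0.5.

    Assumption: the window is moved along the sequence in 10-residue steps.
    """
    count = 0
    for start in range(0, len(scores) - window + 1, step):
        if mean(scores[start:start + window]) > threshold:
            count += 1
    return count


def boundary_h(mean_net_charge):
    """Mean hydrophobicity of the boundary line at a given mean net charge."""
    return (mean_net_charge - INTERCEPT) / SLOPE


def classify_protein(avg_pondr, mean_net_charge, mean_hydrophobicity):
    """Return 'IUP', 'IFP', or 'MP' following the criteria in the text."""
    if avg_pondr > 0.5:
        return "IUP"
    if avg_pondr < 0.32:
        return "IFP"
    # Intermediate PONDR score: decide by position on the CH plot.
    offset = mean_hydrophobicity - boundary_h(mean_net_charge)
    if offset < -H_MARGIN:   # clearly left of the line: disordered side
        return "IUP"
    if offset > H_MARGIN:    # clearly right of the line: globular side
        return "IFP"
    return "MP"              # too close to the boundary to call
```

For example, under these assumptions a protein with an average PONDR score of 0.4, zero mean net charge, and mean hydrophobicity of 0.40 lies about 0.013 H units to the left of the boundary and would be called an IUP, whereas the same protein with a mean hydrophobicity of 0.415 falls within the 0.0065 margin and would be called an MP.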

Results

Resistance of Intrinsically Unstructured Proteins to Heat Treatment. We initially explored the possibility of enriching IUPs in cell extracts by using heat treatment and other denaturing conditions (e.g., organic solvent and low pH) to eliminate heat-sensitive, globular proteins. NIH3T3 protein extracts heated at a variety of temperatures were analyzed by SDS-PAGE to determine the extent of protein precipitation under these conditions (Supplemental Figure 1, Supporting Information). Results showed that the amount of soluble protein decreased as the incubation temperature was increased. The most abundant proteins remaining soluble following heat treatment migrated in the low molecular weight range. However, minor components in the range from 30 to 200 kDa (80 and 90 °C, soluble lanes in Supplemental Figure 1) were also observed. Because many IUPs function by binding to other globular proteins, it was possible that IUPs might precipitate upon denaturation of their binding partners. To examine this possibility, we subjected a mixture containing the IUPs p27 and the acidic domain of Hdm2 (∆Hdm2) and the IFPs bovine serum albumin (BSA), T160-phosphorylated cyclin-dependent kinase 2 (pCdk2), cyclin A, and lysozyme to a variety of denaturing conditions (Supplemental Figure 2A, Supporting Information). p27 is known to form a highly stable ternary complex with pCdk2 and cyclin A.51 All components except lysozyme and p27 were precipitated at pH 2, including ∆Hdm2, which probably precipitated due to its low pI (pI = 3.2). Treatment with organic solvent (10% n-butanol) or heating at 60 °C for 10 or 60 min resulted in only partial precipitation of cyclin A and pCdk2. However, heat treatment at 80 °C for 10 or 60 min resulted in complete precipitation of the globular proteins whereas >95% of both IUPs remained in solution. We recently showed that the thermal denaturation temperature (Tm) for the p27/pCdk2/cyclin A ternary complex was 73 °C.51 Our present results indicate that at 60 °C, the ternary complex remains intact and completely soluble, whereas at 80 °C p27 is released from the ternary complex prior to precipitation of pCdk2 and cyclin A (Tm for the pCdk2/cyclin A binary complex is 52 °C). These results suggest that IUPs bound to other proteins in multiprotein complexes may be recovered when heating is used for IUP enrichment. To test whether IUPs behave in a similar manner in whole cell extracts, we treated NIH3T3 extracts by reducing the pH to 2, adding 10% n-butanol, or heating at 60 °C or 80 °C for 10 or 60 min (Supplemental Figure 2B). Immunoblot analysis of the treated extracts with Cdk2 and p27 antibodies confirmed that p27 and Cdk2 exhibited a solubility pattern similar to that observed for the simple protein mixture. Notably, heat treatment at 80 °C resulted in the precipitation of Cdk2 while p27 remained in solution. These results indicate that heat treatment may be used to recover both complexed and uncomplexed IUPs in a general IUP enrichment strategy.

Figure 1. 2-D gel analysis of proteins from untreated and heat-treated NIH3T3 cell extracts. (A) 2-D gels for cell extracts incubated at 4 °C or heat treated at 60 °C for 10 min or 98 °C for 1 h. Proteins (156 µg) were loaded onto pH 3-10 IPG strips (17 cm linear gradient) and resolved by IEF as outlined in the Material and Methods section. SDS-PAGE was performed with 10% gels. Spots were visualized by SYPRO Ruby staining. (B) Total number of gel spots picked and proteins identified for the 2-D gels shown in (A).

2-D Gel Analysis of Heat-Treated NIH3T3 Cell Extracts. 2-D gel electrophoresis was used to examine the effects of heat treatment upon cell extracts at the proteome level (Figure 1A). Gel spots were excised and treated with trypsin, and the resulting tryptic peptides were analyzed by MALDI-TOF/TOF mass spectrometry. From a total of 584, 472, and 269 spots, 375, 388, and 198 proteins (287, 304, and 124 nonredundant proteins) were identified on 2-D gels for cell extracts treated at 4, 60, and 98 °C, respectively (Figure 1B; Supplemental Tables 1 to 3, Supporting Information). Although a similar number of proteins were identified on the 4 and 60 °C 2-D gels, there were fewer protein spots after treatment at 98 °C.
The relative intensities of spots varied among all three gels, indicating enrichment or depletion of various proteins in the cell extract. Nonredundant protein identifications were selected for further bioinformatics analysis (Supplemental Tables 4-6, Supporting Information).

Heat Treatment Enriched for Proteins with High Net Charge and Low Hydrophobicity. We analyzed primary structures of proteins identified from cell extracts treated at different temperatures to determine whether they exhibited the physicochemical characteristics of IUPs (i.e., high net charge and low hydrophobicity). The results indicated that heat treatment of NIH3T3 cell extracts leads to an enrichment of proteins possessing high net charge (mean residue net charge > 0.025) and/or low hydrophobicity (mean residue hydrophobicity < 0.45) characteristic of intrinsically disordered proteins (Figure 2A and B). These proteins were distributed over a similar range of molecular weight values (approximately 10-200 kDa) (Figures 1A and 2C,D). However, there was a bias toward acidic pI values (