currents
Phosphoprotein ICAT

In mammalian cells, protein phosphorylation is the main mechanism of signal transduction. But although up to one-third of the total proteome might be phosphorylated, the absolute levels of any single protein species might be very low. Traditional protein isolation using 2D-PAGE suffered from the need for 32P-radiolabeling and poor resolution of low-abundance proteins. Similarly, nonspecific interactions with immunoaffinity or metal affinity resins led to the isolation of nonphosphorylated proteins. In either case, isotopic labeling was required for quantitation. Thus, methods were developed for the selective enrichment of phosphorylated proteins and peptides (phosphoproteins and phosphopeptides).

Recently, Michael Goshe and his colleagues at the Pacific Northwest National Laboratory (Richland, WA) combined the phosphoprotein modification protocols of Yoshiya Oda (Nat. Biotechnol. 2001, 19, 379–382) with isotope-coded affinity tag (ICAT) technology to isolate phosphoproteins from a solution of commercially available β-casein, creating a method that they called phosphoprotein ICAT or PhIAT (Anal. Chem. 2002, 74, 607–616).

The researchers removed the protein phosphate groups using hydroxide-mediated β-elimination, producing thiolate-reactive sites that were then isotopically labeled with either 1,2-ethanedithiol (EDT-D0) or ethane-d4-1,2-dithiol (EDT-D4). The labeled proteins were further modified with iodoacetyl-PEO-biotin, which added a biotin moiety. The proteins were digested with trypsin, and the labeled peptides were isolated by affinity chromatography with immobilized avidin. The enriched peptides were then analyzed by capillary reversed-phase LC-MS/MS and identified using automated database searching (Sequest).

PhIAT labeling enriched the samples for phosphopeptides, as shown in the figure, and Sequest analysis of the mass spectra allowed clear identification of several peptides. When samples labeled selectively with EDT-D0 or EDT-D4 were combined, one peak was shifted by 4 Da relative to its partner, and the peak intensities were stoichiometric with the relative amount of species in the mixture. For example, if the ratio of EDT-D0 sample to EDT-D4 sample was 5:1, the peak intensity of the lighter peptide was ~5 times greater. Finally, even though the β-casein was >95% pure, the researchers were able to unambiguously detect and identify phosphopeptides from two α-caseins and less confidently from κ-casein.

Although the researchers expressed concern over secondary reaction sites, such as cysteine residues, they are confident that the method can be improved to limit such secondary effects.

Phosphopeptide enrichment. Chromatograms are shown for trypsin digests of (A) β-casein and (B) PhIAT-labeled β-casein enriched by affinity chromatography with immobilized avidin. (Adapted from Goshe, M. B.; et al. Anal. Chem. 2002, 74, 607–616.)
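The pairing logic behind the quantitation described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the analysis the researchers used: the function name, the peak list, the tolerance, and the assumption that each peptide carries a single label (so that the light and heavy forms sit a nominal 4 Da apart, divided by the charge state) are all hypothetical.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def find_label_pairs(peaks: List[Peak], charge: int = 1,
                     shift_da: float = 4.0, tol: float = 0.05):
    """Return (light m/z, heavy m/z, light/heavy intensity ratio) per pair.

    Assumes one label per peptide, so the EDT-D0 and EDT-D4 forms are
    separated by shift_da / charge on the m/z axis (an assumption).
    """
    spacing = shift_da / charge
    ordered = sorted(peaks)
    pairs = []
    for i, (mz_light, int_light) in enumerate(ordered):
        for mz_heavy, int_heavy in ordered[i + 1:]:
            delta = mz_heavy - mz_light
            if delta > spacing + tol:
                break  # list is sorted, so later peaks are farther away
            if abs(delta - spacing) <= tol and int_heavy > 0:
                pairs.append((mz_light, mz_heavy, int_light / int_heavy))
    return pairs


# Hypothetical peak list: one 5:1 light/heavy pair plus an unrelated peak.
peaks = [(1000.50, 5000.0), (1004.52, 1000.0), (1250.70, 800.0)]
pairs = find_label_pairs(peaks)
for light, heavy, ratio in pairs:
    print(f"light {light:.2f}  heavy {heavy:.2f}  D0/D4 ratio {ratio:.1f}")
```

A real workflow would also have to handle multiply labeled peptides, overlapping isotope envelopes, and unknown charge states, none of which this sketch attempts.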
© 2002 American Chemical Society
Journal of Proteome Research • Vol. 1, No. 1, 2002 11

Genomic Signals of Protein Interactions

Protein interaction maps of model organisms have allowed researchers to elucidate metabolic pathways and identify regions of cross-talk between otherwise unlinked pathways. But the generation of these maps using two-hybrid screening of random libraries is costly and labor-intensive. Given the wealth of genomic data that exists for a number of species, it should be possible to use the interaction maps of one organism to identify potential interactions between the orthologous proteins of a second organism. This was the premise behind the recent efforts of Lisa Matthews and her colleagues (Genome Research 2001, 11, 2120–2126).

Performing a BLASTP search of AceDB, a database of Caenorhabditis elegans sequences (www.acedb.org), the researchers compared the predicted sequences of proteins from the worm to those found in two protein interaction maps of the yeast Saccharomyces cerevisiae, hoping to identify evolutionarily conserved protein–protein interactions or “interologs”. From the 1195 yeast interactions, the researchers identified 257 potential worm interologs, of which 216 were screened using two-hybrid methods. They also tested 71 of the mapped yeast interactions and detected 19 (26%). Of these 19 interactions, six (31%) also occurred in the worm, and one interaction that was undetected in yeast even occurred in the worm. Combined with the screening results of the remaining worm interologs, 16% of the interologs exhibited interactions.

Thus, the researchers concluded that the minimal proportion of interologs detectable for two organisms 900 million years apart on the evolutionary tree is between 16% and 31%. When these results are compared to those expected from the two-hybrid screening of a random worm library, the interolog-based method is 600 to 1100 times more efficient. The directed method also allows the identification of interactions that might be swamped in a random test by proteins that are highly expressed.

Sample of two-hybrid results

Yeast protein pairs    Interaction    Worm interologs          Interaction
LSM4    LSM1           N              F32A5.7    F40F8.9       N
TEM1    SMX3           Y              C39F7.4    ZK652.1       N
CKA2    CKB2           Y              B0205.7    T01G9.6A      Y
SNF4    GAL83          Y              F55F3.1    Y111B2C.H     N
LSM4    LSM7           N              F32A5.7    ZK593.7       Y

(Adapted with permission. Copyright 2001 Genome Research)
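The interolog idea, carrying each known yeast interaction over to the worm through an ortholog map, can be sketched as follows. This is a minimal illustration rather than the researchers' pipeline: the function name is hypothetical, and the ortholog map is read off the sample table under the assumption that each yeast protein lines up with the worm protein in the same position of its row.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]


def predict_interologs(interactions: List[Pair],
                       orthologs: Dict[str, str]) -> List[Pair]:
    """Map each yeast interaction to a candidate worm pair.

    A pair is kept only if both yeast partners have a worm ortholog.
    """
    predicted = []
    for a, b in interactions:
        if a in orthologs and b in orthologs:
            predicted.append((orthologs[a], orthologs[b]))
    return predicted


# Yeast pairs from the sample table.
yeast_pairs = [("LSM4", "LSM1"), ("TEM1", "SMX3"), ("CKA2", "CKB2"),
               ("SNF4", "GAL83"), ("LSM4", "LSM7")]

# Yeast-to-worm map inferred from the table rows (an assumption).
orthologs = {
    "LSM4": "F32A5.7", "LSM1": "F40F8.9", "LSM7": "ZK593.7",
    "TEM1": "C39F7.4", "SMX3": "ZK652.1",
    "CKA2": "B0205.7", "CKB2": "T01G9.6A",
    "SNF4": "F55F3.1", "GAL83": "Y111B2C.H",
}

candidates = predict_interologs(yeast_pairs, orthologs)
for worm_a, worm_b in candidates:
    print(worm_a, worm_b)
```

Each candidate pair still has to be tested experimentally, as the researchers did with two-hybrid screening; the mapping only narrows the search space.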
Low Temp Improves MS

MALDI-TOF MS is indispensable for the characterization and identification of proteins. The technique works well with soluble proteins, but hydrophobic proteins, which tend to aggregate in polar solvents, present a problem. This situation is aggravated by the fact that many overexpressed proteins form insoluble inclusion bodies. The aggregates can be broken up with organic solvents or detergents, but these reagents can damage or modify some proteins. To get around this problem, Gregory Bird and colleagues explored the use of temperature shifts to improve the solubility of hydrophobic proteins and their cocrystallization with MS matrices (Anal. Chem. 2002, 74, 219–225).

The researchers studied the bZIP fragment of the transcription factor GCN4. The 60-residue polypeptide works as a dimer, and mutants bearing alanine substitutions are poorly soluble. Thus, to maintain protein solubility at lower urea concentrations, the researchers adapted Donald Wetlaufer’s temperature-modulating T-leap method (Prot. Sci. 1996, 5, 517–523) of refolding intractable proteins. Bird and his colleagues incubated a solution of denatured protein and the MS matrix (final urea concentration ~400 mM) at 4 °C overnight, and then heated the combination at 37 °C for one hour before spotting onto a sample plate. The plate was then air-dried and mass spectra were obtained.

Without the temperature shift, mass spectra were not obtained for any of the bZIPs, even though soluble marker proteins in the same samples did exhibit spectra. The researchers attribute the lack of mass spectra to bZIP aggregation, which prevents the required cocrystallization of the protein with the matrix. After preincubation at 4 °C and heating to 37 °C, however, spectra were obtained for each of the bZIPs. The researchers offer the possible explanation that during cold preincubation, the matrix forms a suspension of small crystals to which the protein adheres and concentrates, keeping the bZIP from aggregating. Wetlaufer called this concept slow crystallization. This result suggests that the T-leap method is readily adaptable not only for the refolding of proteins that are prone to aggregation, but also for achieving mass spectra from insoluble proteins.

One T-leap for protein kind. (A) In the absence of a refolding temperature shift, the bZIP protein shows no mass spectra, even though the marker proteins are clearly evident. (B) After the refolding temperature shift, however, the bZIP protein (labeled 4A) is clearly present. (Adapted from Bird, G. H.; et al. Anal. Chem. 2002, 74, 219–225.)

Probing for Protein Surfaces

Most, if not all, important functions of a protein occur at its surface, where it is solvent accessible. Thus, identifying solvent-exposed regions is invaluable to researchers trying to develop drugs to antagonize or promote specific protein actions. One way to determine this information is by studying the effects of paramagnetic compounds on the NMR parameters of a protein, especially its proton relaxation characteristics. Perhaps the most popular such agent is 4-hydroxy-2,2,6,6-tetramethylpiperidine-1-oxyl (TEMPOL). Unfortunately, whereas the ideal paramagnetic agent for this purpose would interact nonspecifically with the protein surface, this is not always the case with TEMPOL.

Thus, Guido Pintacuda and Gottfried Otting of Stockholm’s Karolinska Institute recently examined the suitability of another paramagnetic compound, Gd(III)-diethylenetriamine pentaacetic acid-bismethylamide or Gd(DTPA-BMA), to examine the solvent-exposed regions of ubiquitin (J. Amer. Chem. Soc. 2002, 124, 372–373). Like TEMPOL, Gd(DTPA-BMA) is uncharged and highly water-soluble, but it also has a much lower hydrophobicity than its counterpart, decreasing the likelihood of nonspecific interactions with the protein. Its higher paramagnetism also means that less of the compound is required to see the same degree of relaxation. Finally, Gd(DTPA-BMA) is stable over a wide pH range and against redox-active compounds.

The lack of binding by Gd(DTPA-BMA) to the protein was indicated by the fact that, in the presence of the agent, ubiquitin proton resonances, while attenuated, were not lost, nor were they significantly shifted. Using the published NMR structure of ubiquitin as a template, the researchers calculated the T1 relaxation rates (R1) of protons that were expected to be accessible to Gd(DTPA-BMA). They then compared these rates with those determined experimentally from a solution of 2 mM ubiquitin in the presence and absence of 4 mM Gd(DTPA-BMA) and found a strong correlation. The researchers then repeated the experiment with 25 mM TEMPOL, which gives the same relaxation enhancement as 4 mM Gd(DTPA-BMA), and found a weaker correlation. This they attribute to transient specific interactions between the protein and TEMPOL. Together, the many beneficial features of Gd(DTPA-BMA) bode well for future success in the identification of protein surface regions.

Comparing relaxation rates. (a) There is a strong correlation between the experimental and calculated T1 relaxation rates (R1) of ubiquitin protons in the presence of 4 mM Gd(DTPA-BMA). (b) The correlation is not nearly as strong, however, for the protons in the presence of 25 mM TEMPOL (filled circles) as compared to those in the presence of 4 mM Gd(DTPA-BMA) (open circles). (Adapted from Pintacuda, G.; Otting, G. J. Amer. Chem. Soc. 2002, 124, 372–373.)

Glycoprotein Mass Spec

A complicating aspect of proteomic analysis is that whereas the genome involves a relatively static array of DNA sequences, the proteome is composed of a finite number of proteins that can be modified in a seemingly infinite number of ways. MS has proven useful in the study of protein glycosylation but suffers from the poor ionization of glycosylated peptides compared to their unmodified forms, while episodes of artifactual gas-phase deglycosylation can make finding the site of modification difficult.

James Stephenson and his colleagues at Purdue University (West Lafayette, IN) and Oak Ridge National Laboratory (TN) have had some success in recent years using a “top down” approach to protein sequence analysis using MS (Anal. Chem. 1998, 70, 3533–3544). Rather than extensively purify a protein and digest it with proteases, Stephenson’s group subject whole proteins to tandem MS, finding that the proteins tend to fragment preferentially at sites such as C-terminal of aspartic acid, lysine, arginine, and histidine and N-terminal of prolines. But no one had determined how the fragmentation site preferences were affected by post-translational modifications or whether the modifications would survive the fragmentation process.

To address these questions, Stephenson’s group performed whole protein MS analysis on ribonuclease A and ribonuclease B, two proteins that differed solely in the attachment of one N-linked sugar, an ideal test of the system (Anal. Chem. 2002, 74, 577–583). Mass spectra of the unglycosylated ribonuclease A showed that the protein fragmented at the same preferred sites as had the proteins studied earlier. This also held true for the glycosylated ribonuclease B. The only place where the two spectra differed was in the migration of the sugar-bearing peptide, and the difference equaled the molecular mass of the sugar moiety. With post-translational modification a central pillar of cell biology, these results are another step toward the analysis of any proteome.
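The fragmentation preferences reported for the top-down approach lend themselves to a simple illustration. The Python sketch below lists the backbone bonds in a sequence that match the preferences named in the text (C-terminal of D, K, R, or H; N-terminal of P). The function name and the test peptide are hypothetical, and because the text says "sites such as", the rule set should be read as illustrative rather than complete.

```python
from typing import List, Tuple


def preferred_cleavage_sites(sequence: str) -> List[Tuple[int, str]]:
    """List bonds that match the preferred fragmentation sites.

    Position i (1-based) denotes the bond between residues i and i + 1.
    """
    seq = sequence.upper()
    sites = []
    for i in range(len(seq) - 1):
        reasons = []
        if seq[i] in "DKRH":
            reasons.append(f"C-terminal of {seq[i]}")
        if seq[i + 1] == "P":
            reasons.append("N-terminal of P")
        if reasons:
            sites.append((i + 1, ", ".join(reasons)))
    return sites


# Hypothetical peptide, used only to exercise the rules.
peptide = "ACDEFGPKLM"
sites = preferred_cleavage_sites(peptide)
for position, reason in sites:
    print(f"bond {position}-{position + 1}: {reason}")
```

Comparing such predicted bonds with the fragments actually observed for an unmodified and a modified protein is, in spirit, the comparison the ribonuclease A and B experiment makes.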
Sugar, sugar. Mass spectra of ribonuclease A (top) and ribonuclease B, indicating the changing peak pattern due to glycosylation. (Adapted from Reid, G.E.; et al. Anal. Chem. 2002, 74, 577–583.)

Experimental Validation of Erroneous Computations

Computational methods extract information from multiple alignments to construct various types of sequence profiles that are then used for iterative database searching, such as PSI-BLAST and Hidden Markov Model approaches. These computational methods have substantially improved the detection of subtle similarities between proteins that previously were amenable only to direct structural comparison and that are critical for making predictions that direct the experimental study of protein functions.

Sometimes, however, seemingly erroneous computational predictions seem to be supported by experiment. Eugene Koonin and his colleagues analyzed six cases where novel and conventional computational methods led to nontrivial predictions that were subsequently supported by direct experiments (Genome Biol. 2001, 2, 0051.1–0051.11). In all six cases, the original prediction was unjustified, and in at least three cases an alternative and well-supported computational prediction, incompatible with the original, was derived.

One of the more unusual cases involved the identification of an archaeal cysteinyl-tRNA synthetase in Methanococcus jannaschii. Using sequence-profile analysis, multiple alignment, and secondary-structure prediction, the researchers identified the unique enzyme as a homolog of extracellular polygalactosaminidases.

In each case, the original computational predictions could be refuted, and in some instances, strongly supported alternative predictions were obtained. The nature of the experimental evidence that appears to support these predictions remains an open question. Some of these experiments might signify discovery of extremely unusual forms of the respective enzymes, the researchers said, whereas the results of others could be due to artifacts.