currents
Visible ICAT
Michael Gelb of the University of Washington, Ruedi Aebersold of the Institute for Systems Biology, and colleagues have developed a new generation of isotope-coded affinity tag (ICAT) reagents for determining the relative abundance of proteins in complex mixtures. Called visible ICAT (VICAT), the new class of reagents has an additional probe that allows tagged peptides to be visualized. This visible moiety provides a way to track the chromatographic location of target peptides without the use of MS.
Just like the original ICAT reagents, the new VICAT reagents selectively tag peptides that contain cysteine. The VICAT reagents also contain a photocleavable linker to remove the tag, a biotin-affinity handle, and a tertiary amine, which ensures good solubility in aqueous samples. Solubility becomes more important as the molecular weight of the protein-tagging reagent increases, because millimolar levels are often needed to label all of the cysteines in a complex sample, such as a cell lysate.
The researchers used the VICAT reagents with isoelectric focusing (IEF) on a commercial gel strip to determine the absolute abundance of the protein human group V (hGV) phospholipase A2 in cell lysates. Following IEF, the precise position of the labeled peptides was revealed by radiolabeled IEF markers. The peptides were eluted from the strip, purified by affinity chromatography with streptavidin–agarose and photocleavage with near-UV light, and analyzed by micro-LC/electrospray ionization MS in selected reaction monitoring mode.
Human lung macrophages were found to contain 66 fmol of the hGV protein per 100 µg of cell protein. The VICAT analysis was at least an order of magnitude more sensitive than the Western blot, which gave inconclusive data for hGV in macrophages. (Anal. Chem. 2004, 76, 4104–4111)
[Figure: VICAT workflow. The protein mixture with the target protein is (1) denatured and its cysteines reduced, (2) tagged on cysteines with VICATSH, and (3) digested with trypsin to give a VICATSH-tagged peptide mixture. Authentic peptides tagged with 14C-VICAT (–28) and 14C-VICAT (+6) provide the IEF marker and the internal standard. The combined mixture undergoes (1) IEF on a gel strip, (2) location of the IEF marker and elution of peptides, and (3) capture on streptavidin–agarose and photocleavage, followed by micro-LC/ESI-MS/MS to quantify light (sample-derived) and heavy (internal standard) tagged peptides.]
Visualize it. The addition of a radiolabeled IEF marker to a mixture containing sample-derived peptides and an internal standard provides a way to find the exact location of a peptide in the gel strip without using MS.
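The quantification step in the VICAT item (light, sample-derived peptide measured against a heavy internal standard) can be illustrated with a short worked example. The sketch below assumes the simplest isotope-dilution relation, in which the sample-derived amount equals the light-to-heavy peak-area ratio multiplied by the known amount of internal standard. The function names and all numbers are hypothetical and are not taken from the paper.

```python
def absolute_amount_fmol(light_area, heavy_area, standard_fmol):
    """Isotope-dilution estimate of the sample-derived (light) peptide amount.

    Assumes light and heavy peptides respond identically in the MS, so the
    peak-area ratio equals the molar ratio (an idealization).
    """
    if heavy_area <= 0:
        raise ValueError("heavy (internal standard) peak area must be positive")
    return (light_area / heavy_area) * standard_fmol


def per_100_ug(amount_fmol, protein_ug):
    """Normalize an amount to fmol per 100 micrograms of total cell protein."""
    if protein_ug <= 0:
        raise ValueError("protein amount must be positive")
    return amount_fmol * 100.0 / protein_ug


if __name__ == "__main__":
    # Hypothetical peak areas and spike amount, for illustration only.
    amount = absolute_amount_fmol(light_area=2.4e5, heavy_area=4.8e5, standard_fmol=50.0)
    print(amount, "fmol in the analyzed sample")
    print(per_100_ug(amount, protein_ug=50.0), "fmol per 100 ug of cell protein")
```

Reporting the result per 100 µg of cell protein, as the item does, only requires dividing by the amount of protein that was analyzed.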
© 2004 American Chemical Society
Fewer incorrect peptide identifications
Often, researchers base a peptide identification on a single tandem mass spectrum. But is that good enough? John Yates and John Venable at the Scripps Research Institute have found wide variability between tandem mass spectra generated from the same peptides on an ion-trap MS instrument. Such variability can cause incorrect peptide assignments. To reduce the number of false positives, Yates and Venable have developed methods for processing tandem MS data.
The researchers produced 1000 replicate tandem mass spectra for two peptide standards. The results were searched in a database using three algorithms: Sequest, Pep_Probe, and Mascot. Several incorrect identifications were generated, and relative standard deviations were 7.3–20.1%, depending on the search algorithm. Spectra were also evaluated across a range of peptide concentrations with Sequest alone. Low-intensity ions were not reproducibly present, and changes in ion intensity were observed in the spectra, both of which may account for some of the variability in Sequest scores.
Two processing procedures were tested on the data. Spectral processing, which is an averaging method used prior to database searching, reduced the number of incorrect identifications by half for most of the concentrations. A moving average smoothing method was also evaluated on both unprocessed and spectral averaged data. When this method was applied to both data sets, fewer false positives were observed.
A five-protein mixture was digested and analyzed by tandem MS, and unprocessed and processed data were compared. All five proteins were correctly identified with the three methods, but the number and quality of peptide assignments varied. Although the moving average smoothing method generated the fewest peptide identifications, it also produced the fewest false positives. (Anal. Chem. 2004, 76, 2928–2937)
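Moving average smoothing, as mentioned in the item on peptide identifications, can be sketched in a few lines. This is a generic centered moving average over a list of ion intensities. The window width, the edge handling, and the function name are assumptions, since the item does not describe how Yates and Venable configured their method.

```python
def moving_average(intensities, window=3):
    """Centered moving average; points near the edges use a truncated window."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(intensities)):
        lo = max(0, i - half)                     # clip the window at the left edge
        hi = min(len(intensities), i + half + 1)  # clip the window at the right edge
        segment = intensities[lo:hi]
        smoothed.append(sum(segment) / len(segment))
    return smoothed


if __name__ == "__main__":
    # Hypothetical intensities: isolated spikes are flattened by the smoothing.
    print(moving_average([0, 3, 0, 3, 0], window=3))
```

Smoothing of this kind suppresses isolated low-intensity peaks, which fits the reported trade-off of fewer identifications but also fewer false positives.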
Four new tools for shotgun proteomics
Natalie Ahn, Katheryn Resing, and colleagues at the University of Colorado have developed three programs and a new database to reduce the number of false positives and negatives in shotgun proteomics experiments. Compared with Sequest or Mascot alone, the new approach has fewer false positives (4.2%) and false negatives (8%).
Journal of Proteome Research • Vol. 3, No. 4, 2004
TOOLbox
Googling proteins
William Noble and colleagues at NEC Laboratories America, the Max Planck Institute for Biological Cybernetics (Germany), Columbia University, and the University of Washington Health Sciences Center have developed an algorithm that searches protein databases in a way that is similar to Google Web searches. Including global network information in searches has been shown to increase the number of correct protein identifications. The new RANKPROP algorithm uses a precomputed protein similarity network that includes information about global network structure. Although the algorithm generates data based on information from PSI-BLAST, a common search program, RANKPROP has fewer false positives and a higher rate of correct identifications. (Proc. Natl. Acad. Sci. USA 2004, 101, 6559–6563)
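The idea of ranking proteins by global network structure rather than by direct similarity alone can be illustrated with a generic score-propagation sketch. The update rule, the damping factor `alpha`, the iteration count, and the toy network below are all assumptions made for illustration. This is not the published RANKPROP procedure.

```python
def propagate_scores(similarity, query, alpha=0.5, iterations=20):
    """Spread relevance from a query protein through a similarity network.

    similarity[a][b] is a hypothetical edge weight (0..1) from protein a to b.
    Each protein's score is its direct similarity to the query plus a damped
    sum of the scores relayed by its other neighbors.
    """
    nodes = set(similarity)
    for targets in similarity.values():
        nodes.update(targets)
    scores = {node: 0.0 for node in nodes}
    scores[query] = 1.0
    for _ in range(iterations):
        updated = {}
        for node in nodes:
            if node == query:
                updated[node] = 1.0  # the query keeps a fixed score
                continue
            direct = similarity.get(query, {}).get(node, 0.0)
            relayed = sum(
                weights.get(node, 0.0) * scores[src]
                for src, weights in similarity.items()
                if src != query
            )
            updated[node] = direct + alpha * relayed
        scores = updated
    return scores


if __name__ == "__main__":
    # Toy network: C has no direct link to the query Q but is close to A,
    # which is strongly similar to Q. D has only a weak direct link to Q.
    network = {"Q": {"A": 0.9, "D": 0.2}, "A": {"C": 0.8}}
    ranked = sorted(propagate_scores(network, "Q").items(), key=lambda kv: -kv[1])
    print(ranked)
```

In the toy network, C ends up ranked above D even though only D is directly similar to the query, which is the kind of effect that global network information is meant to capture.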
Protein interactions database
David Eisenberg and colleagues at the University of California, Los Angeles, have developed a new database that contains protein interaction data for 83 organisms. Called Prolinks, the database contains 515,892 interactions determined by four algorithms that use evolutionary relationships among genes and among proteins to infer functional similarities. Protein interactions predicted by the Prolinks database are represented graphically within the Protein Navigator Web browser, which was developed by the same researchers.
Unlike other coevolutionary databases, such as Predictome and String, Prolinks combines information from four algorithms, not two or three. It also includes a statistical measure for evaluating the accuracy of each predicted interaction and reports only those potential interactions between proteins that are present in the same organism. In a comparison with String, Prolinks identified nine additional proteins that interact with an E. coli protein involved in ATP synthesis. (Genome Biol. 2004, 5, R35)
Typically, scientists searching a database with an algorithm will set a particular value as a threshold based on a search through a randomized database. Peptide assignments above the threshold value are considered to be correct. But according to Ahn, Resing, and colleagues, many valid assignments fall below the threshold value. By using 18 standard proteins and manual validation, in addition to other methods, the researchers found that more than half of the assignments below the threshold should be identifiable.
Because Sequest and Mascot, two popular search algorithms, sometimes yield conflicting scores for the same assignments, the researchers developed three new programs for data analysis. MSPlus uses scores from both Sequest and Mascot to determine whether an assignment is correct. Isoform Resolver compiles the peptide sequences obtained from MSPlus into a protein profile that reduces false positives that arise from the assignment of incorrect protein isoforms. Another program compares spectra and scores them on the basis of similar fragmentation patterns, parent ion masses, and other chemical parameters.
Multiple proteins can contain identical peptides, which often leads to incorrect identifications. Ahn, Resing, and colleagues have developed a new database that consists of unique peptide sequences and lists all the proteins that are associated with each peptide. Thus, all protein possibilities are considered in the analysis. (Anal. Chem. 2004, 76, 3556–3568)
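A peptide-centric database of the kind described, in which each unique peptide sequence lists every protein that contains it, can be sketched as a simple inverted index. The data layout, function names, and the protein and peptide names below are hypothetical. The sketch does not reproduce the authors' database or their Isoform Resolver program.

```python
from collections import defaultdict


def build_peptide_index(protein_peptides):
    """Map each unique peptide sequence to all proteins that contain it."""
    index = defaultdict(set)
    for protein, peptides in protein_peptides.items():
        for peptide in peptides:
            index[peptide].add(protein)
    return {peptide: sorted(proteins) for peptide, proteins in index.items()}


def candidate_proteins(index, identified_peptides):
    """Collect every protein consistent with at least one identified peptide."""
    candidates = set()
    for peptide in identified_peptides:
        candidates.update(index.get(peptide, []))
    return sorted(candidates)


if __name__ == "__main__":
    # Hypothetical digest: two isoforms share one peptide.
    digest = {
        "isoform_1": ["SHAREDPEPK", "UNIQUEONER"],
        "isoform_2": ["SHAREDPEPK", "UNIQUETWOK"],
    }
    index = build_peptide_index(digest)
    print(index["SHAREDPEPK"])
    print(candidate_proteins(index, ["SHAREDPEPK"]))
```

Looking up a shared peptide returns both isoforms, so neither is silently discarded when only the shared peptide has been observed.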
Antibody arrays for protein modifications
Properties of proteins are regulated by posttranslational modifications (PTMs) such as phosphorylation, acetylation, and ubiquitination. In order to track PTMs, Y. Eugene Chin and colleagues at Brown University Medical School developed antibody arrays that can identify PTMs in whole-cell lysates.
Antibody array. Whole-cell extracts from (a) untreated and (b) epidermal growth factor-treated cells were incubated on antibody arrays to detect changes in protein phosphorylation. (Adapted with Permission. Copyright 2004 American Society for Biochemistry and Molecular Biology.)
Prior to this work, most assays were based on in vitro PTMs. The arrays created by Chin and colleagues permit analysis of PTMs as they occur inside cells in response to a signal, like a growth factor or a drug, that triggers or stops a particular PTM.
Chin and colleagues immobilized antibodies onto polyvinylidene fluoride and glass slides to form the arrays. To detect the proteins bound to the array, the proteins in the cell lysates were labeled with a fluorescent dye. After the cell lysate was incubated with the antibody array, the captured proteins were visualized by a near-IR fluorescent scanner or fluorescence microscopy.
The investigators used the antibody arrays to profile three types of PTMs in different cell types. They studied epidermal cancer cells to track tyrosine phosphorylation, HeLa cells for acetylation, and human embryonic kidney 293T cells for protein degradation by ubiquitination. The antibody arrays could profile each PTM with greater detection sensitivity than conventional assays. The investigators are now working to normalize the different antigen–antibody binding affinities so that the antibody array can be used for quantitative analyses. (Mol. Cell. Proteomics 2004, 10.1074/mcp.M300130-MCP200)
Cancer proteomics screens
Daniel Jay and colleagues at Tufts University School of Medicine, the National Cancer Institute, and Xerion Pharmaceuticals AG (Germany) conducted two proteomics screens to find proteins involved in tumor metastasis. They discovered that heat-shock protein (hsp) 90α is expressed on the cell surface and has a role in the metastasis of two cancer cell lines.
The researchers screened two different pools of antibodies to find those that bind to surface proteins on a specific cancer cell line. A method called fluorophore-assisted light inactivation (FALI) was used to damage proteins bound by each antibody. When hsp90α was inactivated in both screens, metastasis decreased, which suggests that the protein is involved in tumor spreading. Because hsp90α damage also decreased spreading in a different cell line, the protein may be associated with several types of cancers. Whereas hsp90α was implicated in this process, removal of the hsp90β isoform by FALI did not have an effect. (Nat. Cell Biol. 2004, 6, 507–514)
Molecular computer controls gene expression
Ehud Shapiro and colleagues at the Weizmann Institute of Science in Israel have designed an autonomous molecular computer that can logically analyze and respond to biological signals. The device may have applications in biochemical sensing, genetic engineering, and medical diagnosis and treatment.
The computer consists of three parts: an input, a computation module, and an output. Specific messenger RNA (mRNA) levels are used as input. The computation module, made up of DNA and DNA-interacting proteins, analyzes the mRNA levels and generates a “yes” or “no” output. The controlled release of short, single-stranded DNA (ssDNA) as output counters the mRNA signal. One microliter of fluid can hold almost a trillion of these computers, making it the tiniest computer present today.
[Figure: schematic with the labels Input, Computation, and Output.]
Molecular computer. The world’s tiniest computer analyzes incoming mRNA signals and generates ssDNA in response to the signals. (Adapted with permission. Copyright 2004 Macmillan Publishers Ltd.)
As proof of principle, Shapiro and colleagues programmed their computer to recognize mRNA levels associated with small-cell lung cancer and prostate cancer. They demonstrated that when the mRNA levels matched those seen in the cancers, the computer produced a ssDNA molecule that mimicked an anticancer drug.
Because each molecular computer performs its diagnosis independently, Shapiro and colleagues envision a large number of molecular computers handling multiple tasks in the same environment. The investigators also hope to take these molecular computers a step further and make them operational in vivo. (Nature 2004, 429, 423–429)
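The input, computation, and output stages of the molecular computer can be mimicked in software as a simple yes/no rule. The sketch below assumes that the diagnosis is a conjunction: the output is released only if every mRNA marker is on the expected side of a threshold. The marker names, thresholds, and function names are hypothetical, and the real device implements its logic with DNA and DNA-interacting proteins rather than code.

```python
def matches_profile(mrna_levels, profile):
    """Return True only if every marker is on the expected side of its threshold.

    profile maps a marker name to (expected, threshold), where expected is
    "high" or "low". Markers missing from the input are treated as level 0.
    """
    for marker, (expected, threshold) in profile.items():
        level = mrna_levels.get(marker, 0.0)
        is_high = level >= threshold
        if (expected == "high") != is_high:
            return False
    return True


def compute_output(mrna_levels, profile):
    """Map the yes/no diagnosis onto the output stage."""
    return "release ssDNA" if matches_profile(mrna_levels, profile) else "no release"


if __name__ == "__main__":
    # Hypothetical disease profile: one marker elevated, one marker depressed.
    profile = {"marker_1": ("high", 1.0), "marker_2": ("low", 0.5)}
    print(compute_output({"marker_1": 2.0, "marker_2": 0.1}, profile))
    print(compute_output({"marker_1": 2.0, "marker_2": 0.9}, profile))
```

Because each call depends only on its own inputs, many such checks can run side by side, which parallels the item's point that each molecular computer performs its diagnosis independently.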
Aptamer taste chip
Andrew Ellington and colleagues at the University of Texas, Austin, have adapted the “electronic taste chip” with aptamers. The new chip can be used to screen aptamer libraries and to detect and quantify proteins.
The first electronic taste chips contained arrays of beads in micromachined wells, rather like taste buds on a tongue. The beads were usually modified with antibodies to detect different ligands. Flow cells delivered samples to the top of the wells, and the ligands flowed past the beads and out of the well through a smaller opening at the bottom of the well. Ligands recognized by the receptors were retained on the beads and were detected by fluorescence.
Instead of using antibodies, Ellington and colleagues devised a simple method for attaching aptamers to beads. Aptamers are short nucleotide sequences that can bind to a variety of compounds such as proteins, DNA, and whole cells. The investigators synthesized aptamers with biotin labels and incubated them with streptavidin–agarose beads. The interaction between streptavidin and biotin permitted the immobilization of aptamers to the beads in a parallel array format. The aptamer beads were then placed in the micromachined wells.
[Figure: (a) streptavidin, biotin, anti-protein aptamer, and protein labeled with fluorophore; (b) streptavidin, biotin, anti-protein aptamer, protein, and antibody labeled with fluorophore.]
Aptamers on beads. A biotin label on an aptamer interacts with streptavidin on an agarose bead. A protein recognized by the aptamer is either (a) tagged with a fluorescent marker for detection or (b) detected with a fluorescently labeled antibody.
As a proof of concept, Ellington and colleagues demonstrated that the chips could be used as screens to find functional aptamers against lysozyme. The investigators also showed that the chips could be used for protein identification and quantitative measurements of the bioterrorism agent ricin. The investigators compared the binding of ricin to the aptamer chips with the binding to conventional antibody electronic taste chips and found that the two chips had comparable detection sensitivities. However, because it was reusable, the aptamer chip had a significant advantage over the antibody-based chip. Aptamers, unlike antibodies, can be repeatedly denatured and refolded without any loss in activity. (Anal. Chem. 2004, 4066–4075)
TOOLbox
Metabonomic NMR data sets
Radka Stoyanova and colleagues at the Fox Chase Cancer Center, Imperial College London (U.K.), and Columbia University have developed a new approach for identifying subsets of patterns in metabonomic NMR data sets. The intensities of these patterns are related to biological effects, such as disease onset and recovery, and thus serve as a new kind of biomarker. The new spectral processing approach, which relies on a statistical method called Bayesian spectral decomposition (BSD), identifies NMR patterns that have a direct metabolic interpretation. The method can be applied to the original spectral frequency data without the need for aggregation of the data into individual “bins”. The researchers demonstrate the BSD approach on 1H NMR spectra of urine from rats given various doses of the liver toxin hydrazine over a period of 150 h. The resulting patterns were related to the dose and time of hydrazine administration. (Anal. Chem. 2004, 76, 3666–3674)
International Protein Index
Protein data obtained by different methods are stored in numerous databases. In an attempt to centralize this information into a single resource, researchers led by Paul Kersey at the European Bioinformatics Institute (U.K.) have created a revised version of the International Protein Index (IPI). IPI, which now offers complete, non-redundant data sets for the human, mouse, and rat proteomes, was originally used in the analysis of the human genome sequence in 2001. Since then, it has been revised and updated monthly. IPI includes protein data from the Swiss-Prot, TrEMBL, Ensembl, and RefSeq databases, and it provides cross-references between data sources. To learn more about how IPI can help with protein identification, go to www.ebi.ac.uk/IPI. (Proteomics 2004, 4, 1985–1988)
Quantifying phosphorylations
Utpal Tatu and colleagues at the Indian Institute of Science have developed a new tool for quantifying posttranslational modifications (PTMs) in proteins. Called ProteoMod, the algorithm estimates the number of phosphorylation events on the basis of shifts in the isoelectric points (pI) of proteins. In the first step, shifts in pI due to a specific PTM are estimated using 2-D gel electrophoresis. The extent of the shift upon posttranslational change is then converted to the number of phosphorylations. The approach is rapid and does not require purification of the protein. According to the researchers, a similar approach could also be used to estimate methylation, acetylation, and sialylation modifications. (Proteomics 2004, 4, 1672–1683)
Estimating protein abundance
In shotgun proteomics experiments, acquisition of MS/MS spectra is often biased against low-abundance proteins, such that peptide ions from more abundant proteins are selected more frequently. To determine whether the spectral sampling process truly does reflect protein abundance, John Yates and co-workers at the Scripps Research Institute and Diversa, Inc., developed a statistical model based on their acquisition of MS/MS spectra from a complex protein mixture. The model takes into account the randomness of data acquisition and accurately predicts the level of sampling expected for various protein mixtures. Although the model predicts that more abundant proteins will be sampled more frequently, it also predicts that greater coverage of less abundant proteins can be obtained by performing additional experiments on a sample. For a yeast soluble cell lysate, the model estimates that 10 analyses are needed to identify 95% of the proteins. The researchers propose that the spectral sampling of a protein can be used as a measure of its relative abundance in a mixture without relying on isotope labeling. (Anal. Chem. 2004, 76, 4193–4201)
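The sampling argument in the "Estimating protein abundance" item, that repeated analyses improve coverage of less abundant proteins, can be illustrated with a deliberately simplified calculation. The sketch assumes that each analysis detects a given protein independently with the same probability `p_single`. That assumption, the function names, and the example probability are illustrative and do not represent the statistical model developed by Yates and co-workers.

```python
import math


def detection_probability(p_single, n_runs):
    """Chance that a protein is detected at least once in n independent runs."""
    return 1.0 - (1.0 - p_single) ** n_runs


def runs_needed(p_single, target=0.95):
    """Smallest number of runs whose cumulative detection chance reaches target."""
    if not 0.0 < p_single < 1.0:
        raise ValueError("p_single must be strictly between 0 and 1")
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p_single))


if __name__ == "__main__":
    # Hypothetical per-run detection probability for one protein.
    p = 0.5
    n = runs_needed(p, target=0.95)
    print(n, "runs give a detection probability of", detection_probability(p, n))
```

Under this simplification, proteins with a lower per-run detection probability need more repeat analyses to reach the same coverage, which is the qualitative behavior the item describes.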
TPP derivatives for protein detection
Andrew Hamilton and colleagues at Yale University have created a library of tetraphenylporphyrin (TPP) derivatives that bind to a variety of proteins. The TPP derivatives can be used for high-throughput proteomics, medical diagnostics, and bioterrorism applications, in which multiple types of proteins need to be rapidly detected.
Quenched by proteins. An array of eight TPP derivatives (A–H) showed changes in fluorescence when incubated with (1) a buffer control and (2–5) four different proteins.
TPP is a synthetic chemical with a large hydrophobic surface that can interact with the hydrophobic surfaces of proteins. To expand the range of proteins recognized by TPP, the periphery of the molecule can be derivatized with side groups. Derivatives of TPP are highly fluorescent, but the fluorescence can be quenched by certain proteins.
Hamilton and colleagues reasoned that a large number of TPP derivatives in an array format would have many different binding characteristics. The array could respond to a mixture of proteins with various surface characteristics. Because the fluorescence quenching of the TPP derivatives depends on the ability of the protein to form complexes, quenching can provide information about the protein surface characteristics.
The investigators synthesized a library of TPP derivatives that contained either charged or hydrophobic side groups. They chose eight TPP derivatives out of the library to form a test array and then selected four proteins with very different surface properties, which ranged from acidic to alkaline. When the four proteins were incubated with the test array, the array showed a distinctive pattern of fluorescence and quenching of the TPP derivatives. Quenching of certain derivatives by the proteins indicated that the proteins could form complexes with the TPP molecule. The investigators thus demonstrated that an array of TPP derivatives could provide information about the individual protein surfaces. (J. Am. Chem. Soc. 2004, 126, 5656–5657)
Nanoparticle size matters
Devices such as sensors, quantum dots, and “smart” materials are currently being developed with conjugated proteins to provide biofunctionality. However, Jonathan Dordick and colleagues at the Rensselaer Polytechnic Institute warn that the dimensions of the device can inadvertently change the properties of the attached proteins. Dordick and colleagues systematically analyzed the effect of a nanoparticle’s size, independent of its surface chemistry, on the properties of adsorbed proteins. Using circular dichroism spectroscopy and colorimetric enzymatic assays, they found that an enzyme lost more of its α-helical content, and hence its activity, when adsorbed to larger nanoparticles than to smaller ones under the same experimental conditions. (Langmuir 2004, doi 10.1021/la0497200)
NMR probe for proteins
Thanks to a specially designed microcoil NMR probe, Wolfgang Peti and colleagues at the Scripps Research Institute, MRM Corp., and Sequoia Sciences have found a way to perform NMR spectroscopy on microgram amounts of proteins. The probe reduces the amount of protein required for NMR, and it allows for the complete assignment of all of the amino acid side chains from a single experiment.
The new probe, called the CapNMR, is commercially available from MRM Corp. Because of its solenoid coil design, the probe has excellent radio-frequency properties, according to the researchers. As a result, it provides information about aromatic amino acid side chains, including those connected to aliphatic side chains. Those data cannot be obtained using traditional 5-mm room-temperature probes or cryoprobes. The approach could speed up the assignment of aromatic side chains in proteins. (J. Am. Chem. Soc. 2004, 126, 5873–5878)