REVIEW pubs.acs.org/ac
Analytical Aspects of Proteomics: 2009–2010
Zhibin Ning,†,‡ Hu Zhou,†,‡,§ Fangjun Wang,^ Mohamed Abu-Farha,†,‡ and Daniel Figeys*,†,‡
† Ottawa Institute of Systems Biology (OISB) and ‡ Department of Biochemistry, Microbiology and Immunology, University of Ottawa, 451 Smyth Road, Ottawa, Canada K1H 8M5
§ Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China 201203
^ Key Lab of Separation Sciences for Analytical Chemistry, National Chromatographic Research and Analysis Center, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian, China 116023
CONTENTS
Sample Preparation
General Improvement
Proteomic Reactor/Online Sample Processing
Improvements of Enzyme Immobilization
Combination with Protein and Peptide Separation/Fractionation
Multiple Enzymes
Coupling with Other Techniques
Applications of Proteomic Reactor
Phospho-Peptide Enrichment
Immobilized Metal Ion Affinity Chromatography
Metal Oxide Affinity Chromatography (MOAC)
Other Phosphopeptide Enrichment Methods
Glycopeptide Enrichment
Mesoporous Silica Materials
Mass Spectrometry Analyses
Parallel Ion Collision
High Collision Dissociation
Electron Capture Dissociation/Electron Transfer Dissociation
Quantitation Strategies
Chemical Labeling
Dimethylation
Isobaric Mass Tagging
18O Labeling
New Labeling Reagents and Strategies
Metabolic Labeling
Selected Reaction Monitoring/Multiple Reaction Monitoring
Data Analysis
General Database Searches
Specific Database Searching Strategy
Cross-Linked Peptide Identification
Optimization for ECD/ETD
Database Search for “Chimera” Spectra
Quantitation Information Extraction
Proteomics Databases
Read out More from MS
Proteomic Applications
Proteome Profiling
Biomarker Discovery
Functional Proteomics
Chemical Proteomics
Chemical Proteomics in Biology
Chemical Proteomics in Drug Discovery
Top-down Proteomics
Protein–Protein Interaction
Large Scale Protein Interaction Networks
Quantitative Protein Interaction Studies
Perspectives
Author Information
Biographies
Acknowledgment
References

© 2011 American Chemical Society
Special Issue: Fundamental and Applied Reviews in Analytical Chemistry
Published: April 14, 2011
dx.doi.org/10.1021/ac200857t | Anal. Chem. 2011, 83, 4407–4426

The term proteome was introduced over 17 years ago by Marc Wilkins, and since then, over 20 000 articles have been published on proteomics/proteome. The field of proteomics is still driven by the development of new technologies for proteome processing, proteome labeling and quantitation, new mass spectrometers, and bioinformatic tools. Moreover, the number of biological applications of proteomics has also increased drastically. The highlights since our last Review have been the development of new technologies, the expansion of quantitative proteomics, the mapping of post-translational modifications, the mapping of protein–protein interactions, and the further development of chemical biology. Some of the proteomic projects we previously reported have progressed rapidly. For example, the Human Protein Atlas, which employs antibody-based proteomics to study the proteome, recently reached the milestone of having antibodies for ∼50% of the human protein-coding genes. Furthermore, the repository of proteomic data sets has also evolved rapidly. This Review highlights the improvements to the whole proteomics workflow over the last two years, including technical and conceptual innovations as well as typical applications. We also discuss current concerns and trends in proteomics.
SAMPLE PREPARATION
Sample preparation remains an essential and often troublesome aspect of protein identification and quantitation in bottom-up proteomics. Generally, proteins need to be extracted from cells, tissues, or other sources and then cleaved into peptides for mass spectrometric analysis. Sample preparation directly affects both the number of proteins that can be identified and the results of protein quantitation. Here, we discuss experimental protocols that shorten digestion time, improve digestion efficiency, enrich a particular class of peptides, or integrate the sample processing workflow.

General Improvement. One of the key steps in the preparation of proteomic samples is protease digestion. The classical methods are in-gel and in-solution digestion. Generally, proteins are denatured, and the cysteine residues are then reduced and blocked to eliminate disulfide bonds. Several recently reported innovations address the overall sample preparation and digestion procedures. For example, Wisniewski1 reported the filter-aided sample preparation (FASP) method for processing proteomic samples, which is described as a universal sample preparation method for proteome analysis. The core innovation of FASP is the use of a single filter to remove detergent, to exchange buffers, and to filter out the undigested proteins as well as trypsin. This method combines the advantages of in-gel and in-solution digestion for proteomic identification. It can be used for samples with a high concentration of detergent without excessive dilution and is therefore very beneficial for membrane protein digestion.2 FASP can also handle milligram amounts of protein. For membrane proteins, Duan et al. developed a precipitation and on-pellet digestion procedure.3 The protein sample is precipitated first and then digested directly on the pellet. Cysteine is then reduced and blocked, and another round of trypsin digestion is carried out. This method shows higher sequence coverage and reproducibility than the canonical in-gel digestion. The method is not limited to membrane proteins and can be adapted to a broad range of samples, including tissues and cell lines. It is also amenable to further modification. For example, if reduction and alkylation are introduced prior to protein precipitation, only a single digestion step is required before the subsequent identification step. A single-tube sample preparation method was proposed by Kadiyala et al.4 They used volatile perfluorooctanoic acid (PFOA) to replace the commonly used nonvolatile surfactants, such as sodium dodecyl sulfate (SDS). In addition, volatile triethylphosphine can be used for the cysteine reduction. Replacing surfactants and other chemicals with volatile compounds allows the whole sample preparation for the subsequent MS analysis to be carried out in a single tube. This method is especially useful for limited samples, as it does not employ cleanup devices or membrane filters and thus minimizes sample loss. Several techniques have also been introduced to accelerate digestion and improve digestion efficiency. For example, a syringe can be used to increase the pressure during digestion,5 which results in better digestion efficiency than digestion under atmospheric pressure. A similar effect can be obtained with infrared-assisted6 and microwave-assisted7 proteolysis.

Proteomic Reactor/Online Sample Processing. Historically, proteomic workflows have been driven by improvements in HPLC and mass spectrometric analysis of peptides and have often paid very little attention to the capture and processing of proteomic samples. The development of increasingly complex “gel free” proteomics and the need to improve sequence coverage and the number of proteins identified/quantified have led to more research on methods to capture and process the proteomes from various samples. Will we one day have a robust and sensitive proteomic system into which cells or tissues are introduced and in which the subsequent mass spectrometric analytical steps are automated? The development of proteomic reactors might represent the core of such systems. The proteomic reactor is a microfluidic device that was originally designed to handle small amounts of sample, especially for protein digestion. Almost all of the pre-MS analysis procedures can be performed on the reactor, including protein concentration, desalting, buffer exchange, reduction, alkylation, and digestion, resulting in minimized sample loss and increased sensitivity. Two major strategies have been applied in the design of proteomic reactors. The first strategy, called the immobilized enzyme reactor (IMER), is based on the immobilization of the enzyme in a small reactor, whereas the second strategy is based on the capture of proteins and enzymes on an ion-exchange resin in a small reactor. The recent improvements of proteomic reactors and their applications are highlighted below.

Improvements of Enzyme Immobilization. One of the challenges in IMER has been the proper immobilization of enzymes while maintaining their activity and specificity. This is definitely an area of proteomic sample preparation that would benefit from further improvements.
Alternative approaches to immobilization have been pursued in the past few years. Yamaguchi et al. demonstrated a protease-immobilized microreactor using biofunctional cross-linker agents, such as paraformaldehyde and glutaraldehyde, to couple trypsin and chymotrypsin to a polytetrafluoroethylene (PTFE) microtube.8 They obtained promising results using standard proteins, such as a number of identified peptides and a Km similar to those of in-solution digestion. However, the performance for more complex samples and at low protein concentration was not assessed. Spross et al. described a capillary trypsin IMER prepared by immobilizing trypsin onto a poly(glycidyl methacrylate-co-acrylamide-co-ethylene glycol dimethacrylate) monolith using the glutaraldehyde technique.9 Here again, good performance was obtained using protein standards, even when one protein was present at a 1000-fold higher level than the other proteins in a simple protein mixture. Liu et al. reported infrared-assisted proteolysis using an inflation bulb-driven microfluidic reactor, which consisted of an inflation bulb-driving system, a simple cross-polymethyl methacrylate (PMMA) microchip, and a temperature-controllable IR radiation system.10 Ma et al. described a regenerable metal-ion chelate immobilized enzyme reactor supported on an organic–inorganic hybrid silica monolith.11 The silica monolith was prepared in a fused silica capillary via the polycondensation between tetraethoxysilane hydrolytic sol and iminodiacetic acid conjugated glycidoxypropyltrimethoxysilane, so that Cu2+ and ethylenediaminetetraacetic acid (EDTA) can be used for trypsin immobilization and removal, respectively. Important factors remain to be assessed for all of these IMERs prior to their incorporation in proteomic workflows. First, most of these reactors have not been tested with biological samples or have been tested with only a few. The assessment of their performance with biological samples is important. Second, the ease of use of the IMERs will be an important determinant of their widespread acceptance.

Combination with Protein and Peptide Separation/Fractionation. Some IMERs have already been incorporated in proteomic workflows. For example, Yuan et al. established an integrated proteomic analysis platform combining protein separation by mixed weak anion and weak cation exchange (WAX/WCX) or size exclusion chromatography, online digestion by a trypsin immobilized microenzymatic reactor (IMER), and identification by microreversed phase liquid chromatography (μRPLC)–electrospray ionization (ESI)-MS2.12 Although the experiments were performed using complex biological samples, the number of proteins identified remained limited due to the sensitivity of the mass spectrometer utilized (LCQDuo). In addition, Sun et al. demonstrated an integrated device for online sample buffer exchange, protein enrichment, and digestion using a membrane interface and a monolithic hybrid silica based IMER.13 Percy et al. reported rheostatic control of tryptic digestion in an immobilized enzyme reactor, in which the solvent composition (0–45% acetonitrile) is changed to regulate the fragment length of the tryptic digest products from full digestion (zero missed cleavages) to no digestion (intact protein).14 Finally, Zhou et al. reported that the combination of a proteomic reactor (strong anion exchange and strong cation exchange) with step pH elution can facilitate the identification of low-abundance proteins.15

Multiple Enzymes. Not all proteomic experiments are about obtaining the highest number of proteins identified. In many instances, obtaining the highest sequence coverage is also important. This is particularly the case when we study protein complexes, pathways, and their regulation. Lin et al.
reported polymer-based monolithic enzyme reactors prepared by fabricating a porous methacrylate-based monolith followed by photografting with glycidyl methacrylate and immobilization of the enzymes (trypsin and Glu-C) with carbonyldiimidazole; these reactors can be used for online protein digestion.16 Zhou et al. demonstrated that strong anion exchange beads can also be used in the proteomic reactor and that multiple enzymes (trypsin, chymotrypsin, and Glu-C) can enhance protein identification and sequence coverage.17 Ma et al. reported an integral membrane protein analysis method that couples formic acid assisted solubilization with a pepsin-IMER.18 There is still work to be done to expand the number of enzymes that can be incorporated into IMER and ion exchange proteomic reactors.

Coupling with Other Techniques. Zeisbergerova et al. integrated an electrophoretically mediated microanalysis (EMMA) approach and online tryptic digestion for proteomic analysis.19 Liuni et al. introduced a microfluidic reactor for rapid, efficient proteolysis with on-chip ESI-MS analysis; the on-chip digestion was performed on a wide (1.5 cm), shallow (10 μm) reactor “well” that is functionalized with pepsin–agarose.20 Pereira-Medrano et al. demonstrated a novel glass/PDMS microimmobilized enzyme reactor (μIMER) with enzymes covalently immobilized onto poly(acrylic acid) plasma-modified surfaces for high-throughput membrane proteomics.21

Applications of Proteomic Reactor. The important test for IMER and ion exchange based proteomic reactors is their use in proteomic workflows for the analysis of biological samples. Zhou et al. analyzed the subcellular phosphoproteome using a novel phosphoproteomic reactor that combined a strong cation exchange proteomic reactor and phosphopeptide enrichment by Ti-IMAC.22 They were able to identify 1141 unique phosphopeptides from subcellular fractions of HuH7 cells. Tian et al. reported a rare cell proteomic reactor based on a monolithic SCX matrix that efficiently identified and quantified 2200 proteins from only 50 000 human embryonic stem cells.23 Although the number of applications of proteomic reactors to complex biological samples is still limited, it appears that in some instances these devices provide performance comparable or even superior to solution-based processing. Moreover, they have the potential to greatly simplify proteomic workflows.

Phospho-Peptide Enrichment. Changes in the proteome are reflected not only in the concentration of proteins but also in changes in the post-translational modifications (PTMs) of the proteins. Protein phosphorylation has been extensively studied by mass spectrometry, with over four thousand articles indexed in PubMed. The vast majority of these articles focus on finding phosphorylation sites on specific proteins under controlled conditions. In contrast, phosphoproteomic analysis, in which the global state of protein phosphorylation is comprehensively assessed, has advanced greatly through shotgun proteome analysis. To date, tens of thousands of phosphorylation sites have been identified by shotgun phosphoproteomics. However, the function of the vast majority of these phosphorylation sites remains unknown. It might well be that quantitative phosphoproteomics, in which the changes in the phosphoproteome are assessed under defined conditions, will bridge the gap between the number of phosphorylation sites that can be identified and the clues available about the processes in which they are involved. However, this will only be useful if a sizable portion of the phosphoproteome can be studied.
For this reason, a highly efficient enrichment procedure is usually needed before the comprehensive analysis of phosphopeptides by shotgun proteomics, because most phosphoproteins are of low abundance and phosphopeptides ionize poorly compared to nonphosphopeptides.24

Immobilized Metal Ion Affinity Chromatography. Immobilized metal ion affinity chromatography (IMAC) is the most widely used method for the enrichment of phosphopeptides and has been applied to a broad range of samples. Recently, very impressive results have been reported using this approach, including the cataloging of over 36 000 phosphorylation sites from different mouse tissues25 and over 2500 phosphorylation sites in human embryonic stem cells.26 IMAC requires the immobilization of a metal ion, and the most widely used metal ion among commercially available IMAC materials has been Fe3+. It was reported that the presence of acetonitrile could improve the selectivity of IMAC toward phosphopeptides.27 In general, however, most of the research on IMAC technology has focused on the development of new IMAC materials and on the use of different metal ions and chromatography adsorbents. Zou and co-workers developed a phosphonate group containing amorphous polymer,22 highly ordered mesoporous silica particles,28 and monodisperse poly(GMA-co-TMPTMA-PO3H2) microspheres modified with phosphonate groups,29 each coupled with Zr4+ or Ti4+ to obtain IMAC materials with enhanced selectivity and efficiency for enriching phosphopeptides. Roughly 10 000 phosphorylation sites in human liver were obtained using their new Ti4+-IMAC materials.30

Metal Oxide Affinity Chromatography (MOAC). Metal oxides such as ZrO2 and TiO2 have also been used for phosphopeptide enrichment. In particular, TiO2 has been the most widely applied for the analysis of different biological samples, for example, epididymal sperm maturation,31 mouse spleen signaling responses to anthrax,32 and hepatocellular carcinoma biomarker discovery.33 Because of its good specificity for phosphopeptide enrichment and its simplicity of operation, TiO2 meets the basic requirements of most researchers. Further investigation demonstrated that optimizing the peptide-to-TiO2 beads ratio used during enrichment can increase the enrichment efficiency and specificity.34 Research on improving MOAC has primarily focused on increasing the surface area of the metal oxide material using nanomaterials35 and mesoporous materials.36 Interestingly, mesoporous ZrO2 and HfO2 were shown to be superior to TiO2 for phosphopeptide enrichment from a complex mixture, with high specificity (>99%); this high “purification” efficiency is mainly due to the extremely large active surface area of mesoporous nanomaterials.37

Other Phosphopeptide Enrichment Methods. Hydroxyapatite (HAP), the naturally occurring mineral form of calcium apatite, was reported to selectively separate and fractionate phosphopeptides owing to the strong interaction between Ca2+ ions and the phosphate groups.38 Anion exchange chromatography can also be used for online enrichment of phosphopeptides by elution at different pH values39 or in combination with TiO2 enrichment of the flow-through phosphopeptides.40 Most of the phosphorylation sites identified by mass spectrometry using either IMAC or MOAC are on serine and threonine residues. The analysis of the less common tyrosine phosphorylation requires specific enrichment steps, such as the use of antiphosphotyrosine antibodies.41 It should be noted that multiphosphopeptides can be problematic to purify. Interestingly, it was found that there is an optimal peptide-to-TiO2 ratio for each specific sample for phosphopeptide enrichment by TiO2,34 and more multiphosphopeptides can be identified when the amount of TiO2 beads is lowered.
Many combinations of different techniques have been proposed to enhance phosphopeptide enrichment, such as reversed phase-TiO2-reversed phase trap column online enrichment,42 polymer packed22 or monolithic43 microfluidic phosphopeptide reactors, and TiO2 coated plates for MALDI-TOF analyses.44 In these new technologies, phosphopeptide detection sensitivity and reproducibility are greatly improved owing to the miniaturization of the enrichment equipment and to fewer manual interventions, which is extremely important for the quantitative analysis of phosphopeptides in different samples. However, even the most recent enrichment techniques coupled to nano-RPLC-MS2 are insufficient to provide a comprehensive analysis of the phosphoproteome. These techniques need to be coupled with comprehensive prefractionation before or after the phosphopeptide enrichment in order to increase the number of phosphopeptides identified.29b The most common prefractionation strategies include strong cation exchange chromatography,25 strong anion exchange chromatography,29b,39 hydrophilic interaction chromatography (HILIC),45 and high pH reversed phase chromatography.30 It is still unclear whether the number of phosphorylation sites that can be identified and quantified per experiment is sufficient to provide meaningful insight into the modulation of biological processes. Therefore, continued research on new enrichment methods in combination with fractionation and mass spectrometry is still needed. Moreover, the number of protein phosphorylation sites being identified is outpacing our ability to assess their functions. Therefore, new methods are needed to rapidly assess the role of known phosphorylation sites in biological processes.
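The enrichment figures quoted in this section (for example, the >99% specificity reported for mesoporous ZrO2 and HfO2, and the effect of bead amount on multiphosphopeptides) can be made concrete with a small calculation. The following is a minimal sketch only, assuming the common working definition of enrichment specificity as the fraction of identified peptides that carry at least one phosphate group; the function name, the input format, and the example counts are hypothetical and are not taken from any of the cited studies.

```python
def enrichment_summary(phospho_site_counts):
    """Summarize one enrichment experiment.

    phospho_site_counts: one integer per identified peptide giving the
    number of phosphorylation sites assigned to it (0 = nonphosphopeptide).
    This input format is an assumption made for illustration.
    """
    total = len(phospho_site_counts)
    if total == 0:
        raise ValueError("no identified peptides")
    phospho = sum(1 for n in phospho_site_counts if n >= 1)
    multi = sum(1 for n in phospho_site_counts if n >= 2)
    return {
        # Assumed definition: phosphopeptides / all identified peptides.
        "specificity": phospho / total,
        # Share of phosphopeptides that carry two or more phosphate groups.
        "multi_fraction": multi / phospho if phospho else 0.0,
    }


# Hypothetical identification list from a single enrichment run.
counts = [1, 1, 2, 0, 1, 3, 0, 1]
summary = enrichment_summary(counts)
print(summary)
```

Computing both numbers for runs performed at different peptide-to-bead ratios would be one simple way to compare conditions, although the appropriate definition of specificity should follow the one used in the study being reproduced.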
Glycopeptide Enrichment. Protein glycosylation is also a very common PTM. To date, the large-scale analysis of protein glycosylation has mainly focused on N-linked glycosylation. The stoichiometry of protein glycosylation is also relatively low, and therefore, enrichment techniques are needed. Wollscheid et al. developed a cell surface-capturing (CSC) technology in which oxidized carbohydrate-containing proteins present on cell surfaces are labeled with an affinity reactive tag, followed by digestion and enrichment of the glycosylated peptides via the tag.46 They applied this method to analyze the cell surface N-glycoproteomes of T and B cells. The CSC strategy labeled only the cell surface proteins, which could be used to filter the list of proteins identified, and resulted in the identification of over 100 glycosylated peptides. The CSC strategy was also applied to the mouse myoblast C2C12 cell line, and over 200 glycosylation sites were reported.47 Chen et al. used multiple enzyme digestion steps and hydrazide chemistry for protein glycosylation analysis.48 Using this approach, a total of 939 N-glycosites were identified. Interestingly, Lee et al. found that lectin affinity and hydrazide chemistry captured different glycosylated membrane proteins after phase partitioning with Triton X-114.49 Lectins enriched high molecular weight membrane proteins with more potential glycosylation sites, whereas hydrazide chemistry isolated a higher proportion of smaller, enzymatic, and peripheral membrane proteins. Finally, activated graphitized carbon was compared to C18 medium for glycopeptide separation.50 Activated graphitized carbon is more suitable for the retention of hydrophilic glycopeptides. Clearly, our ability to map glycosylation sites has progressed significantly in recent years owing to technological developments.
However, the elucidation of the structure of the glycans attached to these sites is still limited, and the biological significance of all these glycosylation sites remains elusive.

Mesoporous Silica Materials. Mesoporous materials are attractive for the selective enrichment of low molecular weight (LMW) proteins or peptides from a complex biological matrix. Mesoporous silica materials, such as MCM-41 with a 2–10 nm pore size, have a large pore surface area, a highly ordered pore structure with uniform mesopores, and relatively good chemical and mechanical stability. Owing to their highly ordered pore size, mesoporous silica materials can selectively extract LMW proteins or peptides from a complex matrix. Recently, mesoporous silica materials modified with SCX and SAX groups51 and with biocompatible alkyl-diol groups52 were used to obtain ultrahigh extraction efficiency, much better than that of traditional ultrafiltration approaches. The extraction of LMW proteins by mesoporous materials works like a dynamic ultrafiltration. Furthermore, functionalized mesoporous materials can also be applied to protein PTM analyses (for example, phosphorylation28,53) and used as enzyme nanoreactors for LMW proteome analysis.54 It was also reported that endogenous peptides from plasma or urine enriched on mesoporous materials could be directly detected by MALDI-TOF55 to improve the analysis throughput in diagnostic peptidomics. Other developments include fabricating mesoporous materials on films,56 chips,57 and magnetic microspheres58 for LMW protein and peptide analyses. As an efficient dynamic filtering and enrichment technology, mesoporous materials are very useful for LMW enrichment from a complex matrix, such as plasma, where traditional ultrafiltration is less efficient; handling is also much easier, requiring only a standard centrifuge. Mesoporous materials can also be used as ultrahigh efficiency reactors for protein or peptide digestion and phosphopeptide enrichment. The emerging research directions for mesoporous materials are chemical surface modification to fulfill different analytical requirements and the development of new types of materials with higher selectivity and efficiency for LMW analysis. Owing to their high surface area and ease of surface chemical modification, mesoporous silica materials are promising for biological analysis.

Figure 1. Scheme for the current data acquisition modes in bottom-up proteomics. The classical proteomics experiment is based on the data dependent acquisition (DDA) mode, in which the MS2 spectra are acquired according to the intensity rank of their precursors in the MS1 scan. In the parallel ion collision proteomic experiment, ions are instead fragmented simultaneously without any gas separation after the MS1 scan; the MS2 spectra are then reconstructed and assigned to the corresponding precursor ions based on elution profile alignment. In the directed MS mode, two consecutive experiments are required. The first is run with MS1 only to construct the elution profiles and detect the different features, whereas the second, run with MS2 only, is directed and scheduled by the first run. The DDA mode is currently the dominant proteomic experiment.
MASS SPECTROMETRY ANALYSES
Mass spectrometry has undoubtedly contributed to the rapid development of proteomics. In contrast to many analytical techniques that have stagnated, the development of new mass spectrometers and of techniques based on mass spectrometry has remained strong. It is important to point out that one of the major drawbacks of current mass spectrometers is poor software and instrument control, which limits on-the-fly data analysis and the design of more data-driven proteomic experiments. This has been an ongoing criticism of mass spectrometers for as long as we have been in the field of proteomics. We are still basically doing the same data dependent MS2 experiments that were performed 10 years ago. It is disconcerting that, in the days of cheap and powerful computer chips, we still have very limited capability directly associated with mass spectrometers to do more extensive data dependent experiments. In typical proteomic experiments,
many different m/z features are detected by the mass spectrometer in MS scans over time. The vast majority of these features are likely peptides. Unfortunately, MS2 spectra are only generated for a very small subset of these features. This is generally done automatically by selecting a few features per MS scan for further MS2 analysis. This is often called the data dependent acquisition (DDA) mode, and the selection process is generally driven by the feature intensity (Figure 1, left panel). The most obvious limitation of this approach is the repetitive selection of high abundance peptides, which wastes the limited analytical time, although this is reduced by dynamic exclusion. In addition, the selection usually does not take place at the apex of the chromatographic peak, which is especially an issue for lower abundance peptides. Furthermore, for complex samples, too many features are simultaneously present, which renders the DDA selection almost random and therefore reduces the reproducibility of the results. Unfortunately, DDA has not kept up with the evolution of proteomic experiments. For example, it is not possible to select isotopic pairs on the fly and to trigger DDA based on their ratios. This means that in quantitative proteomics most of the MS2 spectra are generated for peptides that are not changing and are usually not interesting. In the meantime, Schmidt et al. coined the concept of directed mass spectrometry, which is a data acquisition mode derived from multiple reaction monitoring.59 In the directed mode, the selection of the precursor ions is “directed”, in contrast to the DDA approach (as shown in Figure 1, right panel). Specifically, the directed mode is carried out in two steps: peak detection, followed by sequencing of the selected precursor ions. The second step is directed by the first one, and the acquired MS2 spectra are then mapped back to the peaks. The main advantage of the directed mode is the efficient management of the analysis time, which results in nonredundant MS2 spectra and therefore more in-depth selection and analysis of low abundance peptides. The authors suggested that it plays an important role in hypothesis-driven proteomics.60 The directed mass spectrometry acquisition mode could probably be useful to many researchers if the required tools were well integrated into the corresponding platforms. The solution to these issues might be for the MS vendors to fully open their control codes. Here, we describe some of the newer developments in the field of MS in the context of proteomics.

Parallel Ion Collision. The data dependent acquisition (DDA) mode most commonly used for protein identification is to some extent a compromise between the MS scan rate and the limited elution time of chromatographic peptide peaks. Many peptides are ignored in the presence of high abundance peptides owing to their relatively low intensity and short elution time. This is surely another important bottleneck for current bottom-up proteomics. A data independent mode, in which ions are fragmented in parallel, provides an alternative to the conventional serial data dependent mode for protein identification (as shown in Figure 1, middle panel).
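To illustrate why low abundance peptides are easily ignored under intensity-driven data dependent selection, and how dynamic exclusion reduces repeated selection, the selection logic can be sketched as follows. This is a simplified illustration under stated assumptions, not the control code of any instrument: the `Feature` class, the function `select_precursors`, and the parameters `top_n`, `exclusion_s`, and `mz_tol` are hypothetical names introduced here, and the survey scan values are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """One m/z feature observed in a survey (MS1) scan."""
    mz: float
    intensity: float


def select_precursors(scan, scan_time, excluded, top_n=3,
                      exclusion_s=30.0, mz_tol=0.01):
    """Return up to top_n features for MS2, most intense first.

    `excluded` maps a previously selected m/z to the time (s) at which its
    dynamic exclusion expires; it is updated in place.
    """
    # Forget exclusions that have expired by the time of this scan.
    for mz in [m for m, until in excluded.items() if until <= scan_time]:
        del excluded[mz]

    picked = []
    for feat in sorted(scan, key=lambda f: f.intensity, reverse=True):
        if len(picked) == top_n:
            break
        # Skip features close to an m/z that is still excluded.
        if any(abs(feat.mz - mz) <= mz_tol for mz in excluded):
            continue
        picked.append(feat)
        excluded[feat.mz] = scan_time + exclusion_s
    return picked


# Hypothetical survey scan: two abundant and two low abundance features.
scan = [Feature(500.25, 1e6), Feature(620.31, 5e5),
        Feature(710.40, 1e4), Feature(455.20, 2e3)]
excluded = {}
for t in (0.0, 1.0, 2.0):
    chosen = select_precursors(scan, t, excluded, top_n=2)
    print(t, [f.mz for f in chosen])
```

In this toy setting, the low abundance features are only reached in the second cycle because the abundant ones are temporarily excluded. Setting `exclusion_s` to 0 with increasing scan times would make every cycle reselect the same two intense features, which is the redundancy described in the text.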
Different laboratories have their own names for this data independent mode, e.g., shotgun collision induced dissociation (CID),61 MS(E) (a mode of elevated-energy MS),62 simultaneous peptide fragmentation,63 all ions fragmentation,64 or parallel fragmentation of peptides.65 The parallel ion collision mode does not require ion isolation, and only a low or high energy scan is performed. Therefore, the ions coeluting in a defined time window are all fragmented together, and the duty cycle is substantially reduced. A deconvolution algorithm is then necessary to recognize the peptide chromatographic peaks and to reconstruct the MS2 spectrum corresponding to a specific parent ion from the accurate retention profiles of the fragments. The parallel ion collision mode can also be used for quantitation, even absolute quantitation.66 In principle, when peak intensity is used for protein quantitation, the quantitation accuracy should be higher than in conventional DDA, because the shorter duty cycle provides more MS1 scans and thus better reconstructed retention profiles. It was reported that the time-aligned data from data independent LC-MS experiments are highly comparable to the data obtained via a data dependent experiment.67 This information can therefore be effectively and correctly deconvoluted to correlate product ions with their precursor ions. A novel database search algorithm for this kind of data independent acquisition was also developed.68 The algorithm is able to correctly identify peptides with high sensitivity and specificity from data independent acquisitions, and the results also showed that the parallel acquisition mode could identify 20% more proteins than the traditional mode with the same filtering criteria. In another work, employing this mode on a stand-alone Orbitrap analyzer, 45 of 48 proteins were identified from an equimolar protein standard mixture.64 The key to this mode is establishing the lineage between the parent ion and its daughter ions.
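The idea of linking fragments to their parent ion through matching retention profiles can be sketched with a simple correlation test. The following is only an illustrative sketch, assuming that precursor and fragment intensities have been sampled on the same scan cycle and that a Pearson correlation threshold is an acceptable similarity measure; the published algorithms cited above are not reproduced here, and the names `pearson`, `assign_fragments`, `min_r`, and the example traces are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length intensity traces."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # A flat trace carries no elution information.
    return sxy / math.sqrt(sxx * syy)


def assign_fragments(precursors, fragments, min_r=0.9):
    """Assign each fragment to the best correlated precursor.

    Both arguments map an ion label to its intensity trace over the same
    consecutive scans. Fragments whose best correlation is below min_r
    are left unassigned.
    """
    assignment = {name: [] for name in precursors}
    for f_name, f_trace in fragments.items():
        best, best_r = None, min_r
        for p_name, p_trace in precursors.items():
            r = pearson(p_trace, f_trace)
            if r >= best_r:
                best, best_r = p_name, r
        if best is not None:
            assignment[best].append(f_name)
    return assignment


# Hypothetical traces: two partially coeluting precursors, A and B.
precursors = {
    "A": [0, 1, 5, 10, 5, 1, 0, 0, 0, 0],
    "B": [0, 0, 0, 0, 1, 5, 10, 5, 1, 0],
}
fragments = {
    "a1": [0.3 * v for v in precursors["A"]],  # follows A
    "b1": [0.5 * v for v in precursors["B"]],  # follows B
    "background": [1.0] * 10,                  # constant chemical noise
}
print(assign_fragments(precursors, fragments))
```

With real data, closely coeluting precursors produce highly correlated profiles, so a single threshold would rarely be sufficient, which is consistent with the emphasis the text places on the bioinformatics side of this acquisition mode.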
To some extent, the more challenging part lies in improving the accuracy and sensitivity of the relevant bioinformatics tools, rather
than the MS experiments. With an ideal data processing pipeline and search engine, the parallel fragmentation mode might become a true “parallel” alternative to the currently available methods.

High Collision Dissociation. High collision dissociation (HCD; also expanded as higher-energy collisional dissociation) is a technique that can be performed on a few mass spectrometers including the LTQ-Orbitrap. In the LTQ-Orbitrap, HCD takes place in a collision cell at the far side of the C-trap.69 HCD does not have the low mass cutoff issue observed when performing collision induced dissociation (CID) in ion traps. More importantly, in comparison to CID, HCD fragmentation yields different and more numerous fragments for peptides, especially phosphopeptides, for which neutral loss readily occurs in CID. Another incidental benefit is that the fragments can be measured with high accuracy using the Orbitrap analyzer. Combined with the high accuracy full MS scan, a “high-high” mode can be performed, especially on the new Velos Orbitrap platform.70 Nagaraj et al. used a “high-high” mode, with high resolution MS scans and high resolution HCD MS2 scans, for phosphopeptide identification.71 This combination increased the mass accuracy of fragment ions by about 50-fold, compared to the conventional “high-low” strategy, and resulted in the identification of as many as 16 000 total phosphorylation sites within a day of mass spectrometry. HCD has also been successfully applied to lysine ubiquitination identification72 and glycosylation site assignment,73 as well as quantitative analyses.74 The ease of performing HCD on certain mass spectrometers provides new opportunities for proteomic research.

Electron Capture Dissociation/Electron Transfer Dissociation. In proteomics, collision induced or activated dissociation (CID/CAD) has been the predominant mode of peptide fragmentation, primarily because other methods of peptide fragmentation were not readily available or easily integrated into high-throughput proteomic workflows.
However, CID can be problematic for the analysis of PTMs because some modification groups present on amino acids tend to dissociate before the peptide backbone does. McLafferty and co-workers introduced a new technique called electron capture dissociation (ECD), in which slow electrons are introduced into a mass spectrometer and captured by the trapped multiply charged cations, which leads to fragmentation of the ion.75 ECD mainly produces c and z series ions and can give almost complete sequence coverage of peptides. Another critical advantage of ECD is that labile PTM groups are usually preserved during the dissociation. However, ECD is limited to FTICR instruments. Subsequently, electron transfer dissociation (ETD) was developed to perform similar fragmentation in ion trap or quadrupole MS analyzers.76 In ETD, the radical electrons are transferred from aromatic anions to the peptide cations. Hennrich et al. compared the effect of different chemical modifications on ETD.77 They found that modifications that can increase the charge number, such as dimethylation, guanidination, and particularly imidazolinylation of doubly charged Lys-N peptides, greatly increase peptide sequence coverage by ETD. Furthermore, Vasicek and Brodbelt evaluated the effect of three different alkylation reagents on reduced cysteine: N,N-dimethyl-2-chloro-ethylamine (DML) and (3-acrylamidopropyl)-trimethyl ammonium chloride (APTA) could significantly improve the sequence coverage in ETD because of the addition of positive charges.78 For ETD, the fragmentation efficiency is low for doubly charged peptides, and a large number of ions undergo electron transfer but do not undergo dissociation. Many efforts have been
dx.doi.org/10.1021/ac200857t |Anal. Chem. 2011, 83, 4407–4426
Figure 2. Frequently used quantitation strategies in proteomics. SRM/MRM plays an important role both in relative and absolute quantitation (with AQUA). The classical nontargeted bottom-up MS is mainly used for relative quantification. Many labeling methods and some data extraction strategies have been developed for labeling and label-free workflows.
made to address this problem. Campbell et al. proposed a simultaneous ETD/CID to combine these two complementary techniques.79 It is based on the observation that unreacted precursor ions are often abundant after ETD. The simultaneous mode is a supplemental activation: the selected parent ions are subjected to ETD, and the unreacted ions are then subjected to CID. The simultaneous ETD/CID can produce b and y as well as c and z series ions in a single MS2 spectrum. Ledvina et al. presented infrared photon activated-ion electron transfer dissociation (AI-ETD) as an alternative solution.80 They found that AI-ETD successfully increased the sequence coverage and outperformed unassisted ETD and CID in terms of peptide spectral matches. In short, ETD is a good choice for PTM research because of unique advantages such as the ease of localizing PTM sites. However, at present, it cannot replace CID in high-throughput proteomic experiments because of the incomplete fragmentation.
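To make the difference between the b/y series (CID) and the c/z series (ECD/ETD) concrete, the following minimal sketch computes singly charged monoisotopic fragment m/z values for an unmodified peptide. It assumes standard monoisotopic residue masses, takes c ions as b + NH3, and takes z ions as the radical z• form (y − NH3 + H); other z-ion conventions exist. The function name `fragment_ions` is hypothetical.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276    # mass of a proton
WATER = 18.010565    # H2O
AMMONIA = 17.026549  # NH3
H_ATOM = 1.007825    # hydrogen atom

def fragment_ions(peptide):
    """Return singly charged b, y, c, and z-dot m/z values.

    Entry k of each list corresponds to cleavage after residue k+1,
    so "b" and "c" run b1..b(n-1) while "y" and "z" run y(n-1)..y1.
    """
    masses = [MONO[aa] for aa in peptide]
    ions = {"b": [], "y": [], "c": [], "z": []}
    for i in range(1, len(masses)):
        b = sum(masses[:i]) + PROTON
        y = sum(masses[i:]) + WATER + PROTON
        ions["b"].append(b)
        ions["y"].append(y)
        ions["c"].append(b + AMMONIA)           # c = b + NH3
        ions["z"].append(y - AMMONIA + H_ATOM)  # z-dot = y - NH3 + H
    return ions

ions = fragment_ions("GA")
```

For the dipeptide GA, the single y ion is protonated alanine at m/z 90.055, and the corresponding z• ion lies about 16.019 lower. Modified residues, such as phosphoserine, would need their own mass entries.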
■ QUANTITATION STRATEGIES
The ability not only to identify proteins but also to quantify changes in protein levels on a proteome scale has drastically changed the field of proteomics. Although a quantitative proteomics experiment provides a static snapshot of the biological state, multiple experiments performed over time can be stitched together to provide a “motion picture” of a dynamic biological process. Different techniques have been reported for quantitative proteomics, including label-free approaches, chemical and biochemical labeling, and in vitro, in vivo, and ex vivo labeling of the proteome, as shown in Figure 2. No fundamentally new quantitation strategies have been reported in recent years; however, many improvements to existing strategies were reported that make them easier to implement and more accurate.

Chemical Labeling. Dimethylation. Isotope dimethyl labeling is one of the most popular in vitro chemical labeling strategies; it targets the primary amines on the N-terminus and on lysine side chains. It provides high reaction efficiency, fast kinetics, and low cost. Heck and co-workers standardized a triple isotope dimethyl labeling
strategy using several isotopomers of formaldehyde and cyanoborohydride to label three different samples, which can then be combined for mass spectrometric analysis. Moreover, they performed the labeling in solution, on an SPE column, and online on a trap column.81 Recently, this labeling strategy was applied to PTM studies.41b,82 Isotope dimethyl labeling is feasible for proteomes obtained from cell lines, tissues, and clinical samples. One developing direction of isotope labeling is the combination of online labeling with multidimensional separation to increase the detection sensitivity and proteome coverage.83 Isotope dimethyl labeling can also be applied in other areas of protein analysis, such as de novo peptide sequencing, where it helps to recognize specific ion series.84 Overall and co-workers applied isotope dimethyl labeling for N-terminus protection in N-terminus studies and protease cleavage site identification.85 Using this approach, they identified 731 acetylated and 132 cyclized N-termini and 288 matrix metalloproteinase (MMP)-2 cleavage sites in mouse fibroblast secretomes by combining N-terminus protection and quantitative analysis.86

Isobaric Mass Tagging. The commercialized isobaric mass tagging reagents are mainly iTRAQ from AB SCIEX and TMT from Thermo Scientific. These reagents were designed for Q-TOF-like instruments. Ion trap mass spectrometers are not well suited to this type of quantitation reagent because of the low mass cutoff in MS2. Fortunately, the emergence of dissociation methods such as pulsed-Q dissociation (PQD), HCD, and ECD/ETD is changing this situation. The main technical progress in this field is the optimization of the dissociation method for labeled peptides. For example, for the 8-plex iTRAQ labeling reagent, only five channels could be detected in MS2 spectra produced by ETD alone because of insufficient cleavage of the bond between the reporter and balance regions.
A combination of ETD and CID was used by Phanstiel et al. to recover all eight channels.87 They applied an additional resonant excitation to the ETD-produced peak at m/z 322, which is the fragment composed of the reporter and balance regions. A combination of PQD and ETD was also used for the quantitation of iTRAQ labeled phosphopeptides,88 where ETD plays the role of
increasing the identification, while PQD is mainly used to generate reporter ions. CID and HCD have been combined in a similar way: CID is in charge of the peptide identification, and only the reporter ions from the HCD spectra are used for quantitation.89 Besides the classical use for relative quantitation, the isobaric mass tagging reagents can also be used for absolute quantitation. Dayon et al. used TMT labeled synthetic proteotypic peptides for absolute quantitation.90 The TMT labeled peptides can provide an internal calibration curve. Similar quantitation was done using iTRAQ with HCD on an LTQ-Orbitrap for protein phosphorylation.91 As for phosphorylation, Thingholm et al. found that the popular isobaric labeling could significantly reduce the identification efficiency of phosphopeptides because the derivatization leads to more charges on the peptides during the ESI process.92 However, a perpendicular ammonia spray, which decreases the average charge state, can rescue the phosphopeptide identification.

18O Labeling. 18O labeling is a proteolytic labeling method in which the enzyme digestion is performed in 18O water. Theoretically, two heavy oxygen atoms can be incorporated into the carboxyl termini of all tryptic peptides. A key problem for 18O labeling is the accurate calculation of the 18O/16O ratios, which can be a challenge because of the overlap of the isotopic peaks and the incomplete tryptic labeling, as well as the 18O/16O exchange. Several methodologies were employed to address this problem. Petritis et al. presented a method to quench the trypsin activity and thereby stop the 18O labeling back-exchange.93 The method is simply to boil the tryptic peptides for 10 min. Monoethanolamine (50 mM) is added to the reaction buffer to reduce the carboxyl oxygen exchange.94 Therefore, there is only one 18O atom incorporated into the peptides.
The production rate of the single 18O atom peptides was above 85% using this method. Usually, the quantitative information of 18O labeling is extracted from the full mass scans, because the light and heavy peptides coelute. However, the relative abundance can also be extracted from MS2. This approach was shown to have good accuracy, sensitivity, and signal-to-noise ratio.95 A trapezoidal rule was used to integrate the peak intensities of each peptide ion over the retention time to resolve the problem caused by heterogeneous and incomplete labeling.96 The algorithm integrates all the peak intensities for the detected isotopic peptides. Qian et al. recently used 18O to label pooled samples as a reference standard.97 They used a dual-quantitation mode which combines the labeled and label-free information for protein quantitation. Besides quantitation, 18O labeling was recently used for the identification of protein carbonylation.98 In this work, 18O was introduced and stabilized on the reactive carbonyl modification while trypsin-catalyzed incorporation at the C-terminus was eliminated. The protein carbonylation could be verified by the isotopic pattern.

New Labeling Reagents and Strategies. Besides the most popular labeling approaches mentioned above, new reagents and new strategies for chemical labeling keep appearing. A new type of isobaric amine-reactive reagent, called DiART, was synthesized for protein quantitation.99 The overall principle and structure of the reagent are almost the same as those of iTRAQ and TMT. The difference is that DiART uses deuterium instead of 13C and 15N to encode the mass difference. The cost of the reagent is relatively low, and it can be easily prepared. Another similar reagent called mTRAQ was designed especially for multiple reaction monitoring (MRM);100 however, it is also applicable to regular quantitation using MS scans.101 Wang
et al. used acrylamide to label cysteine and succinic anhydride to label lysine.102 This strategy is called dual stable isotope coding (DSIC). The DSIC labeling approach is performed at the protein level and is fairly easy to implement in proteomic processes in the aqueous buffer used after cysteine reduction. It has very high labeling efficiency and therefore increases the yield of quantified proteins. Compared to other commercialized reagents, DSIC is more affordable and is applicable to complex biological samples. Clearly, we have not seen the end of the development of new reagents for quantitative proteomics.

Metabolic Labeling. Stable isotope labeling by amino acids in cell culture (SILAC) is one of the most successful quantitation strategies so far. It is used to metabolically label proteins in cells by replacing selected essential amino acids with stable heavy-isotope-labeled counterparts.103 Recently, SILAC has also been applied in yeast104 and bacteria.105 SILAC is used not only to quantitate cell line samples but also as an internal standard to quantitate tissue samples. However, differences in protein abundance and protein species between tissue and cell samples can lead to inaccurate quantitation. To solve this problem, Geiger et al. combined different SILAC labeled cell lines, an approach called “Super SILAC”, to mimic the tissue sample’s counterpart.106 They found that when only one SILAC labeled cell line was used for the quantitation of a tissue sample, the ratio distribution was broad and bimodal, whereas a much narrower, unimodal distribution was obtained when the Super SILAC internal standard was used. When SILAC is used, arginine to proline conversion is a major issue that affects the quantitation precision. The conversion is metabolic and cell line dependent. Changing the concentration of the corresponding added amino acid may resolve the problem; however, this strategy might change the growth conditions. Park et al.
proposed a computational method to correct this bias by adding up all the isotopic peaks coming from the same heavy peptides.107 Meanwhile, Bicho et al. proposed a genetic engineering solution to this problem in yeast by deleting genes involved in arginine catabolism.108 This is particularly useful because the conversion in yeast is very efficient.

Selected Reaction Monitoring/Multiple Reaction Monitoring. Selected reaction monitoring or multiple reaction monitoring (SRM/MRM) was originally used for small molecules and, more recently, in metabolomics109 and proteomics.110 The transition selection for SRM/MRM is time-consuming and is probably the biggest obstacle for ordinary practitioners. SRM/MRM requires the selection of peptides that are quantifiable surrogates for the proteins of interest and also the selection of transitions. Proteotypic peptides, or signature peptides, that produce the highest ion-current response most likely provide the best detection sensitivity and consistency with the original protein amount. Many factors affect the final quantitation result, such as the peptide length, the amino acid composition, and the sequence motif; for example, glycosylation motifs should be avoided because of the uncertainty of the molecular weight. Furthermore, the uniqueness and the detectability, as well as the elution time, should also be taken into consideration. Therefore, the manual selection of peptides for SRM/MRM is challenging and time-consuming. Fortunately, bottom-up proteomics has generated a huge amount of information on peptides that are readily observable by mass spectrometry and on their associated MS2 spectra. These results have been increasingly deposited in public databases. These peptide databases contain abundant information, which is essential
for the SRM/MRM experimental design, such as fragment abundance, retention time, and peptide abundance, as well as the peptide uniqueness among proteins. The recently published Global Proteome Machine database (GPMDB) can provide some of the information needed for MRM experiment design based on archived observations from previous experiments.111 Sherwood et al. investigated the correlation of y fragment ions generated in ion trap and quadrupole mass spectrometers.112 They found a good correlation between the y-ion intensity rank orders; therefore, the highly responsive y-ions can be directly selected as candidates for SRM/MRM without further optimization. Their findings provide a good basis for choosing transitions from previous MS2 identification information. They further developed an application called MaRiMba for spectral library based MRM transition list assembly.113 Prakash et al. focused on all fragment ions and developed a scoring algorithm that compares the chromatograms from the SRM and data dependent analysis modes to determine the best transitions.114 A prediction method called the enhanced signature peptide (ESP) predictor employs a random forest classifier and protein physicochemical properties to select high responding peptides. This method provides a good in silico screening of transitions before the MRM experiment.115 The ESP predictor outperforms existing methods and could be used to select the best peptides for MRM analysis. Besides the library based methods, one can also select transitions from the protein sequence itself. Bertsch et al. developed an algorithm to construct MRM assays de novo from the sequences of the targeted proteins alone, without any transition library or MS2 spectra of peptides.116 The algorithm predicts the proteotypic peptides and retention times, as well as the transitions.
It is claimed that more than 80% coverage of the targeted proteins can be achieved by this algorithm without any further optimization. Some earlier computational resources were reviewed and compared by Cham Mead et al.117 Besides progress in bioinformatic data mining, there were also some attempts to make the SRM/MRM assay easier to develop. Picotti et al. reported a high throughput SRM method using crude, unpurified peptide libraries synthesized by spot synthesis. They reported a throughput of more than 100 peptide SRMs per hour.118 One of the weaknesses of SRM/MRM is that a sample can only be explored with a priori knowledge. A lot of tedious work needs to be done before the actual experiment. Even the most state-of-the-art prediction algorithm cannot guarantee that every selected transition will succeed. One needs to know the exact m/z of the targets to do the analysis. Therefore, the result can be affected by different PTM states, cleavage forms, mutations, single-nucleotide polymorphisms (SNPs), etc. All this information is usually unknown. As a compromise between the current SRM/MRM and MS2 quantitation methods, Baek et al. proposed an alternative called multiple products monitoring (MpM), which is based on monitoring the majority of product ions obtained in the MS2 scan.119 They used the MS2 spectra instead of MS1 to reconstitute the chromatographic peaks on an ion-trap platform and could therefore avoid the selection of transitions while obtaining improved sensitivity and selectivity.
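The library-based transition selection discussed in this section can be sketched as follows. This is only an illustration of ranking y ions by their previously observed intensity, in the spirit of the finding of Sherwood et al. that highly responsive y-ions can be chosen directly; it is not the MaRiMba implementation. The function `select_transitions`, the peak tuple layout, and all m/z and intensity values are hypothetical.

```python
def select_transitions(precursor_mz, library_peaks, top_n=3):
    """Pick the top_n most intense annotated y ions as SRM transitions.

    library_peaks: list of (ion_label, fragment_mz, intensity) tuples
    taken from a previously identified MS2 spectrum.
    Returns a list of (precursor_mz, fragment_mz, ion_label) tuples.
    """
    y_ions = [p for p in library_peaks if p[0].startswith("y")]
    ranked = sorted(y_ions, key=lambda p: p[2], reverse=True)
    return [(precursor_mz, mz, label) for label, mz, _ in ranked[:top_n]]

# Hypothetical annotated library spectrum for one doubly charged peptide.
library_peaks = [
    ("y3", 350.2, 800.0),
    ("y4", 463.3, 1500.0),
    ("b2", 227.1, 2000.0),   # intense, but not a y ion, so ignored
    ("y5", 576.4, 1200.0),
    ("y6", 689.4, 300.0),
    ("y2", 249.1, 950.0),
]
transitions = select_transitions(500.0, library_peaks, top_n=3)
```

Here the three transitions returned are the y4, y5, and y2 fragments, in decreasing order of library intensity. A practical assay would also check peptide uniqueness, interference, and retention time, as noted in the text.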
■ DATA ANALYSIS
The generation of the data is far from the end of a proteomics experiment. The raw data sets need to be analyzed using bioinformatic tools. This phase is mainly composed of two
parts, database searches for protein identification and in some cases protein quantitation analysis.

General Database Searches. Even though bioinformatic tools for protein identification have been developed for over 15 years, there are still many challenges. In particular, open issues include the speed of some search engines for large-scale analysis, the estimation of the false discovery rate, the assignment of identified peptides to their corresponding proteins, and the fact that different bioinformatic tools provide different results. Most of the mainstream database searching algorithms match the experimentally generated MS2 spectra against theoretical MS2 spectra predicted from the protein sequences. However, more recently, as the databases of experimental MS2 spectra grew, attempts have been made to use real spectra in place of the theoretical spectral library.120 Yen et al. further used a simulated MS2 library to replace the real spectra.121 The simulated spectrum takes into account the ion intensities as well as specific ion types and therefore resembles the real one far more closely. This kind of spectrum-to-spectrum matching greatly increases the search speed. Another way to speed up the process is to index peptides and spectra122 or to reorganize the protein sequences, for example, by longest common prefix (ABLCP).123 Prefiltering the MS2 spectra can also accelerate the search. Prefiltering and postfiltering can also increase the reliability of the identification result. Some filtering strategies were reviewed by Salmi et al.124 Another issue for present database searching strategies is the scoring algorithm, that is, how to fully utilize the information provided by the MS2 spectra, the mass accuracy, and the relative ion abundance.
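As an illustration of the spectrum-to-spectrum matching mentioned above, the following sketch scores a query spectrum against a library spectrum with a normalized dot product over binned peaks. This is a generic similarity measure rather than the scoring function of any cited tool; the 1.0 m/z bin width, the function names, and the example peak lists are assumptions.

```python
import math

def bin_spectrum(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins."""
    binned = {}
    for mz, intensity in peaks:
        key = int(mz / bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned

def dot_score(query, library, bin_width=1.0):
    """Normalized dot product (cosine similarity) of two peak lists."""
    q = bin_spectrum(query, bin_width)
    lib = bin_spectrum(library, bin_width)
    shared = sum(q[k] * lib[k] for k in q if k in lib)
    norm = (math.sqrt(sum(v * v for v in q.values()))
            * math.sqrt(sum(v * v for v in lib.values())))
    return shared / norm if norm else 0.0

# Hypothetical peak lists as (m/z, intensity) pairs.
library_spectrum = [(175.1, 100.0), (262.2, 50.0), (375.2, 80.0)]
same_peptide = [(175.1, 200.0), (262.2, 100.0), (375.2, 160.0)]
unrelated = [(120.1, 10.0), (300.3, 20.0)]

score_same = dot_score(same_peptide, library_spectrum)
score_unrelated = dot_score(unrelated, library_spectrum)
```

Because the score is normalized, a query that differs from the library spectrum only in overall intensity scale scores essentially 1, while a query with no shared bins scores 0. Using relative ion abundances in this way is what distinguishes library matching from matching against unweighted theoretical fragments.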
Recently, the H-score was introduced to rescore results provided by existing search algorithms.125 The H-score employs high mass accuracy matching of all the detected fragment ions and therefore proves to be beneficial for protein PTM analysis. It is also reported to give up to a 190% improvement over the Mascot scoring scheme for low-abundance spectra. At present, for a general proteomics experiment, besides the score that the search engine provides, the false positive/discovery rate at the peptide or protein level is used to estimate the confidence of the result. For some specific experiments, other information can also be used to evaluate and validate the search result, for example, the molecular weight and pI distribution, as well as specific labeling. Volchenboum et al. developed a tool called Validator, which uses the isotopic fragmentation pattern of b and y series ions to facilitate confident assignment.126 Search engines usually produce a set of results for each raw experiment. Increasingly, proteomics involves the comparison of multiple experiments, and tools are needed to keep track of the large data sets, to integrate the search results, and to support further comparative and conclusive analysis. Software such as Prequips,127 MassSieve,128 and ProHits can perform such integration.129 The ProHits software is a laboratory information management system (LIMS) which was initially designed to address the need to track different information in affinity purification coupled with mass spectrometric identification (AP-MS). This system can be readily adapted for the tracking of different information in proteomic experiments. MassSieve is a tool for integrating the results of database searches for multiple LC-MS2 experiments and can also perform relative quantitation by spectral counting. Prequips is a software package for post-database-search analysis.
It allows the comparison of multiple data sets as well as their analysis using tools such as Cytoscape and databases such as STRING and KEGG.
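The false discovery rate mentioned above is often estimated with a target-decoy search, a strategy that this section does not describe explicitly and that is therefore introduced here as an assumption. The sketch below assumes that every peptide-spectrum match (PSM) carries a score and a flag indicating whether it came from a decoy (e.g., reversed) sequence, and it estimates the FDR at a given cutoff as the ratio of decoy to target matches above that cutoff. The function name and the example scores are hypothetical.

```python
def fdr_threshold(psms, max_fdr=0.01):
    """Return the lowest score cutoff whose estimated FDR <= max_fdr.

    psms: list of (score, is_decoy) tuples; higher scores are better.
    The FDR at a cutoff is estimated as decoys / targets among all
    PSMs scoring at or above it (a simple target-decoy assumption).
    Returns None if no cutoff satisfies the requested FDR.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = decoys = 0
    cutoff = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= max_fdr:
            cutoff = score
    return cutoff

# Hypothetical search results: (score, is_decoy).
psms = [
    (90, False), (85, False), (80, False), (75, False),
    (70, True), (65, False), (60, True), (55, False),
]
cutoff = fdr_threshold(psms, max_fdr=0.2)
```

In this toy example, accepting all matches scoring 65 or higher gives five target hits and one decoy hit, i.e., an estimated FDR of 20%. Tied scores and protein-level FDR would need additional handling in practice.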
Specific Database Searching Strategy. Besides the general
demand of protein profiling using the regular proteomic workflows, some biological questions need more specialized experimental designs that include steps such as protein cross-linking and protein labeling as well as some unusual MS methods. The results from these experiments often require different bioinformatic tools.

Cross-Linked Peptide Identification. Until recently, there was no search engine capable of handling cross-linked peptide identification. This was a particular issue for the analysis of protein-protein interactions when chemical cross-linking was required to stabilize the protein complex. Protein cross-linking can occur within a protein and can be used to explore the protein structure.130 However, for the study of protein-protein interactions, the cross-links of interest are those that occur between different proteins within a complex. This situation is complicated, because the number of candidate peptide pairings increases drastically. A new application, called xComb, was developed to construct a specialized database.131 xComb enumerates all possible cross-linked peptides to create a database that can then be used with any general search engine. McIlwain et al. also described a straightforward database search scheme, in which the search database is composed of single peptides and peptides with linkers attached, as well as cross-linked pairs, and a SEQUEST-style search algorithm is then used to do the assignment.132

Optimization for ECD/ETD. CID has been the mainstream fragmentation method in proteomics, so all of the major search engines were optimized for CID spectra, which are dominated by b and y series ions. However, the MS2 spectra from ECD/ETD are quite different from and complementary to traditional CID spectra. Moreover, conventional search engines often return no results from the “seemingly more informative” ETD spectrum. ETD of peptides generates completely different types of fragments: c and z series ions.
Zhang proposed an empirical model to predict ECD/ETD peptide fragmentation spectra, which can help to better design the fragmentation algorithm.133 Kandasamy et al. evaluated several MS2 search algorithms for the analysis of ETD spectra.134 They compared OMSSA, Mascot, Spectrum Mill, and X!Tandem. They found greater differences between algorithms than were previously reported for CID. Moreover, the difference in the number of identified peptides between the best and worst search engines was 70%. In the meantime, some new strategies and even algorithms were developed specifically for searching ETD spectra. Good et al. performed ETD spectral processing to improve peptide identification. They increased the total search sensitivity by approximately 20% for both human and yeast data sets by removing some ETD-specific features that are often unaccounted for and may hinder identification.135 A new database search algorithm was reported that is based on probabilistic modeling of the shared peak count and shared peak intensity between the spectra and the peptide sequences.136 Datta and Bern proposed a new concept called spectrum fusion, which combines the information from several mass spectra of the same peptide. The algorithm can automatically learn the peptide fragmentation patterns and can therefore handle spectra from any instrument and fragmentation technique, which improves de novo sequencing success rates.137 There were also quite a few reports focused on optimizing search algorithms for ECD/ETD data sets, including charge state determination,138 generating function,139 scoring function,140 statistical analysis,141 and site localization of
modification.142 Clearly, ETD spectral processing and the optimization of the search algorithms improve the search results. However, the results obtained so far indicate that more research needs to be devoted to the development of algorithms for the analysis of ETD derived MS2 spectra.

Database Search for “Chimera” Spectra. Most of the database search engines used in proteomics assume that the fragment ions observed in an MS2 spectrum are all derived from the selected parent ion. Unfortunately, the selection of the precursor ion for fragmentation is often performed at low resolution or with a mass window a few amu wide to balance sensitivity, isotope transmission, and accuracy. In a complex mixture, this isolation window might well include isobaric or nearly isobaric peptides of different sequences as well as other molecules. The resulting MS2 spectra are not derived from a single molecule and are called “chimera” spectra.143 Moreover, even with effective gas phase ion isolation, in which a very narrow m/z window of precursor ions is selected, the MS2 spectrum can still come from a mixture of isobaric peptides with the same amino acid sequence or the same PTM composition but with different PTM site occupancies. In current search engines, these chimera spectra will in the best case not match anything and in the worst case lead to a false identification. An application called ChimeraCounter was developed to evaluate the effect of “chimera” spectra.143 It was found that chimera spectra reduce database search scores most significantly when contaminating fragment ion intensities exceed 20%. The identification rate for chimera spectra was 2-fold lower than that for single-peptide spectra. Bern et al. developed a program called DeMux to deconvolute mixed spectra and improve the peptide identification rate by approximately 25%.144 Wang et al. capitalized on available spectral libraries for the identification of mixed spectra generated from more than one peptide.
As a result, they could identify up to 98% of all “chimera” spectra from equally abundant peptides and automatically adjust to varying abundance ratios of up to 10:1. Their strategy can also speed up the database search process by over 5 orders of magnitude.

Quantitation Information Extraction. Increasingly, proteomic experiments require the relative quantitation of proteins. The quantitation information needs to be extracted before or after the peptide identification. The standard analytical approach for quantitation utilizes the chromatographic elution peak area. This can be done manually when dealing with a few analytes. However, in proteomics, up to a few hundred thousand features might need to be quantified, which cannot be done manually. The difficulty then lies in how to reconstruct all the chromatographic peaks, extract the area under the peaks accurately, and derive a measure of confidence from the raw data. This is further complicated by the need to normalize between samples to correct for deviations caused by sample and analytical fluctuations. Although mass spectrometers can generate a massive amount of raw quantitation information, the bottleneck remains the lack of software to analyze these data sets. Fortunately, some software has been introduced in the past few years. For example, FLEXIQuant can be used for experiments in which absolute quantitation of isotopically labeled proteins and peptides with PTMs is of interest.145 Another algorithm called ICPLQuant can extract quantitative information using MS scans from LC-MALDI and peptide mass fingerprint experiments.146 MaXIC-Q Web is a quantitation tool that can deal with various labeling strategies from all popular platforms.147 A concept of projected ion mass
spectrum (PIMS) was coined to represent the elution profile of each ion. MaxQuant is another very popular software package for quantitative proteomics using the LTQ-Orbitrap platform. It was initially designed for highly accurate quantitative analysis of SILAC labeled samples.148 In the latest implementation, it provides an independent database searching engine and supports label free quantitation. In addition, new features were added, such as search algorithms for all ions fragmentation and ETD/HCD, as well as the readout of a second peptide from a single MS2 spectrum.64 MaxQuant has a suite of algorithms for isotopic peak detection, mass calibration, statistical evaluation, alignment between LC runs, and protein quantitation. It is especially suitable for large scale data sets. In quantitative proteomics, the normalization of signal among different experiments is an important problem regardless of the labeling strategy employed. Many factors may create overall fluctuations of signals for proteins across different biological and technical repeats. These fluctuations may come from differences in protein extraction efficiency, protein digestion, chromatographic elution, ionization efficiency, etc. Griffin et al. described a normalized, label free quantitative method that combines three MS abundance features (peptide counts, spectral counts, and fragment ion intensity) into a spectral index (SIn) that reflects the protein abundance.149 The SIn index provides a simple and useful tool for label free quantitation.150

Proteomics Databases. In the early days of proteomics, publications contained relatively limited information on the mass spectrometric results. Although large data sets were being produced and reported, the information remained primarily with the individual research group. More recently, the emphasis has shifted toward making all the data sets publicly available.
This is possible because of increased Internet connectivity, cheaper data storage, and, more importantly, the systematic efforts to develop tools and repositories. For example, the online database PHOSIDA (http://www.phosida.com)151 is a rapidly growing repository of PTMs obtained from high-resolution data sets. It contains phosphorylation, N-glycosylation, and acetylation sites from nine different species. At the time of writing this article, it contained over 70 000 phosphorylation sites, 3000 acetylation sites, and 7000 N-glycosylation sites. Prediction and motif analysis tools are also associated with PHOSIDA. Phosphomouse is another phosphoprotein database, which contains tissue-specific expression information based on the work of Huttlin et al.25 PhosphoSitePlus (http://www.phosphosite.org) is a large database with multiple modifications and data sources. To date, it contains over 95 000 nonredundant phosphorylation sites and over 10 000 other post-translational modifications. Mass spectrometric data sets can also be deposited in the PRIDE repository (http://www.ebi.ac.uk/pride/init.do), the Global Proteome Machine Database (http://gpmdb.thegpm.org/), PeptideAtlas (http://www.peptideatlas.org/), and other sites. ProDaC was created to develop documentation and storage standards, set up a standardized data submission pipeline, and collect data.152 The PRIDE Converter is a tool that makes it straightforward to submit proteomics data to PRIDE from most common data formats.153 The rapid acceptance of data repositories has been astounding. For example, the PRIDE core version 2.8.8 contains over 4.5 million identified proteins, nearly 24 million identified peptides, and over 135 million spectra.
Although it is still early in terms of what can be done with this massive amount of information, we have already seen concrete examples of the usefulness of such repositories, the foremost being the development of SRM/MRM assays.
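To make the peak-area quantitation and between-sample normalization discussed under Quantitation Information Extraction more concrete, the following minimal sketch integrates an extracted ion chromatogram with the trapezoidal rule and then rescales each run to a common median. It is not the method of any package cited above; the function names, the toy data, and the choice of median scaling are assumptions made purely for illustration.

```python
from statistics import median


def peak_area(times, intensities):
    """Trapezoidal area under an extracted ion chromatogram (XIC)."""
    if len(times) != len(intensities):
        raise ValueError("times and intensities must have the same length")
    area = 0.0
    for i in range(1, len(times)):
        width = times[i] - times[i - 1]
        area += 0.5 * (intensities[i] + intensities[i - 1]) * width
    return area


def median_normalize(runs):
    """Rescale every run so that all runs share the same median feature area.

    `runs` maps a run name to a {feature_id: area} dictionary.
    Median scaling is one simple choice among many (an assumption here).
    """
    medians = {name: median(areas.values()) for name, areas in runs.items()}
    target = median(medians.values())
    return {
        name: {feat: a * target / medians[name] for feat, a in areas.items()}
        for name, areas in runs.items()
    }


# Toy XIC: a symmetric peak sampled every 0.1 min (hypothetical values).
xic_times = [10.0, 10.1, 10.2, 10.3, 10.4]
xic_intensities = [0.0, 50.0, 100.0, 50.0, 0.0]
area = peak_area(xic_times, xic_intensities)

# Two hypothetical runs; run_B was loaded at twice the amount of run_A.
runs = {
    "run_A": {"pep1": 100.0, "pep2": 200.0, "pep3": 400.0},
    "run_B": {"pep1": 200.0, "pep2": 400.0, "pep3": 800.0},
}
normalized = median_normalize(runs)
```

After normalization the two toy runs report the same area for each feature, which is the behavior one wants when the only difference between runs is a global loading factor. Real data sets also need peak detection, retention time alignment, and a confidence measure, none of which is attempted here.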
Read out More from MS. A stress test was carried out by the HUPO Test Sample Working Group.154 The sample was composed of equimolar purified proteins, with each protein containing at least one unique tryptic peptide. Among the 27 laboratories that participated in the stress test, only one was able to report all of the tryptic peptides, and seven could report all of the 20 proteins in the sample. This was clearly a significant problem that pointed to interlaboratory issues in sample handling, analytical techniques, and database search approaches. Interestingly, after the raw files from the laboratories were collected and reanalyzed, the situation improved. This means that the raw data sets did include information on most proteins; however, differences in the performance of the algorithms used for data processing and database searching led to variation in the number of proteins identified. The data acquisition strategy can also be improved by including other features such as mass lists containing m/z, charge state, and retention time.155 It was reported that 24% more proteins can be quantified using the targeted data acquisition approach, and the precision of quantitation improved by >1.5-fold. Another strategy, called Index-ion triggered MS2 Ion quantitation, permits the reproducible acquisition of full MS2 spectra of targeted peptides independent of their ion intensities.156 Another important issue with most current proteomics workflows is the low number of MS2 spectra that match database entries. The unmatched MS2 spectra can come from peptides with an unknown PTM state, mutations,157 different splice forms, or different fragmentation patterns; they can also be “chimera” spectra,143 be of low quality, or not originate from peptides at all. Current shotgun proteome searching relies on databases that only contain known protein sequences and handles PTMs as variable modifications. Moreover, these databases rarely include SNPs and mutations.
This will be an issue considering the current effort from the 1000 Genomes Project.158 Already, 5 million SNPs and 1 million short insertions/deletions have been reported by this project. How this information will be incorporated into proteomic strategies is still unclear. To confront all these issues, we will need to improve the search algorithms and the database content. A few attempts have been reported to address some of these issues, for example, the identification of two peptides from a single MS2 spectrum,143,144,159 the identification of mutations,160 and the prediction of novel modifications.161 Moreover, a new algorithm called PILOT_PTM was developed for the untargeted identification of PTMs.162
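The following sketch illustrates why a sequence variant that is absent from the search database goes unmatched, and how a target mass list with m/z and charge state can be derived from a sequence. It performs an in silico tryptic digest, computes monoisotopic peptide masses, and lists precursor m/z values for a reference and a variant sequence. The protein sequence, the substitution, and all function names are hypothetical; retention time is left out because it cannot be derived from the sequence this simply.

```python
import re

# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # added once per peptide (free N- and C-termini)
PROTON = 1.007276   # mass of the charge carrier


def tryptic_peptides(sequence):
    """Cleave after K or R unless followed by P; no missed cleavages.

    Splitting on an empty match requires Python 3.7 or later.
    """
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]


def monoisotopic_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def precursor_mz(mass, charge):
    """m/z of the [M + zH]z+ ion."""
    return (mass + charge * PROTON) / charge


def substitute(sequence, position, new_residue):
    """Return the sequence with one residue replaced (0-based position)."""
    return sequence[:position] + new_residue + sequence[position + 1:]


def inclusion_list(sequence, charges=(2, 3)):
    """Target list entries: peptide, charge state, and precursor m/z."""
    return [
        {"peptide": p, "charge": z, "mz": precursor_mz(monoisotopic_mass(p), z)}
        for p in tryptic_peptides(sequence)
        for z in charges
    ]


reference = "GASKPVTRLLEDK"              # hypothetical protein fragment
variant = substitute(reference, 11, "G")  # hypothetical D -> G substitution

ref_peptides = tryptic_peptides(reference)
var_peptides = tryptic_peptides(variant)
mass_shift = monoisotopic_mass(var_peptides[1]) - monoisotopic_mass(ref_peptides[1])
targets = inclusion_list(variant)
```

In this toy case the variant peptide differs from the reference peptide by about 58 Da, far outside any realistic precursor tolerance, so a search against the reference sequence alone cannot assign its spectrum.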
■ PROTEOMIC APPLICATIONS
The rapid development of novel technologies and bioinformatic tools is also reflected in the number of proteomic applications reported. It is clear that many proteomic laboratories around the world are capable of generating a massive amount of information for different biological systems. However, in general, it is less clear what is being done with this information. Very few proteomic studies include or are followed up by biological validation of the results. Is proteomics going to be defined as a massive data generation exercise or as a new path to better understand biological processes and to define new biological hypotheses? Clearly, defining the future of proteomics in terms of data generation is easy, and the community just needs to keep on putting samples on mass spectrometers. However, defining the future of proteomics in terms of better understanding biological processes and generating new biological hypotheses, in our opinion, might be more fruitful. This will require more refined experimental designs and follow-up biological validations. Proteome Profiling. Large-scale proteome profiling is ongoing in most proteomic laboratories, with an increasing number of proteins and post-translational modifications being reported. It is not feasible to review all of the reports on proteome profiling from the last two years. Instead, we chose to highlight a few reports that illustrate the trends in the field. The post-translational modification most studied by proteomics is protein phosphorylation, and large numbers of phosphorylation sites have been reported. In particular, Wisniewski et al. performed an in-depth analysis of phosphorylation sites in mouse brain.2 Several enrichment and fractionation methods were combined, resulting in 12 035 phosphorylation sites on 4579 brain proteins, of which 8446 were novel sites. Huttlin et al. performed proteomic and phosphoproteomic characterization of nine mouse tissues.25 A total of 12 039 proteins were identified, including 6296 phosphoproteins carrying 36 000 phosphorylation sites. Other post-translational modifications have also seen a drastic increase in the number of sites reported. For example, Cao et al. used hydrophilic affinity (HA) and hydrazide chemistry (HC) to capture N-glycosylated peptides and identified 300 different glycosylation sites from 194 unique glycoproteins.163 A larger scale N-linked glycosylation profiling was performed in which lectin was used to enrich the glycosylated peptides, and 18O labeling by PNGase was used to identify the modification sites.164 The mass spectrometric analysis was done on the newest LTQ-Orbitrap Velos. They found 6367 N-glycosylation sites on 2352 proteins from mouse brain, liver, kidney, heart, and blood plasma.
The application of proteomics to better understand stem cell biology has attracted increasing attention. Gu et al. profiled the cell surface proteins of stem cells. The cell surface proteins were purified by biotin labeling.165 About 1000 membrane and secreted proteins were identified using this approach. Li et al. also performed a large-scale phosphoproteome analysis of mouse embryonic stem cells.166 They detected 4581 proteins and 3970 high-confidence distinct phosphosites in 1642 phosphoproteins. They also found 39 novel phosphosites from 22 prominent phosphorylated stem cell marker proteins. Our lab recently reported a large-scale quantitative analysis of human embryonic stem cells.167 The quantitative changes in over 2000 proteins from 50 000 human embryonic stem cells were reported. Biomarker Discovery. Biomarker discovery has been the best example of overpromising and underdelivering in proteomics. Many research groups have attempted to find novel biomarkers using the proteomic tools of the day against very complex biological fluids. Body fluids, such as serum/plasma,168 urine,169 and cerebrospinal fluid,170 are considered easily accessible samples for biomarker discovery. In most instances, the technology is simply not able to provide the dynamic range and handle the protein complexity present in biological fluids such as plasma and serum. This is further compounded by issues in sample handling and follow-up validation. Surface-enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF-MS or SELDI), which 10 years ago was paraded as the future of biomarker discovery, has led to many potential biomarkers. However, none of the potential biomarkers discovered by this approach have entered routine clinical practice.171
Bottom-up proteomics has also been used for biomarker discovery. The high-abundance proteins in these body fluids, such as complement factors in plasma, have been extensively reported as probable biomarkers. For example, in the last two years, roughly 150 articles reported proteins from the complement system as biomarkers or putative/potential biomarkers. To illustrate the issue, the protein complement C3 has been reported by proteomics to be a potential marker in depression,172 non-small-cell lung cancer,173 HIV-associated neurocognitive disorders,174 idiopathic pulmonary arterial hypertension,175 and Alzheimer's disease.176 The specificity of these biomarkers is in question and will need to be assessed. It has been proposed that combinations of different potential biomarkers will provide the necessary specificity. Here again, we have not seen any proteomic reports that clearly demonstrate that possibility. Overall, the jury is still out on the ability of proteomics to discover biomarkers that are specific to a particular disease. It might just be that better technologies are needed to approach this problem. To date, almost all the proteomic technologies applied to biomarker discovery have consisted of sample processing, LC-MS methods, and bioinformatic analysis.177 More recently, different approaches have been proposed. For example, Li et al. used subcellular proteomics and bioinformatic analysis to compare protein expression between a lung cancer cell line and human bronchial epithelial cells to investigate the mechanism of the epithelial-mesenchymal transition phenotype in lung cancer.178 SRM/MRM assays have also been proposed as a more sensitive approach for biomarker studies.179 Keshishian et al.
demonstrated that simple sample processing and stable isotope dilution multiple reaction monitoring mass spectrometry (SID-MRM-MS) can be used to readily configure multiplexed assays to quantitate clinically relevant proteins in patient plasma with concentrations that span 4 orders of magnitude.177b To bridge the gap between biomarker discovery and validation, Whiteaker et al. developed a technique named stable isotope standards with capture by anti-peptide antibodies (SISCAPA) coupled to multiple reaction monitoring (MRM) mass spectrometry, which provides an attractive alternative to ELISA for building large numbers of quantitative assays.177a Functional Proteomics. It is well-known that post-translational modifications (PTMs) of histones play a critical role in the control of gene transcription and epigenetics. The combinations of histone modifications, also called the histone code, store rich information with respect to gene regulation. However, it is very difficult to systematically study the histone code because it varies even from cell to cell. Young et al. developed a high-throughput method to characterize the combinatorial histone code.180 They utilized “saltless” pH gradient weak cation exchange-hydrophilic interaction liquid chromatography. The configuration has good resolution, can resolve the isobaric trimethyl and acetyl modifications, and therefore, with the help of ETD, is capable of identifying all of the major combinatorial histone codes present in a sample. They reported over 200 H3.2 forms and 70 H4 forms, including some remarkably highly modified forms, like H3.2 K4me3K9acK14acK18acK23acK27acK36me3. Jung et al. focused on the dynamic methylation and acetylation on Lys-27 and Lys-36 of histone H3.2 and H3.3 in mouse embryonic stem cells.181 They found that the reduction in H3K27 methylation and the increase in H3K27 acetylation were accompanied by H3K36 acetylation and methylation.
Figure 3. Flowchart of chemical proteomics. In activity based protein profiling (ABPP), the reagent contains a reactive group, a linker, and a tag; while in compound-centric chemical proteomics (CCCP) the reagent contains a bioactive compound, a linker, and a sepharose or agarose bead. The target proteins can be identified by comparing proteins from a drug pulldown experiment (positive matrix) to proteins from a control experiment with an inactive drug analog (negative matrix). The interaction can be verified by label-free or labeling quantification strategies, like SILAC at the protein level, or by ICAT after affinity enrichment, and iTRAQ, dimethyl labeling, or 18O labeling at the peptide level.
To date, most proteomic studies have not taken into account gene copy number. Gene copy number variation can be inherited or introduced through mutations (for example, in cancer). However, it is still not known to what extent gene copy number variation affects protein expression levels. Geiger et al. performed a large-scale proteomic analysis of gene copy number in cancer development.182 They compared the protein expression levels of cancer cells and normal cells in the context of the gene copy numbers. They found that only a small part of the changes in protein expression level can be explained by gene copy number variation. Protein-DNA/RNA interaction is another type of macromolecular interaction besides protein-protein interaction, and it plays key roles in gene transcription and protein translation. At present, affinity purification is still the main method for this kind of interaction analysis. The mChip technology developed in our lab is a useful tool to analyze DNA-related protein networks183 and is discussed later. A pull-down strategy using a DNA sequence as bait, combined with SILAC labeling, was used to capture transcription factors for the analysis of protein-DNA interactions.184 The interactions could be verified by the protein ratios between sample and control, which distinguish specific from nonspecific interactions. The strategy identified several transcription factors that had not been previously reported to be present on the fully methylated CpG island upstream of the human metastasis associated 1 family, member 2 gene promoter. Similar methods could also be used to explore RNA-protein interactions.185 Chemical Proteomics. Chemical proteomics is a promising tool in biology for post-translational modification detection and enzyme profiling. Chemical proteomics is also one of the most
powerful approaches for investigating interactions between small drug molecules or natural compounds and their target proteins, and thereby for discovering the roles of these compounds in biological systems. There are two major alternative approaches in chemical proteomics: activity based protein profiling (ABPP) and compound-centric chemical proteomics (CCCP)186 (Figure 3). An ABPP probe usually consists of three major elements: a reactive group, which usually contains an electrophile that can form a covalent bond to a nucleophilic residue in the active site of an active enzyme; a tag, which is used for detecting or enriching target proteins from the complex protein mixture, usually either a reporter such as a fluorophore or an affinity tag such as biotin; and a linker, which connects the reactive group and the tag in the chemical probe, such as an alkyl chain or a polyethylene glycol (PEG) chain. CCCP is similar to ABPP, except that the reactive group is replaced by a bioactive compound (a small drug molecule or a natural compound) and the tag by a matrix, such as sepharose or agarose. ABPP is employed for profiling the enzyme activity of a set of proteins or a specific protein family in biology, while CCCP is used for investigating the molecular mechanism of action of a small drug molecule or a natural compound in drug discovery. Chemical Proteomics in Biology. Some post-translational modifications, such as O-linked N-acetylglucosamine modification, palmitoylation, prenylation, and others, are not readily detected by conventional proteomic methods. Instead, chemical proteomic approaches have been developed to enrich the PTMs based on specific chemical groups, resulting in improved identifications. For example, Wang et al. reported a novel method for the identification of O-linked N-acetylglucosamine modified peptides.187 Briefly, the O-GlcNAc-modified peptides were biotinylated with a photocleavable PC-PEG-biotin-alkyne reagent by hydrazine chemistry, isolated on an avidin column, cleaved by photochemistry, and analyzed by LC-MS2 using electron transfer dissociation. This approach resulted in the identification of eight O-GlcNAc sites on seven proteins (including Tau and α, β, and γ synucleins) present in a tau-enriched fraction from rat brain. Martin et al. established a large-scale protein S-palmitoylation profiling method, which used the palmitic acid analog 17-octadecynoic acid (17-ODYA) for in situ metabolic labeling of palmitoylated sites, followed by reaction with rhodamine-azide or biotin-azide via the Cu(I)-catalyzed azide-alkyne [3 + 2] cycloaddition reaction (click chemistry). They identified roughly 125 predicted palmitoylated proteins, including G protein receptors and a family of uncharacterized hydrolases, the FAM108 proteins.188 They also established that the palmitoylation of FAM108 proteins is important for their anchoring to the plasma membrane. Nomura et al. applied activity based protein profiling (ABPP) using serine hydrolase-directed fluorophosphonate activity based probes to identify hydrolytic enzyme activities involved in cancer pathogenesis by comparing different cell lines. Out of the 50 serine hydrolases observed, they showed that the enzyme monoacylglycerol lipase (MAGL), which regulates a fatty acid network, is highly expressed in aggressive human cancer cells and primary tumors.189 Weerapana et al. reported an approach, named isoTOP-ABPP (isotopic tandem orthogonal proteolysis activity based protein profiling), for quantitative analysis of native cysteine reactivity.
In this approach, the proteome is first labeled with an electrophilic iodoacetamide (IA) probe, which labels cysteine residues; this is followed by the attachment, by click chemistry, of a light or heavy tobacco etch virus (TEV) tag that includes biotin. The TEV tag is used for avidin enrichment of the labeled proteins on beads. This is followed by trypsin digestion, which releases the nonattached peptides, and then TEV digestion, which releases the attached peptides.190 Mass spectrometry was used to quantitate the changes in the probe-labeled cysteines. Using this approach, they identified over 800 probe-labeled cysteines, a subset of which were hyper-reactive cysteines. Chemical proteomics approaches that rely on detection techniques other than mass spectrometry have also been developed. For example, Nguyen et al. described a novel approach for studying protein prenylation that uses a set of engineered protein prenyltransferases that can transfer a functionalized isoprenoid, biotin-geranylpyrophosphate (BGPP), to protein substrates, enabling a quantitative proteome-wide analysis of protein prenylation.191 Charron et al. developed an approach that uses alkynyl-fatty acid chemical reporters for the fluorescence detection of fatty-acylated proteins in mammalian cells.192 Chemical Proteomics in Drug Discovery. Another interesting application of chemical proteomics is the discovery of potential targets of drugs. In this approach, a drug is modified to incorporate a tag that can be used for affinity purification. This can be done by adding the tag directly to the drug or in a two-step process using click chemistry. For example, Fleischer et al. recently used chemical proteomics to identify the target of a drug called CB30865, which is an effective cytotoxic compound of unknown mechanism.193 They incubated the drug immobilized on beads with cell extracts, with or without free drug, to purify potential drug targets.
Using mass spectrometry, they found that nicotinamide phosphoribosyltransferase (Nampt), an enzyme performing the first step of nicotinamide conversion to NAD, was one target of CB30865. The reverse application, in which a protein target of interest is screened against immobilized compound libraries, has also been reported. For example, Miyazaki et al. studied the human protein pirin for its association with cancer malignancies. Although the function of pirin is still unknown, it binds to the proto-oncogene product Bcl3. To find a potential small molecule that binds to pirin, they screened a chemical array of 20 000 compounds for binding of pirin. From this array, they identified a pirin inhibitor named triphenyl compound A (TPh A). They showed that this compound inhibits the interaction between pirin and Bcl3 and inhibits the migration of melanoma cells by suppressing SNAI2 expression.194 Top-down Proteomics. The top-down proteomics strategy deals with intact proteins, which have structural diversity and a broad range of physical and chemical properties, and can provide a better understanding of protein expression levels and PTM information than the bottom-up strategy.195 High resolution, high mass accuracy, and high sensitivity of mass spectrometry are needed for the top-down strategy. Currently, this strategy is limited to proteins with MW < 50 kDa.196 Recently, top-down proteomics was applied in biomarker discovery,197 protein structural characterization,198 the study of PTMs,199 protein domain-domain interaction,200 and protein quantitation.201 The technological development in top-down proteomics focused on the following areas: (1) Protein-compatible LC separation for high-throughput and sensitive analysis. For example, Mohr et al. demonstrated that monolithic columns have clearly superior protein recovery and lower carryover than silica-based stationary phases, in a mass range covering 5.7–150 kDa.202 (2) Protein ionization, MS detection, and fragmentation methods.
For example, infrared multiphoton dissociation (IRMPD) was implemented in a dual pressure linear ion trap, facilitating more accurate mass identification and streamlining product ion assignment.203 It was also reported that ETD or ECD is more suitable for top-down analysis than CID because it can provide more fragments for large proteins and protein PTMs are more stable in this type of dissociation.204 (3) Bioinformatics tools. Liu et al. developed a combinatorial algorithm for spectral deconvolution, which improved on Thrash and Xtract in both the number of correctly recovered monoisotopic masses and speed.205 (4) Integrating top-down and bottom-up strategies. Pflieger et al. combined bottom-up and top-down analyses for a comprehensive proteomics study of C1q (a subunit of the C1 complex) and demonstrated the usefulness of combining the two complementary analytical approaches to obtain a detailed characterization of the post-translational modification pattern.206
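As a simple illustration of the deconvolution step that top-down bioinformatics tools must perform, the sketch below recovers the charge state and neutral mass of an intact protein from two adjacent peaks of its electrospray charge-state envelope. This is the textbook two-peak calculation, not the combinatorial algorithm of Liu et al.; the function names and the example mass of 16 951.0 Da are assumptions chosen only for the demonstration.

```python
PROTON = 1.007276  # mass of the charge carrier (Da)


def mz_of(mass, charge):
    """m/z of the [M + zH]z+ ion of a molecule with the given neutral mass."""
    return (mass + charge * PROTON) / charge


def charge_from_adjacent_peaks(mz_low, mz_high):
    """Charge z of the peak at mz_high, given that mz_low carries z + 1.

    From mz_high = M/z + p and mz_low = M/(z + 1) + p it follows that
    z = (mz_low - p) / (mz_high - mz_low).
    """
    if mz_high <= mz_low:
        raise ValueError("mz_high must be greater than mz_low")
    return round((mz_low - PROTON) / (mz_high - mz_low))


def neutral_mass(mz, charge):
    """Neutral mass from an m/z value and its charge state."""
    return charge * (mz - PROTON)


# Hypothetical intact protein observed at charge states 10+ and 11+.
true_mass = 16951.0
peak_10 = mz_of(true_mass, 10)
peak_11 = mz_of(true_mass, 11)

z = charge_from_adjacent_peaks(peak_11, peak_10)
recovered = neutral_mass(peak_10, z)
```

With noise-free peaks the calculation returns the original charge and mass exactly; real spectra additionally require isotopic envelope fitting and handling of overlapping species, which is where dedicated deconvolution algorithms come in.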
■ PROTEIN-PROTEIN INTERACTION
One of the successful applications of proteomics has been the identification of protein-protein interactions. In previous reviews, we have extensively discussed the techniques for the identification of protein-protein interactions, such as affinity purification coupled to MS, yeast two-hybrid, and other techniques.207 Protein interactions with other molecules such as lipids and metabolites are also being increasingly studied. One such example is a study by Gallego et al. that identified 530 proteins interacting with lipid molecules; most of these interactions were novel.208 Another example is a large-scale study by Li et al. that looked at the interaction between proteins and metabolites.209 Here, we will focus on new trends in studying protein-protein interactions, especially the use of affinity purification combined with quantitative mass spectrometry to identify protein interaction networks. Large Scale Protein Interaction Networks. As the field of protein interaction mapping matures, more emphasis has been given to the validation and scoring of interaction data.210 Recent large-scale interaction studies have employed more rigorous experimental design protocols, independent assay validation, and statistical analysis to avoid the high false-positive rates associated with the earlier large-scale interaction studies.211 Studies have also focused on either reanalyzing interaction data sets or predicting protein interactions using a computer model based on structure, sequence, or gene expression.212 Another trend in protein interaction mapping has been the study of subgroups of proteins. For example, we recently reported interaction networks of chromatin-related proteins from Saccharomyces cerevisiae. Interaction networks of these proteins were studied using the mChip, a technique developed to enrich for chromatin-bound protein interactions. This technique is very similar to conventional chromatin IP protocols, with the exception that gentle sonication and mild clarification of the cellular lysate are used to preserve protein-DNA complexes.183 Originally, the mChip was used to identify protein interactions of H2A and Htz1 and was then expanded to 102 chromatin-related proteins.213 The raw data contained about 9000 interactions between 900 proteins. These data were further analyzed, reducing the number of interactions to 2966 between 724 unique proteins.
While the data overlap with previous data sets obtained from other protein interaction studies, many new interactions were discovered. The identification of these novel interactions is due to the gentle nature of the mChip, which preserves protein-protein interactions while the bait protein remains associated with DNA. The second example is the study by Breitkreutz et al. of the protein interaction network of kinases and phosphatases. In this study, kinases and phosphatases of interest were affinity purified using different tags and expression systems and then digested directly on magnetic beads. In total, close to 150 proteins were tagged and immunopurified to generate 1844 unique interactions. Nonspecific interactions were eliminated using in-house software called SAINT (Significance Analysis of Interactome), which assigns a significance value to each interaction.214 Another study, published by Mak et al., introduced a new system called MAPLE (mammalian affinity purification and lentiviral expression).215 This system integrates a lentivirus expression system that uses Gateway technology with some of the most common epitope tags, such as FLAG and His tags. Integration of these features in the expression vector allows for its expression in various cell types and makes it compatible with cDNA obtained from various commercial sources. It also allows for stable integration of the expression cassette, generating an expression level that is comparable to that of the endogenous protein. The system was tested on the RNA polymerase II complex as well as other transcription/chromatin-related proteins. One of these proteins was KLF4, one of the important transcription factors required to induce pluripotency in most cell types.216 By studying this protein and other proteins associated with it, the
authors were able to unravel a new role for KLF4 during cell reprogramming through chromatin remodeling.215 Quantitative Protein Interaction Studies. SILAC is the most popular quantitative technique used in combination with AP-MS, as it allows for the labeling of proteins in cell cultures. SILAC labeling allows the affinity purification step to be performed on a protein mixture derived from the control and the bait sample instead of requiring two separate affinity purification steps. This also reduces the variability between the two samples and allows for less stringent washes. Baker et al. utilized this method to study the static and dynamic interactome and the change in phosphorylation status of a circadian rhythm protein called FREQUENCY (FRQ).217 The authors were able to show that the interaction of FRQ with other proteins is dependent on the circadian rhythm and that its interactome changes depending on the time of day. In addition, FRQ is heavily phosphorylated, at more than 75 residues. Most of these phosphorylation events are located within two distinct regions of its sequence.217 This study serves as a good example of the advantage of combining quantitative interaction studies with the identification of PTMs to better understand the function of a protein of interest. Another example of this approach is a study performed by Xu et al. on the role of unconventional ubiquitin chains in protein degradation.218 A less popular approach involves the separate purification of protein complexes prior to their combination. Even though this technique allows the AP step to be combined with any quantitative proteomic technique, it risks introducing experimental variation. This approach was used for the identification of the RAD52 protein complex.219 Label-free quantitation using spectral counts offers an alternative to isotope labeling. The main advantage of this technique is the lower cost.
For example, Sardiu et al. combined computational and quantitative proteomics approaches to study the interactome of Saccharomyces cerevisiae Rpd3 protein complex.220 In this study, 11 subunit proteins of the Rpd3 complex were purified generating 429 protein interactors that were reduced to 89 protein interactors after applying singular value decomposition to the data set.
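The ratio-based discrimination between specific and nonspecific interactors in SILAC AP-MS experiments, described in this section, can be sketched as follows. The example assumes that the bait sample is heavy labeled and the control is light, summarizes each protein by the median log2 heavy/light peptide ratio, and flags proteins above a cutoff. The twofold cutoff, the protein names, and the ratios are all hypothetical and are not taken from any study cited here.

```python
from math import log2
from statistics import median


def protein_log2_ratios(peptide_ratios):
    """Median log2(heavy/light) per protein from peptide-level ratios.

    `peptide_ratios` maps a protein name to a list of positive H/L ratios.
    Proteins without any quantified peptide are skipped.
    """
    return {
        protein: median([log2(r) for r in ratios])
        for protein, ratios in peptide_ratios.items()
        if ratios
    }


def specific_interactors(peptide_ratios, min_log2_ratio=1.0):
    """Proteins enriched in the bait purification above the cutoff.

    The default cutoff of 1.0 (twofold) is an arbitrary assumption.
    """
    summary = protein_log2_ratios(peptide_ratios)
    return sorted(p for p, value in summary.items() if value >= min_log2_ratio)


# Hypothetical H/L peptide ratios (bait = heavy, control = light).
example = {
    "BAIT": [8.0, 16.0, 8.0],
    "PARTNER": [4.0, 4.0],
    "KERATIN": [1.0, 0.5, 2.0],   # typical background: ratio near 1
    "UNQUANTIFIED": [],
}

summary = protein_log2_ratios(example)
hits = specific_interactors(example)
```

Background proteins bind the affinity matrix equally in both samples and therefore cluster around a log2 ratio of zero, whereas true partners follow the bait. In practice the cutoff would be set from the observed ratio distribution and replicate measurements rather than fixed in advance.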
PERSPECTIVES
Proteomics continues to see widespread application across different areas of biology, from the study of fundamental biological processes to the study of animal models of disease, clinical samples, and biomarker discovery. These applications are sustained by the development of proteomic technologies, mass spectrometry, and bioinformatics. The large-scale application of proteomics for global profiling is generating a massive amount of information which, thanks to data repositories, is available in open format. Clearly, our ability to generate large-scale proteomic data sets outpaces our ability to validate the roles of all of the interesting proteins. Until recently, most proteomic studies reported only lists of proteins (either identification or quantitation) with very little, if any, biological validation. More recently, we have seen an increase in the number of papers that combine proteomics with biological validation of a handful of proteins. These reports consistently highlight the fact that global profiling by proteomics, when used in a well-defined experiment, can lead to the generation of new hypotheses testable using classical biochemical approaches. Proteomics is also fueling the creation of data repositories from raw proteomic data sets, quantitative proteomics, post-translational modifications, and protein-protein interactions. Although the tools to combine these different data sets are still limited, there has been an increase in combining different proteomics/genomics data sets to help in the interpretation of proteomic results. For example, quantitative proteomic results can be mapped on protein interaction networks and pathways to highlight enriched networks and pathways and to pinpoint proteins and complexes of interest for further biological experiments. In many instances, the interest is not the global picture of the proteome but instead the roles and functions of very specific proteins that represent targeted biological questions.221 Here, each proteomics experiment represents a biological question involving a limited number of partners. These experiments can be repeated over multiple target proteins of interest and in combination can form large data sets, for example, of protein-protein interactions. We expect to see a continuous increase in the large-scale mapping of PTMs, such as phosphorylation, glycosylation, acetylation, and methylation. Although the functions of most of the reported PTM sites remain unknown, they represent an invaluable resource for the biological community. Moreover, we also expect that combined quantitative proteomic studies of PTMs and protein levels on global or targeted scales will help pinpoint proteins for further biological studies. Finally, we anticipate that technology development will remain the driving force of proteomics for the foreseeable future, in particular, the development of enrichment technologies for specific subgroups of proteins, PTMs, and subcellular proteomes; of technologies for minute amounts of protein; and of bioinformatic tools to integrate different experiments, mine data repositories, and represent data.
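One common way to decide whether a pathway is "enriched" among the proteins highlighted by a quantitative experiment is a one-sided hypergeometric test. The review does not prescribe a specific statistic, so the following is only a minimal sketch under that assumption; the function name `hypergeom_enrichment_p` and all numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, pathway_size, selected, overlap):
    """One-sided hypergeometric p-value P(X >= overlap).

    population:   number of quantified proteins (background)
    pathway_size: how many of those belong to the pathway or network module
    selected:     number of proteins flagged as changing
    overlap:      flagged proteins that fall in the pathway
    """
    upper = min(pathway_size, selected)
    total = comb(population, selected)
    # Sum the probability mass for every overlap at least as large as observed
    tail = sum(
        comb(pathway_size, i) * comb(population - pathway_size, selected - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical example: 1000 quantified proteins, a 20-protein pathway,
# 50 proteins flagged as changing, 5 of which are pathway members.
p_value = hypergeom_enrichment_p(1000, 20, 50, 5)
print(f"enrichment p-value: {p_value:.4g}")
```

When many pathways are tested, the resulting p-values would normally be corrected for multiple testing before proteins are selected for follow-up.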
AUTHOR INFORMATION
Corresponding Author
*Phone: 613-562-5800 ext 8674. Fax: 613-562-5655. E-mail: dfi
[email protected].
BIOGRAPHIES
Zhibin Ning obtained his B.S. degree in life science at Shandong Normal University, China, in 2003. He received his Ph.D. degree in biotechnology and biochemistry from Shanghai Institutes for Biological Sciences in 2008 for the development and applications of liquid-based separation strategies. He is presently a postdoctoral fellow at the Ottawa Institute of Systems Biology, University of Ottawa, under the guidance of Prof. Daniel Figeys. He is focusing on technology development and applications in proteomics and lipidomics.

Hu Zhou obtained his B.S. degree in biology at Nankai University, China, in 2001. He received his Ph.D. degree in biochemistry and molecular biology from Shanghai Institutes for Biological Sciences in the summer of 2007 for method developments in liquid chromatography and mass spectrometry. He is presently working as a postdoctoral fellow for Professor Daniel Figeys at the Ottawa Institute of Systems Biology, University of Ottawa, and is focused on technology developments in proteomics, lipidomics, and disease-related proteomics.

Fangjun Wang completed his B.S. degree at Zhejiang University, China, in 2005. He is currently a Ph.D. candidate at the Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian, China, under the supervision of Prof. Hanfa Zou. He is presently involved in a collaborative research initiative for developing new methods for quantitative proteomics and their applications in Alzheimer's disease under the guidance of Prof. Daniel Figeys at the Ottawa Institute of Systems Biology, University of Ottawa.

Mohamed Abu-Farha completed his Honors B.S. degree in biochemistry and biotechnology at Carleton University. He also obtained his M.S. from the Biology Department at Carleton University. He finished his Ph.D. under Professor Daniel Figeys's supervision at the University of Ottawa in the Department of Biochemistry, Microbiology and Immunology. He was recognized as an NSERC scholar during his Ph.D. studies. Currently, he is studying chromatin modifying enzymes using proteomics and molecular biology techniques.

Daniel Figeys is a professor in the Department of Biochemistry, the Director of the Ottawa Institute of Systems Biology, and a Tier-1 Canada Research Chair in proteomics and systems biology. Daniel obtained a B.S. and an M.S. in chemistry from the Université de Montréal. He obtained a Ph.D. in chemistry from the University of Alberta and did his postdoctoral studies at the University of Washington. Prior to his current position, Daniel was Senior VP of Systems Biology with MDS-Proteomics. From 1998 to 2000, he was a Research Officer at the NRC-Canada. Daniel's research involves developing proteomics technologies and their applications in systems biology.
ACKNOWLEDGMENT
Z.N. and H.Z. are co-first authors of this Review. F.W. and M.A.-F. contributed equally to this Review. The authors thank Maroun Bou-khalil and Deeptee Seebun for editorial help. D.F. acknowledges a Canada Research Chair in Proteomics and Systems Biology.

REFERENCES
(1) Wisniewski, J. R.; Zougman, A.; Nagaraj, N.; Mann, M. Nat. Methods 2009, 6, 359–362.
(2) Wisniewski, J. R.; Nagaraj, N.; Zougman, A.; Gnad, F.; Mann, M. J. Proteome Res. 2010, 9 (6), 3280–3289.
(3) Duan, X.; Young, R.; Straubinger, R. M.; Page, B.; Cao, J.; Wang, H.; Yu, H.; Canty, J. M.; Qu, J. J. Proteome Res. 2009, 8, 2838–2850.
(4) Kadiyala, C. S.; Tomechko, S. E.; Miyagi, M. PLoS One 2010, 5, e15332.
(5) Yang, H. J.; Hong, J.; Lee, S.; Shin, S.; Kim, J. Rapid Commun. Mass Spectrom. 2010, 24, 901–908.
(6) Bao, H.; Lui, T.; Zhang, L.; Chen, G. Proteomics 2009, 9, 1114–1117.
(7) Hahn, H. W.; Rainer, M.; Ringer, T.; Huck, C. W.; Bonn, G. K. J. Proteome Res. 2009, 8, 4225–4230.
(8) Yamaguchi, H.; Miyazaki, M.; Honda, T.; Briones-Nagata, M. P.; Arima, K.; Maeda, H. Electrophoresis 2009, 30, 3257–3264.
(9) Spross, J.; Sinz, A. Anal. Chem. 2010, 82, 1434–1443.
(10) Liu, T.; Bao, H.; Chen, G. Electrophoresis 2010, 31, 3070–3073.
(11) Ma, J.; Hou, C.; Liang, Y.; Wang, T.; Liang, Z.; Zhang, L.; Zhang, Y. Proteomics 2011, 11 (5), 991–995.
(12) Yuan, H.; Zhang, L.; Hou, C.; Zhu, G.; Tao, D.; Liang, Z.; Zhang, Y. Anal. Chem. 2009, 81, 8708–8714.
(13) Sun, L.; Ma, J.; Qiao, X.; Liang, Y.; Zhu, G.; Shan, Y.; Liang, Z.; Zhang, L.; Zhang, Y. Anal. Chem. 2010, 82, 2574–2579.
(14) Percy, A. J.; Schriemer, D. C. Anal. Chim. Acta 2010, 657, 53–59.
(15) Zhou, H.; Hou, W.; Lambert, J. P.; Tian, R.; Figeys, D. Talanta 2010, 80, 1526–1531.
(16) Lin, W.; Skinner, C. D. J. Sep. Sci. 2009, 32, 2642–2652.
(17) Zhou, H.; Hou, W.; Lambert, J. P.; Figeys, D. Anal. Bioanal. Chem. 2010, 397, 3421–3430.
(18) Ma, J.; Hou, C.; Sun, L.; Tao, D.; Zhang, Y.; Shan, Y.; Liang, Z.; Zhang, L.; Yang, L. Anal. Chem. 2010, 82, 9622–9625.
(19) Zeisbergerova, M.; Adamkova, A.; Glatz, Z. Electrophoresis 2009, 30, 2378–2384.
(20) Liuni, P.; Rob, T.; Wilson, D. J. Rapid Commun. Mass Spectrom. 2010, 24, 315–320.
(21) Pereira-Medrano, A. G.; Forster, S.; Fowler, G. J.; McArthur, S. L.; Wright, P. C. Lab Chip 2010, 10, 3397–3406.
(22) Zhou, H.; Elisma, F.; Denis, N. J.; Wright, T. G.; Tian, R.; Hou, W.; Zou, H.; Figeys, D. J. Proteome Res. 2010, 9, 1279–1288.
(23) Tian, R.; Wang, S.; Elisma, F.; Li, L.; Zhou, H.; Wang, L.; Figeys, D. Mol. Cell. Proteomics 2011, 10, M110 000679.
(24) Gerrits, B.; Bodenmiller, B. Methods Mol. Biol. 2010, 658, 127–136.
(25) Huttlin, E. L.; Jedrychowski, M. P.; Elias, J. E.; Goswami, T.; Rad, R.; Beausoleil, S. A.; Villen, J.; Haas, W.; Sowa, M. E.; Gygi, S. P. Cell 2010, 143, 1174–1189.
(26) Brill, L. M.; Xiong, W.; Lee, K. B.; Ficarro, S. B.; Crain, A.; Xu, Y.; Terskikh, A.; Snyder, E. Y.; Ding, S. Cell Stem Cell 2009, 5, 204–213.
(27) Ye, J.; Zhang, X.; Young, C.; Zhao, X.; Hao, Q.; Cheng, L.; Jensen, O. N. J. Proteome Res. 2010, 9, 3561–3573.
(28) Hu, L.; Zhou, H.; Li, Y.; Sun, S.; Guo, L.; Ye, M.; Tian, X.; Gu, J.; Yang, S.; Zou, H. Anal. Chem. 2009, 81, 94–104.
(29) (a) Han, G.; Ye, M.; Jiang, X.; Chen, R.; Ren, J.; Xue, Y.; Wang, F.; Song, C.; Yao, X.; Zou, H. Anal. Chem. 2009, 81, 5794–5805. (b) Wang, F.; Han, G.; Yu, Z.; Jiang, X.; Sun, S.; Chen, R.; Ye, M.; Zou, H. J. Sep. Sci. 2010, 33, 1879–1887.
(30) Song, C.; Ye, M.; Han, G.; Jiang, X.; Wang, F.; Yu, Z.; Chen, R.; Zou, H. Anal. Chem. 2010, 82, 53–56.
(31) Baker, M. A.; Smith, N. D.; Hetherington, L.; Pelzing, M.; Condina, M. R.; Aitken, R. J. J. Proteome Res. 2011, 10 (3), 1004–1017.
(32) Manes, N. P.; Dong, L.; Zhou, W.; Du, X.; Reghu, N.; Kool, A. C.; Choi, D.; Bailey, C. L.; Petricoin, E. F.; Liotta, L. A.; Popov, S. G. Mol. Cell. Proteomics 2011, 10 (3), M110 000927.
(33) Lee, H. J.; Na, K.; Kwon, M. S.; Kim, H.; Kim, K. S.; Paik, Y. K. Proteomics 2009, 9, 3395–3408.
(34) Li, Q. R.; Ning, Z. B.; Tang, J. S.; Nie, S.; Zeng, R. J. Proteome Res. 2009, 8, 5375–5381.
(35) Leitner, A.; Sturm, M.; Hudecz, O.; Mazanek, M.; Smått, J.-H.; Linden, M.; Lindner, W.; Mechtler, K. Anal. Chem. 2010, 82, 2726–2733.
(36) (a) Lu, Z.; Duan, J.; He, L.; Hu, Y.; Yin, Y. Anal. Chem. 2010, 82, 7249–7258. (b) Eriksson, A.; Bergquist, J.; Edwards, K.; Hagfeldt, A.; Malmström, D.; Hernandez, V. c. A. Anal. Chem. 2010, 82, 4577–4583.
(37) Nelson, C. A.; Szczech, J. R.; Dooley, C. J.; Xu, Q.; Lawrence, M. J.; Zhu, H.; Jin, S.; Ge, Y. Anal. Chem. 2010, 82, 7193–7201.
(38) Mamone, G.; Picariello, G.; Ferranti, P.; Addeo, F. Proteomics 2010, 10, 380–393.
(39) Dai, J.; Wang, L. S.; Wu, Y. B.; Sheng, Q. H.; Wu, J. R.; Shieh, C. H.; Zeng, R. J. Proteome Res. 2009, 8, 133–141.
(40) Nie, S.; Dai, J.; Ning, Z. B.; Cao, X. J.; Sheng, Q. H.; Zeng, R. J. Proteome Res. 2010, 9, 4585–4594.
(41) (a) Dierck, K.; Machida, K.; Mayer, B. J.; Nollau, P. Methods Mol. Biol. 2009, 527, 131–155. (b) Boersema, P. J.; Foong, L. Y.; Ding, V. M.; Lemeer, S.; van Breukelen, B.; Philp, R.; Boekhorst, J.; Snel, B.; den Hertog, J.; Choo, A. B.; Heck, A. J. Mol. Cell. Proteomics 2010, 9, 84–99.
(42) Raijmakers, R.; Kraiczek, K.; de Jong, A. P.; Mohammed, S.; Heck, A. J. Anal. Chem. 2010, 82, 824–832.
(43) Dong, M.; Wu, M.; Wang, F.; Qin, H.; Han, G.; Dong, J.; Wu, R.; Ye, M.; Liu, Z.; Zou, H. Anal. Chem. 2010, 82, 2907–2015.
(44) Torta, F.; Fusi, M.; Casari, C. S.; Bottani, C. E.; Bachi, A. J. Proteome Res. 2009, 8, 1932–1942.
(45) (a) McNulty, D. E.; Annan, R. S. Methods Mol. Biol. 2009, 527, 93–105. (b) Saleem, R. A.; Rogers, R. S.; Ratushny, A. V.; Dilworth, D. J.; Shannon, P. T.; Shteynberg, D.; Wan, Y.; Moritz, R. L.; Nesvizhskii, A. I.; Rachubinski, R. A.; Aitchison, J. D. Mol. Cell. Proteomics 2010, 9, 2076–2088.
(46) Wollscheid, B.; Bausch-Fluck, D.; Henderson, C.; O’Brien, R.; Bibel, M.; Schiess, R.; Aebersold, R.; Watts, J. D. Nat. Biotechnol. 2009, 27, 378–386.
(47) Gundry, R. L.; Raginski, K.; Tarasova, Y.; Tchernyshyov, I.; Bausch-Fluck, D.; Elliott, S. T.; Boheler, K. R.; Van Eyk, J. E.; Wollscheid, B. Mol. Cell. Proteomics 2009, 8, 2555–2569.
(48) Chen, R.; Jiang, X.; Sun, D.; Han, G.; Wang, F.; Ye, M.; Wang, L.; Zou, H. J. Proteome Res. 2009, 8, 651–661.
(49) Lee, A.; Kolarich, D.; Haynes, P. A.; Jensen, P. H.; Baker, M. S.; Packer, N. H. J. Proteome Res. 2009, 8, 770–781.
(50) Alley, W. R., Jr.; Mechref, Y.; Novotny, M. V. Rapid Commun. Mass Spectrom. 2009, 23, 495–505.
(51) Tian, R.; Ren, L.; Ma, H.; Li, X.; Hu, L.; Ye, M.; Wu, R.; Tian, Z.; Liu, Z.; Zou, H. J. Chromatogr., A 2009, 1216, 1270–1278.
(52) Qi, Y.; Wei, J.; Wang, H.; Zhang, Y.; Xu, J.; Qian, X.; Guan, Y. Talanta 2009, 80, 703–709.
(53) Xu, Y.; Wu, Z.; Zhang, L.; Lu, H.; Yang, P.; Webley, P. A.; Zhao, D. Anal. Chem. 2009, 81, 503–508.
(54) Min, Q.; Wu, R.; Zhao, L.; Qin, H.; Ye, M.; Zhu, J. J.; Zou, H. Chem. Commun. (Cambridge) 2010, 46, 6144–6146.
(55) Terracciano, R.; Casadonte, F.; Pasqua, L.; Candeloro, P.; Di Fabrizio, E.; Urbani, A.; Savino, R. Talanta 2010, 80, 1532–1538.
(56) Ni, Y. G.; Condra, J. H.; Orsatti, L.; Shen, X.; Di Marco, S.; Pandit, S.; Bottomley, M. J.; Ruggeri, L.; Cummings, R. T.; Cubbon, R. M.; Santoro, J. C.; Ehrhardt, A.; Lewis, D.; Fisher, T. S.; Ha, S.; Njimoluh, L.; Wood, D. D.; Hammond, H. A.; Wisniewski, D.; Volpari, C.; Noto, A.; Lo Surdo, P.; Hubbard, B.; Carfi, A.; Sitlani, A. J. Biol. Chem. 2010, 285, 12882–12891.
(57) Bouamrani, A.; Hu, Y.; Tasciotti, E.; Li, L.; Chiappini, C.; Liu, X.; Ferrari, M. Proteomics 2010, 10, 496–505.
(58) Liu, S.; Chen, H.; Lu, X.; Deng, C.; Zhang, X.; Yang, P. Angew. Chem., Int. Ed. Engl. 2010, 49, 7557–7561.
(59) Schmidt, A.; Gehlenborg, N.; Bodenmiller, B.; Mueller, L. N.; Campbell, D.; Mueller, M.; Aebersold, R.; Domon, B. Mol. Cell. Proteomics 2008, 7, 2138–2150.
(60) Schmidt, A.; Claassen, M.; Aebersold, R. Curr. Opin. Chem. Biol. 2009, 13, 510–517.
(61) Purvine, S.; Eppel, J. T.; Yi, E. C.; Goodlett, D. R. Proteomics 2003, 3, 847–850.
(62) Chakraborty, A. B.; Berger, S. J.; Gebler, J. C. Rapid Commun. Mass Spectrom. 2007, 21, 730–744.
(63) Williams, J. D.; Flanagan, M.; Lopez, L.; Fischer, S.; Miller, L. A. J. Chromatogr., A 2003, 1020, 11–26.
(64) Geiger, T.; Cox, J.; Mann, M. Mol. Cell. Proteomics 2010, 9, 2252–2261.
(65) Ramos, A. A.; Yang, H.; Rosen, L. E.; Yao, X. Anal. Chem. 2006, 78, 6391–6397.
(66) Silva, J. C.; Gorenstein, M. V.; Li, G. Z.; Vissers, J. P.; Geromanos, S. J. Mol. Cell. Proteomics 2006, 5, 144–156.
(67) Geromanos, S. J.; Vissers, J. P.; Silva, J. C.; Dorschel, C. A.; Li, G. Z.; Gorenstein, M. V.; Bateman, R. H.; Langridge, J. I. Proteomics 2009, 9, 1683–1695.
(68) Li, G. Z.; Vissers, J. P.; Silva, J. C.; Golick, D.; Gorenstein, M. V.; Geromanos, S. J. Proteomics 2009, 9, 1696–1719.
(69) Olsen, J. V.; Macek, B.; Lange, O.; Makarov, A.; Horning, S.; Mann, M. Nat. Methods 2007, 4, 709–712.
(70) Olsen, J. V.; Schwartz, J. C.; Griep-Raming, J.; Nielsen, M. L.; Damoc, E.; Denisov, E.; Lange, O.; Remes, P.; Taylor, D.; Splendore, M.; Wouters, E. R.; Senko, M.; Makarov, A.; Mann, M.; Horning, S. Mol. Cell. Proteomics 2009, 8, 2759–2769.
(71) Nagaraj, N.; D’Souza, R. C.; Cox, J.; Olsen, J. V.; Mann, M. J. Proteome Res. 2010, 9 (12), 6786–6794.
(72) Danielsen, J. M.; Sylvestersen, K. B.; Bekker-Jensen, S.; Szklarczyk, D.; Poulsen, J. W.; Horn, H.; Jensen, L. J.; Mailand, N.; Nielsen, M. L. Mol. Cell. Proteomics 2010, 10 (3), M110 003590.
(73) Segu, Z. M.; Mechref, Y. Rapid Commun. Mass Spectrom. 2010, 24, 1217–1225.
(74) Przybylski, C.; Junger, M. A.; Aubertin, J.; Radvanyi, F.; Aebersold, R.; Pflieger, D. J. Proteome Res. 2010, 9, 5118–5132.
(75) Zubarev, R. A.; Kelleher, N. L.; McLafferty, F. W. J. Am. Chem. Soc. 1998, 120, 3265–3266.
(76) Pitteri, S. J.; Chrisman, P. A.; Hogan, J. M.; McLuckey, S. A. Anal. Chem. 2005, 77, 1831–1839.
(77) Hennrich, M. L.; Boersema, P. J.; van den Toorn, H.; Mischerikow, N.; Heck, A. J.; Mohammed, S. Anal. Chem. 2009, 81, 7814–7822.
(78) Vasicek, L.; Brodbelt, J. S. Anal. Chem. 2009, 81, 7876–7884.
(79) Campbell, J. L.; Hager, J. W.; Le Blanc, J. C. J. Am. Soc. Mass Spectrom. 2009, 20, 1672–1683.
(80) Ledvina, A. R.; Beauchene, N. A.; McAlister, G. C.; Syka, J. E.; Schwartz, J. C.; Griep-Raming, J.; Westphall, M. S.; Coon, J. J. Anal. Chem. 2010, 82, 10068–10074.
(81) Boersema, P. J.; Raijmakers, R.; Lemeer, S.; Mohammed, S.; Heck, A. J. Nat. Protoc. 2009, 4, 484–494.
(82) (a) Ji, C.; Sadagopan, N.; Zhang, Y.; Lepsy, C. Anal. Chem. 2009, 81, 9321–9328. (b) Wu, C. J.; Hsu, J. L.; Huang, S. Y.; Chen, S. H. J. Am. Soc. Mass Spectrom. 2010, 21, 460–471.
(83) Raijmakers, R.; Heck, A. J.; Mohammed, S. Mol. BioSyst. 2009, 5, 992–1003.
(84) Hennrich, M. L.; Mohammed, S.; Altelaar, A. F.; Heck, A. J. J. Am. Soc. Mass Spectrom. 2010, 21, 1957–1965.
(85) (a) Doucet, A.; Overall, C. M. Mol. Cell. Proteomics 2010, DOI:10.1074/mcp.M110.003533. (b) Schilling, O.; Huesgen, P. F.; Barre, O.; auf dem Keller, U.; Overall, C. M. Nat. Protoc. 2011, 6, 111–120.
(86) Kleifeld, O.; Doucet, A.; auf dem Keller, U.; Prudova, A.; Schilling, O.; Kainthan, R. K.; Starr, A. E.; Foster, L. J.; Kizhakkedathu, J. N.; Overall, C. M. Nat. Biotechnol. 2010, 28, 281–288.
(87) Phanstiel, D.; Unwin, R.; McAlister, G. C.; Coon, J. J. Anal. Chem. 2009, 81, 1693–1698.
(88) Yang, F.; Wu, S.; Stenoien, D. L.; Zhao, R.; Monroe, M. E.; Gritsenko, M. A.; Purvine, S. O.; Polpitiya, A. D.; Tolic, N.; Zhang, Q.; Norbeck, A. D.; Orton, D. J.; Moore, R. J.; Tang, K.; Anderson, G. A.; Pasa-Tolic, L.; Camp, D. G., 2nd; Smith, R. D. Anal. Chem. 2009, 81, 4137–4143.
(89) Kocher, T.; Pichler, P.; Schutzbier, M.; Stingl, C.; Kaul, A.; Teucher, N.; Hasenfuss, G.; Penninger, J. M.; Mechtler, K. J. Proteome Res. 2009, 8, 4743–4752.
(90) Dayon, L.; Turck, N.; Kienle, S.; Schulz-Knappe, P.; Hochstrasser, D. F.; Scherl, A.; Sanchez, J. C. Anal. Chem. 2010, 82, 848–858.
(91) Boja, E. S.; Phillips, D.; French, S. A.; Harris, R. A.; Balaban, R. S. J. Proteome Res. 2009, 8, 4665–4675.
(92) Thingholm, T. E.; Palmisano, G.; Kjeldsen, F.; Larsen, M. R. J. Proteome Res. 2010, 9, 4045–4052.
(93) Petritis, B. O.; Qian, W. J.; Camp, D. G., 2nd; Smith, R. D. J. Proteome Res. 2009, 8, 2157–2163.
(94) Mori, M.; Abe, K.; Yamaguchi, H.; Goto, J.; Shimada, M.; Mano, N. J. Proteome Res. 2010, 9, 3741–3749.
(95) White, C. A.; Oey, N.; Emili, A. J. Proteome Res. 2009, 8, 3653–3665.
(96) Ye, X.; Luke, B. T.; Johann, D. J., Jr.; Ono, A.; Prieto, D. A.; Chan, K. C.; Issaq, H. J.; Veenstra, T. D.; Blonder, J. Anal. Chem. 2010, 82, 5878–5886.
(97) Qian, W. J.; Liu, T.; Petyuk, V. A.; Gritsenko, M. A.; Petritis, B. O.; Polpitiya, A. D.; Kaushal, A.; Xiao, W.; Finnerty, C. C.; Jeschke, M. G.; Jaitly, N.; Monroe, M. E.; Moore, R. J.; Moldawer, L. L.; Davis, R. W.; Tompkins, R. G.; Herndon, D. N.; Camp, D. G.; Smith, R. D. J. Proteome Res. 2009, 8, 290–299.
(98) Roe, M. R.; McGowan, T. F.; Thompson, L. V.; Griffin, T. J. J. Am. Soc. Mass Spectrom. 2010, 21, 1190–1203.
(99) Zhang, J.; Wang, Y.; Li, S. Anal. Chem. 2010, 82, 7588–7595.
(100) DeSouza, L. V.; Taylor, A. M.; Li, W.; Minkoff, M. S.; Romaschin, A. D.; Colgan, T. J.; Siu, K. W. J. Proteome Res. 2008, 7, 3525–3534.
(101) Kang, U. B.; Yeom, J.; Kim, H.; Lee, C. J. Proteome Res. 2010, 9, 3750–3758.
(102) Wang, H.; Wong, C. H.; Chin, A.; Kennedy, J.; Zhang, Q.; Hanash, S. J. Proteome Res. 2009, 8, 5412–5422.
(103) Ong, S.-E.; Blagoev, B.; Kratchmarova, I.; Kristensen, D. B.; Steen, H.; Pandey, A.; Mann, M. Mol. Cell. Proteomics 2002, 1 (5), 376–386.
(104) de Godoy, L. M.; Olsen, J. V.; Cox, J.; Nielsen, M. L.; Hubner, N. C.; Frohlich, F.; Walther, T. C.; Mann, M. Nature 2008, 455, 1251–1254.
(105) Soufi, B.; Kumar, C.; Gnad, F.; Mann, M.; Mijakovic, I.; Macek, B. J. Proteome Res. 2010, 9, 3638–3646.
(106) Geiger, T.; Cox, J.; Ostasiewicz, P.; Wisniewski, J. R.; Mann, M. Nat. Methods 2010, 7, 383–385.
(107) Park, S. K.; Liao, L.; Kim, J. Y.; Yates, J. R., 3rd Nat. Methods 2009, 6, 184–185.
(108) Bicho, C. C.; de Lima Alves, F.; Chen, Z. A.; Rappsilber, J.; Sawin, K. E. Mol. Cell. Proteomics 2010, 9, 1567–1577.
(109) Wei, R.; Li, G.; Seymour, A. B. Anal. Chem. 2010, 82, 5527–5533.
(110) Turtoi, A.; Mazzucchelli, G. D.; De Pauw, E. Talanta 2010, 80, 1487–1495.
(111) Walsh, G. M.; Lin, S.; Evans, D. M.; Khosrovi-Eghbal, A.; Beavis, R. C.; Kast, J. J. Proteomics 2009, 72, 838–852.
(112) Sherwood, C. A.; Eastham, A.; Lee, L. W.; Risler, J.; Vitek, O.; Martin, D. B. J. Proteome Res. 2009, 8, 4243–4251.
(113) Sherwood, C. A.; Eastham, A.; Lee, L. W.; Peterson, A.; Eng, J. K.; Shteynberg, D.; Mendoza, L.; Deutsch, E. W.; Risler, J.; Tasman, N.; Aebersold, R.; Lam, H.; Martin, D. B. J. Proteome Res. 2009, 8, 4396–4405.
(114) Prakash, A.; Tomazela, D. M.; Frewen, B.; Maclean, B.; Merrihew, G.; Peterman, S.; Maccoss, M. J. J. Proteome Res. 2009, 8, 2733–2739.
(115) Fusaro, V. A.; Mani, D. R.; Mesirov, J. P.; Carr, S. A. Nat. Biotechnol. 2009, 27, 190–198.
(116) Bertsch, A.; Jung, S.; Zerck, A.; Pfeifer, N.; Nahnsen, S.; Henneges, C.; Nordheim, A.; Kohlbacher, O. J. Proteome Res. 2010, 9, 2696–2704.
(117) Cham Mead, J. A.; Bianco, L.; Bessant, C. Proteomics 2010, 10, 1106–1126.
(118) Picotti, P.; Rinner, O.; Stallmach, R.; Dautel, F.; Farrah, T.; Domon, B.; Wenschuh, H.; Aebersold, R. Nat. Methods 2010, 7, 43–46.
(119) Baek, J. H.; Kim, H.; Shin, B.; Yu, M. H. J. Proteome Res. 2009, 8, 3625–3632.
(120) Lam, H.; Aebersold, R. Methods Mol. Biol. 2010, 604, 95–103.
(121) Yen, C. Y.; Meyer-Arendt, K.; Eichelberger, B.; Sun, S.; Houel, S.; Old, W. M.; Knight, R.; Ahn, N. G.; Hunter, L. E.; Resing, K. A. Mol. Cell. Proteomics 2009, 8, 857–869.
(122) Li, Y.; Chi, H.; Wang, L. H.; Wang, H. P.; Fu, Y.; Yuan, Z. F.; Li, S. J.; Liu, Y. S.; Sun, R. X.; Zeng, R.; He, S. M. Rapid Commun. Mass Spectrom. 2010, 24, 807–814.
(123) Zhou, C.; Chi, H.; Wang, L. H.; Li, Y.; Wu, Y. J.; Fu, Y.; Sun, R. X.; He, S. M. BMC Bioinf. 2010, 11, 577.
(124) Salmi, J.; Nyman, T. A.; Nevalainen, O. S.; Aittokallio, T. Proteomics 2009, 9, 848–860.
(125) Savitski, M. M.; Mathieson, T.; Becher, I.; Bantscheff, M. J. Proteome Res. 2010, 9, 5511–5516.
(126) Volchenboum, S. L.; Kristjansdottir, K.; Wolfgeher, D.; Kron, S. J. Mol. Cell. Proteomics 2009, 8, 2011–2022.
(127) Gehlenborg, N.; Yan, W.; Lee, I. Y.; Yoo, H.; Nieselt, K.; Hwang, D.; Aebersold, R.; Hood, L. Bioinformatics 2009, 25, 682–683.
(128) Slotta, D. J.; McFarland, M. A.; Markey, S. P. Proteomics 2010, 10, 3035–3039.
(129) Liu, G.; Zhang, J.; Larsen, B.; Stark, C.; Breitkreutz, A.; Lin, Z. Y.; Breitkreutz, B. J.; Ding, Y.; Colwill, K.; Pasculescu, A.; Pawson, T.; Wrana, J. L.; Nesvizhskii, A. I.; Raught, B.; Tyers, M.; Gingras, A. C. Nat. Biotechnol. 2010, 28, 1015–1017.
(130) Xu, H.; Hsu, P. H.; Zhang, L.; Tsai, M. D.; Freitas, M. A. J. Proteome Res. 2010, 9, 3384–3393.
(131) Panchaud, A.; Singh, P.; Shaffer, S. A.; Goodlett, D. R. J. Proteome Res. 2010, 9, 2508–2515.
(132) McIlwain, S.; Draghicescu, P.; Singh, P.; Goodlett, D. R.; Noble, W. S. J. Proteome Res. 2010, 9, 2488–2495.
(133) Zhang, Z. Anal. Chem. 2010, 82, 1990–2005.
(134) Kandasamy, K.; Pandey, A.; Molina, H. Anal. Chem. 2009, 81, 7170–7180.
(135) Good, D. M.; Wenger, C. D.; Coon, J. J. Proteomics 2010, 10, 164–167.
(136) Sadygov, R. G.; Good, D. M.; Swaney, D. L.; Coon, J. J. J. Proteome Res. 2009, 8, 3198–3205.
(137) Datta, R.; Bern, M. J. Comput. Biol. 2009, 16, 1169–1182.
(138) Sharma, V.; Eng, J. K.; Feldman, S.; von Haller, P. D.; MacCoss, M. J.; Noble, W. S. J. Proteome Res. 2010, 9, 5438–5444.
(139) Kim, S.; Mischerikow, N.; Bandeira, N.; Navarro, J. D.; Wich, L.; Mohammed, S.; Heck, A. J.; Pevzner, P. A. Mol. Cell. Proteomics 2010, 9, 2840–2852.
(140) Liu, X.; Shan, B.; Xin, L.; Ma, B. BMC Bioinf. 2010, 11 (Suppl 1), S4.
(141) Chalkley, R. J.; Medzihradszky, K. F.; Lynn, A. J.; Baker, P. R.; Burlingame, A. L. Anal. Chem. 2010, 82, 579–584.
(142) Bailey, C. M.; Sweet, S. M.; Cunningham, D. L.; Zeller, M.; Heath, J. K.; Cooper, H. J. J. Proteome Res. 2009, 8, 1965–1971.
(143) Houel, S.; Abernathy, R.; Renganathan, K.; Meyer-Arendt, K.; Ahn, N. G.; Old, W. M. J. Proteome Res. 2010, 9, 4152–4160.
(144) Bern, M.; Finney, G.; Hoopmann, M. R.; Merrihew, G.; Toth, M. J.; MacCoss, M. J. Anal. Chem. 2010, 82, 833–841.
(145) Singh, S.; Springer, M.; Steen, J.; Kirschner, M. W.; Steen, H. J. Proteome Res. 2009, 8, 2201–2210.
(146) Brunner, A.; Keidel, E. M.; Dosch, D.; Kellermann, J.; Lottspeich, F. Proteomics 2010, 10, 315–326.
(147) Tsou, C. C.; Tsui, Y. H.; Yian, Y. H.; Chen, Y. J.; Yang, H. Y.; Yu, C. Y.; Lynn, K. S.; Sung, T. Y.; Hsu, W. L. Nucleic Acids Res. 2009, 37, W661–W669.
(148) Cox, J.; Matic, I.; Hilger, M.; Nagaraj, N.; Selbach, M.; Olsen, J. V.; Mann, M. Nat. Protoc. 2009, 4, 698–705.
(149) Griffin, N. M.; Yu, J.; Long, F.; Oh, P.; Shore, S.; Li, Y.; Koziol, J. A.; Schnitzer, J. E. Nat. Biotechnol. 2010, 28, 83–89.
(150) Sardiu, M. E.; Washburn, M. P. Nat. Biotechnol. 2010, 28, 40–42.
(151) Gnad, F.; de Godoy, L. M.; Cox, J.; Neuhauser, N.; Ren, S.; Olsen, J. V.; Mann, M. Proteomics 2009, 9 (20), 4642–4652.
(152) Eisenacher, M.; Martens, L.; Hardt, T.; Kohl, M.; Barsnes, H.; Helsens, K.; Hakkinen, J.; Levander, F.; Aebersold, R.; Vandekerckhove, J.; Dunn, M. J.; Lisacek, F.; Siepen, J. A.; Hubbard, S. J.; Binz, P. A.; Bluggel, M.; Thiele, H.; Cottrell, J.; Meyer, H. E.; Apweiler, R.; Stephan, C. Proteomics 2009, 9, 3928–3933.
(153) Barsnes, H.; Vizcaino, J. A.; Eidhammer, I.; Martens, L. Nat. Biotechnol. 2009, 27, 598–599.
(154) Bell, A. W.; Deutsch, E. W.; Au, C. E.; Kearney, R. E.; Beavis, R.; Sechi, S.; Nilsson, T.; Bergeron, J. J. Nat. Methods 2009, 6, 423–430.
(155) Savitski, M. M.; Fischer, F.; Mathieson, T.; Sweetman, G.; Lang, M.; Bantscheff, M. J. Am. Soc. Mass Spectrom. 2010, 21, 1668–1679.
(156) Yan, W.; Luo, J.; Robinson, M.; Eng, J.; Aebersold, R. H.; Ranish, J. Mol. Cell. Proteomics 2011, 10 (3), M110 005611.
(157) Chen, M.; Yang, B.; Ying, W.; He, F.; Qian, X. Protein Pept. Lett. 2010, 17, 277–286.
(158) Durbin, R. M.; Abecasis, G. R.; Altshuler, D. L.; Auton, A.; Brooks, L. D.; Gibbs, R. A.; Hurles, M. E.; McVean, G. A. Nature 2010, 467, 1061–1073.
(159) (a) Wang, J.; Perez-Santiago, J.; Katz, J. E.; Mallick, P.; Bandeira, N. Mol. Cell. Proteomics 2010, 9 (7), 1476–1485. (b) Cox, J.; Neuhauser, N.; Michalski, A.; Scheltema, R. A.; Olsen, J. V.; Mann, M. J. Proteome Res. 2011, 10, 1794–1805.
(160) Dasari, S.; Chambers, M. C.; Slebos, R. J.; Zimmerman, L. J.; Ham, A. J.; Tabb, D. L. J. Proteome Res. 2010, 9, 1716–1726.
(161) Na, S.; Paek, E. J. Proteome Res. 2009, 8, 4418–4427.
(162) Baliban, R. C.; DiMaggio, P. A.; Plazas-Mayorca, M. D.; Young, N. L.; Garcia, B. A.; Floudas, C. A. Mol. Cell. Proteomics 2010, 9, 764–779.
(163) Cao, J.; Shen, C.; Wang, H.; Shen, H.; Chen, Y.; Nie, A.; Yan, G.; Lu, H.; Liu, Y.; Yang, P. J. Proteome Res. 2009, 8, 662–672.
(164) Zielinska, D. F.; Gnad, F.; Wisniewski, J. R.; Mann, M. Cell 2010, 141, 897–907.
(165) Gu, B.; Zhang, J.; Wang, W.; Mo, L.; Zhou, Y.; Chen, L.; Liu, Y.; Zhang, M. PLoS One 2010, 5, e15795.
(166) Li, Q. R.; Xing, X. B.; Chen, T. T.; Li, R. X.; Dai, J.; Sheng, Q. H.; Xin, S. M.; Zhu, L. L.; Jin, Y.; Pei, G.; Kang, J. H.; Li, Y. X.; Zeng, R. Mol. Cell. Proteomics 2011, 10 (4), M110.001750.
(167) Tian, R.; Wang, S.; Elisma, F.; Li, L.; Zhou, H.; Wang, L.; Figeys, D. Mol. Cell. Proteomics 2011, 10 (2), M110 000679.
(168) Riaz, S.; Alam, S. S.; Akhtar, M. W. J. Pharm. Biomed. Anal. 2010, 51, 1103–1107.
(169) Haubitz, M.; Good, D. M.; Woywodt, A.; Haller, H.; Rupprecht, H.; Theodorescu, D.; Dakna, M.; Coon, J. J.; Mischak, H. Mol. Cell. Proteomics 2009, 8, 2296–2307.
(170) Blennow, K.; Hampel, H.; Weiner, M.; Zetterberg, H. Nat. Rev. Neurol. 2010, 6, 131–144.
(171) Wei, W. B.; Martin, A.; Johnson, P. J.; Ward, D. G. Curr. Proteomics 2010, 7, 15–25.
(172) Carboni, L.; Becchi, S.; Piubelli, C.; Mallei, A.; Giambelli, R.; Razzoli, M.; Mathe, A. A.; Popoli, M.; Domenici, E. Prog. Neuropsychopharmacol. Biol. Psychiatry 2010, 34, 1037–1048.
(173) Cai, X. W.; Shedden, K.; Ao, X.; Davis, M.; Fu, X. L.; Lawrence, T. S.; Lubman, D. M.; Kong, F. M. Int. J. Radiat. Oncol. Biol. Phys. 2010, 77, 867–876.
(174) Pendyala, G.; Fox, H. S. Genome Med. 2010, 2, 22.
(175) Zhang, J.; Zhang, Y.; Li, N.; Liu, Z.; Xiong, C.; Ni, X.; Pu, Y.; Hui, R.; He, J.; Pu, J. Respir. Med. 2009, 103, 1801–1806.
(176) Hu, W. T.; Chen-Plotkin, A.; Arnold, S. E.; Grossman, M.; Clark, C. M.; Shaw, L. M.; Pickering, E.; Kuhn, M.; Chen, Y.; McCluskey, L.; Elman, L.; Karlawish, J.; Hurtig, H. I.; Siderowf, A.; Lee, V. M.; Soares, H.; Trojanowski, J. Q. Acta Neuropathol. 2010, 119, 669–678.
(177) (a) Whiteaker, J. R.; Zhao, L.; Anderson, L.; Paulovich, A. G. Mol. Cell. Proteomics 2010, 9, 184–196. (b) Keshishian, H.; Addona, T.; Burgess, M.; Mani, D. R.; Shi, X.; Kuhn, E.; Sabatine, M. S.; Gerszten, R. E.; Carr, S. A. Mol. Cell. Proteomics 2009, 8, 2339–2349. (c) Rodland, K. D. Dis. Markers 2010, 28, 195–197.
(178) Li, L. P.; Lu, C. H.; Chen, Z. P.; Ge, F.; Wang, T.; Wang, W.; Xiao, C. L.; Yin, X. F.; Liu, L.; He, J. X.; He, Q. Y. Proteomics 2011, 11, 429–439.
(179) Schiess, R.; Wollscheid, B.; Aebersold, R. Mol. Oncol. 2009, 3, 33–44.
(180) Young, N. L.; DiMaggio, P. A.; Plazas-Mayorca, M. D.; Baliban, R. C.; Floudas, C. A.; Garcia, B. A. Mol. Cell. Proteomics 2009, 8, 2266–2284.
(181) Jung, H. R.; Pasini, D.; Helin, K.; Jensen, O. N. Mol. Cell. Proteomics 2010, 9, 838–850.
(182) Geiger, T.; Cox, J.; Mann, M. PLoS Genet. 2010, 6 (9), e1001090.
(183) Lambert, J. P.; Mitchell, L.; Rudner, A.; Baetz, K.; Figeys, D. Mol. Cell. Proteomics 2009, 8, 870–882.
(184) Mittler, G.; Butter, F.; Mann, M. Genome Res. 2009, 19, 284–293.
(185) Butter, F.; Scheibe, M.; Morl, M.; Mann, M. Proc. Natl. Acad. Sci. U.S.A. 2009, 106, 10626–10631.
(186) Rix, U.; Superti-Furga, G. Nat. Chem. Biol. 2009, 5, 616–624.
(187) Wang, Z.; Udeshi, N. D.; O’Malley, M.; Shabanowitz, J.; Hunt, D. F.; Hart, G. W. Mol. Cell. Proteomics 2010, 9, 153–160.
(188) Martin, B. R.; Cravatt, B. F. Nat. Methods 2009, 6, 135–138.
(189) Nomura, D. K.; Long, J. Z.; Niessen, S.; Hoover, H. S.; Ng, S. W.; Cravatt, B. F. Cell 2010, 140, 49–61.
(190) Weerapana, E.; Wang, C.; Simon, G. M.; Richter, F.; Khare, S.; Dillon, M. B.; Bachovchin, D. A.; Mowen, K.; Baker, D.; Cravatt, B. F. Nature 2010, 468, 790–795.
(191) Nguyen, U. T.; Guo, Z.; Delon, C.; Wu, Y.; Deraeve, C.; Franzel, B.; Bon, R. S.; Blankenfeldt, W.; Goody, R. S.; Waldmann, H.; Wolters, D.; Alexandrov, K. Nat. Chem. Biol. 2009, 5, 227–235.
(192) Charron, G.; Zhang, M. M.; Yount, J. S.; Wilson, J.; Raghavan, A. S.; Shamir, E.; Hang, H. C. J. Am. Chem. Soc. 2009, 131, 4967–4975.
(193) Fleischer, T. C.; Murphy, B. R.; Flick, J. S.; Terry-Lorenzo, R. T.; Gao, Z. H.; Davis, T.; McKinnon, R.; Ostanin, K.; Willardsen, J. A.; Boniface, J. J. Chem. Biol. 2010, 17, 659–664.
(194) (a) Wierzba, K.; Muroi, M.; Osada, H. Curr. Opin. Chem. Biol. 2011, 15, 57–65. (b) Miyazaki, I.; Simizu, S.; Okumura, H.; Takagi, S.; Osada, H. Nat. Chem. Biol. 2010, 6, 667–673.
(195) Huber, C.; Huber, L. Proteomics 2010, 10, 3564–3565.
(196) Armirotti, A.; Damonte, G. Proteomics 2010, 10, 3566–3576.
(197) Wynne, C.; Fenselau, C.; Demirev, P. A.; Edwards, N. Anal. Chem. 2009, 81, 9633–9642.
(198) Pan, J.; Han, J.; Borchers, C. H.; Konermann, L. Anal. Chem. 2010, 82, 8591–8597.
(199) (a) Ryan, C. M.; Souda, P.; Bassilian, S.; Ujwal, R.; Zhang, J.; Abramson, J.; Ping, P.; Durazo, A.; Bowie, J. U.; Hasan, S. S.; Baniulis, D.; Cramer, W. A.; Faull, K. F.; Whitelegge, J. P. Mol. Cell. Proteomics 2010, 9, 791–803. (b) Ge, Y.; Rybakova, I. N.; Xu, Q.; Moss, R. L. Proc. Natl. Acad. Sci. U.S.A. 2009, 106, 12658–12663. (c) Han, J.; Borchers, C. H. Proteomics 2010, 10, 3621–3630.
(200) Guda, C.; King, B. R.; Pal, L. R.; Guda, P. PLoS One 2009, 4, e5096.
(201) (a) Mazur, M. T.; Cardasis, H. L.; Spellman, D. S.; Liaw, A.; Yates, N. A.; Hendrickson, R. C. Proc. Natl. Acad. Sci. U.S.A. 2010, 107, 7728–7733. (b) Collier, T. S.; Sarkar, P.; Rao, B.; Muddiman, D. C. J. Am. Soc. Mass Spectrom. 2010, 21, 879–889.
(202) Mohr, J.; Swart, R.; Samonig, M.; Bohm, G.; Huber, C. G. Proteomics 2010, 10, 3598–3609.
(203) Madsen, J. A.; Gardner, M. W.; Smith, S. I.; Ledvina, A. R.; Coon, J. J.; Schwartz, J. C.; Stafford, G. C., Jr.; Brodbelt, J. S. Anal. Chem. 2009, 81, 8677–8686.
(204) Coon, J. J. Anal. Chem. 2009, 81, 3208–3215.
(205) Liu, X.; Inbar, Y.; Dorrestein, P. C.; Wynne, C.; Edwards, N.; Souda, P.; Whitelegge, J. P.; Bafna, V.; Pevzner, P. A. Mol. Cell. Proteomics 2010, 9, 2772–2782.
(206) Pflieger, D.; Przybylski, C.; Gonnet, F.; Le Caer, J. P.; Lunardi, T.; Arlaud, G. J.; Daniel, R. Mol. Cell. Proteomics 2010, 9, 593–610.
(207) (a) Lambert, J. P.; Ethier, M.; Smith, J. C.; Figeys, D. Anal. Chem. 2005, 77, 3771–3787. (b) Smith, J. C.; Lambert, J. P.; Elisma, F.; Figeys, D. Anal. Chem. 2007, 79, 4325–4343. (c) Abu-Farha, M.; Elisma, F.; Zhou, H.; Tian, R.; Asmer, M. S.; Figeys, D. Anal. Chem. 2009, 81, 4585–4599.
(208) Gallego, O.; Betts, M. J.; Gvozdenovic-Jeremic, J.; Maeda, K.; Matetzki, C.; Aguilar-Gurrieri, C.; Beltran-Alvarez, P.; Bonn, S.; Fernandez-Tornero, C.; Jensen, L. J.; Kuhn, M.; Trott, J.; Rybin, V.; Muller, C. W.; Bork, P.; Kaksonen, M.; Russell, R. B.; Gavin, A. C. Mol. Syst. Biol. 2010, 6, 430.
(209) Li, X.; Gianoulis, T. A.; Yip, K. Y.; Gerstein, M.; Snyder, M. Cell 2010, 143, 639–650.
(210) Sardiu, M. E.; Cai, Y.; Jin, J.; Swanson, S. K.; Conaway, R. C.; Conaway, J. W.; Florens, L.; Washburn, M. P. Proc. Natl. Acad. Sci. U.S.A. 2008, 105, 1454–1459.
(211) Dreze, M.; Monachello, D.; Lurin, C.; Cusick, M. E.; Hill, D. E.; Vidal, M.; Braun, P. Methods Enzymol. 2010, 470, 281–315.
(212) (a) Ochs, M. F. Briefings Bioinf. 2010, 11, 30–39. (b) Koyuturk, M. Wiley Interdiscip. Rev. Syst. Biol. Med. 2010, 2, 277–292.
(213) Lambert, J. P.; Fillingham, J.; Siahbazi, M.; Greenblatt, J.; Baetz, K.; Figeys, D. Mol. Syst. Biol. 2010, 6, 448.
(214) Breitkreutz, A.; Choi, H.; Sharom, J. R.; Boucher, L.; Neduva, V.; Larsen, B.; Lin, Z. Y.; Breitkreutz, B. J.; Stark, C.; Liu, G.; Ahn, J.; DewarDarch, D.; Reguly, T.; Tang, X.; Almeida, R.; Qin, Z. S.; Pawson, T.; Gingras, A. C.; Nesvizhskii, A. I.; Tyers, M. Science 2010, 328, 1043–1046.
(215) Mak, A. B.; Ni, Z.; Hewel, J. A.; Chen, G. I.; Zhong, G.; Karamboulas, K.; Blakely, K.; Smiley, S.; Marcon, E.; Roudeva, D.; Li, J.; Olsen, J. B.; Wan, C.; Punna, T.; Isserlin, R.; Chetyrkin, S.; Gingras, A. C.; Emili, A.; Greenblatt, J.; Moffat, J. Mol. Cell. Proteomics 2010, 9, 811–823.
(216) Takahashi, K.; Tanabe, K.; Ohnuki, M.; Narita, M.; Ichisaka, T.; Tomoda, K.; Yamanaka, S. Cell 2007, 131, 861–872.
(217) Baker, C. L.; Kettenbach, A. N.; Loros, J. J.; Gerber, S. A.; Dunlap, J. C. Mol. Cell 2009, 34, 354–363.
(218) Xu, P.; Duong, D. M.; Seyfried, N. T.; Cheng, D.; Xie, Y.; Robert, J.; Rush, J.; Hochstrasser, M.; Finley, D.; Peng, J. Cell 2009, 137, 133–145.
(219) Du, Y.; Zhou, J.; Fan, J.; Shen, Z.; Chen, X. J. Proteome Res. 2009, 8, 2211–2217.
(220) Sardiu, M. E.; Gilmore, J. M.; Carrozza, M. J.; Li, B.; Workman, J. L.; Florens, L.; Washburn, M. P. PLoS One 2009, 4, e7310.
(221) Mallick, P.; Kuster, B. Nat. Biotechnol. 2010, 28, 695–709.