Advancements in Top-Down Proteomics - Analytical Chemistry (ACS

Zhibin Ning obtained his B.S. degree in life science at Shandong Normal University, China in 2003. ... Mohamed Abu-Farha completed his Honors B.S. deg...
REVIEW pubs.acs.org/ac

Advancements in Top-Down Proteomics

Hu Zhou,†,‡ Zhibin Ning,† Amanda E. Starr,† Mohamed Abu-Farha,§ and Daniel Figeys*,†

† Ottawa Institute of Systems Biology, Department of Biochemistry, Microbiology and Immunology, University of Ottawa, Ottawa, Ontario, K1H8M5
‡ Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China 201203
§ Biochemistry and Molecular Biology Unit, Dasman Diabetes Institute, Dasman 15462, Kuwait

■ CONTENTS
Top-Down Proteomics: Technology Requirement
Top-Down Proteomics for Mapping Isoforms and PTM
Isoforms
Detailed Mapping of PTM
Top-Down Proteomics to Identify Proteolysis
Top-Down Proteomics to Study Protein Complexes
Top-Down Proteomics to Study Complex Samples
Prokaryotic Sample
Eukaryotic Sample
Middle-Down: A Compromise between Bottom-Up and Top-Down Proteomics
Bioinformatics for Top-Down Proteomics
Conclusions
Author Information
Biographies
Acknowledgment
References

© 2011 American Chemical Society

The study of proteins is one of the most challenging tasks in analytical chemistry. Early protein research focused on developing techniques that could separate proteins and identify the primary sequence (or parts of it). The Edman degradation1 of peptides was a revolutionary approach for identifying the primary amino acid sequence of a peptide. Moreover, even the whole sequence of a protein could be determined by combining protein digestion with different enzymes and HPLC fractionation of peptides followed by Edman degradation. Then, in the 1990s, Edman degradation was displaced by mass spectrometry approaches for protein identification.2 The ease of protein identification by mass spectrometry made it feasible to develop approaches to study the ensemble of the proteins in a sample (the proteome3). Over the years, different techniques have been developed to study proteins; this expanding field of research is referred to as “proteomics”. Although the term proteomics was initially intended to represent the large-scale study of the proteome, it now refers to small-to-large scale studies of proteins. The fields of proteomics that rely on mass spectrometry are also subdivided according to different criteria. One criterion is whether proteins are analyzed with limited processing (top-down) or through their proteolytic peptides, which serve as a proxy for the protein (bottom-up).

Multiple techniques were initially developed for the large-scale “bottom-up” identification of proteins isolated by 2D gel electrophoresis4 and, more recently, by gel-free approaches.5 Moreover, more refined approaches were developed not only to perform large-scale identification of proteins but also to provide relative quantitation of the levels of the proteins in different samples.6–11 Similar approaches were also developed for large-scale identification of some post-translational modifications (PTMs).12–14 Bottom-up proteomics as we know it today can identify and quantify the changes in thousands of proteins and some PTMs across multiple samples. However, large-scale proteomics is far from exhaustive, and it relies on a few peptide matches per protein for identification/quantitation. This means that most of the subtleties of individual proteins are easily missed, including mutations, splice variants, and proteolytic cleavage. Cleavage and splice-variant patterns that are readily apparent on SDS-PAGE are difficult to sort out in “bottom-up” proteomic approaches, even for relatively simple samples. Although the identification of PTMs by bottom-up mass spectrometry has greatly improved, with routine reports of thousands of PTMs identified, in general it provides poor coverage of the PTMs present on individual proteins. Clearly, bottom-up proteomics is a tremendous discovery tool that can pinpoint proteins responding to different biological states. However, alternative approaches are needed for the detailed characterization of these proteins to better understand the link between the observed changes and the biological states. These links usually become clearer when studying how the function/activity of a protein is regulated through changes in levels, PTMs, balances in isoforms, cleavage of the protein, and relocalization. In many instances, different isoforms are present and can lead to different activities/functions. For example, mutations within the protein sequence of the secreted PCSK9 protein can lead to hypercholesterolemia.15 To date, over 150 mutations have been reported for this single protein alone, including gain- and loss-of-function variants. These mutations affect the secretion, cleavage, PTMs, and activity of PCSK9. Therefore, knowing the circulating PCSK9 variants is important for understanding its effects on cholesterol regulation. Moreover, regulation is often not achieved by a single PTM but rather by a complex interplay of different PTMs. For example, histone acetylation, methylation,

Special Issue: Fundamental and Applied Reviews in Analytical Chemistry
Published: November 02, 2011

dx.doi.org/10.1021/ac202882y | Anal. Chem. 2012, 84, 720–734


Figure 1. Technological requirements for top-down proteomics. Intact proteins can be extracted from biological samples (cells, tissue, biofluid, etc.) and separated using classical biochemical separation, liquid chromatography, or electrophoresis. The resulting less complex protein mixtures can be introduced into the mass spectrometer by electrospray ionization (ESI) or matrix-assisted laser desorption/ionization (MALDI). The gas-phase protein ions can then be fragmented using electron capture dissociation (ECD), collision-induced dissociation (CID), higher-energy collision dissociation (HCD), or electron transfer dissociation (ETD) to obtain sequence information. Different types of mass spectrometers can be employed for top-down proteomic experiments, including Fourier transform ion cyclotron resonance (FTICR), time-of-flight (TOF), ion trap/linear ion trap, ion mobility mass spectrometry, quadrupole time-of-flight (QTOF), TripleTOF, and LTQ-Orbitrap.

phosphorylation, and ubiquitination constitute the “histone code”, an important interplay of PTMs involved in the regulation of gene expression.16 The cleavage of a protein can also lead to different active/inactive products. For example, the cleavage of the β-amyloid precursor protein (APP) leads to Aβ formation.17 As well, the activity of a protein is often dependent on its interaction partners. For example, SMYD2, a histone methyltransferase, requires its chaperone HSP90α to methylate histone H3 at K4.18 Moreover, the concentration and localization of proteins are not static; instead, they respond dynamically to changes in the environment through gene expression regulation, protein degradation, protein secretion, and protein relocalization. Clearly, a detailed analysis of the primary sequences of proteins as well as their PTMs, processing, concentration, and localization requires alternatives to the bottom-up proteomics approaches. One such alternative is termed top-down proteomics, which focuses on characterizing intact proteins instead of using derived peptides as a proxy for the proteins. So far, it is a low-throughput approach (akin to classical protein biochemistry) that provides much more information on fewer proteins in terms of their isoforms, processing, PTMs, and interactions. The term “top-down proteomics” has been attributed to McLafferty.19 In our previous series of biannual reviews in Analytical Chemistry, which started in 2000, we provided a survey of the developments in proteomics. In our next series, we focus on more detailed reviews of specific proteomics subfields. We believe that although top-down and bottom-up proteomics share similar technological aspects, their biological applications are rather different. Here we have chosen to review the development of top-down proteomics and its possible applications in biology, in contrast to bottom-up proteomics and other techniques.

■ TOP-DOWN PROTEOMICS: TECHNOLOGY REQUIREMENT

One of the challenges in top-down proteomics is to enrich and fractionate proteins to a level suitable for analysis by mass spectrometry (Figure 1). Often proteins need to be isolated from cells, tissues, and biological fluids against a background of thousands of other proteins that cover a wide range of concentrations. When feasible, classical biochemical separation methods, such as subcellular fractionation and immunoprecipitation, are applied for the enrichment of proteins.20–24 It is important to keep in mind that immunopurification does not necessarily purify all isoforms of a protein, particularly when mutations or cleavage of the protein affect the epitope. Alternatively, intact proteins can be separated by liquid chromatography (LC) and electrophoresis.25 The current liquid chromatography methods for protein separation are based on protein hydrophobicity (reversed phase, hydrophobic interaction,26–28 normal phase, and hydrophilic interaction (HILIC)),29 protein charge (chromatofocusing, ion exchange, and mixed mode), protein size (size exclusion),30,31 and protein-specific characteristics


(immobilized metal affinity,32–34 liquid–liquid partition/extraction,35–37 etc.). Intact proteins can also be separated by various electrophoretic techniques, for example, isoelectric focusing (IEF),38–42 free-flow electrophoresis,43,44 gel electrophoresis, capillary electrophoresis,45,46 and capillary electrochromatography.47 Although many protein separation techniques have been developed, most do not have the separation power to purify single proteins from complex mixtures. Instead, top-down proteomic strategies have been developed to characterize multiple intact proteins from mixtures. Because of its compatibility with the mass spectrometer, reversed-phase liquid chromatography is the dominant online separation method for intact protein analysis.25 Reversed-phase stationary phases with shorter alkyl chains, such as C4, are also used for intact protein separation because they give higher protein recoveries.25 Vellaichamy et al.30 used reversed-phase nanocapillary HPLC with a polymeric stationary phase and obtained better performance than with a silica phase in terms of mass range (up to 80 kDa) and number of proteins detected. Eeltink et al.48 designed a 250 mm × 0.2 mm poly(styrene-co-divinylbenzene) monolithic capillary column for high-sensitivity protein separations, yielding peak capacities >600 within a 2 h linear gradient. Kim et al.49 developed a chip-type design with an asymmetrical flow field-flow fractionation (AF4) channel. This device was used to perform high-speed separation of proteins prior to top-down proteomic analysis using online coupled electrospray ionization mass spectrometry (ESI-MS).

Combining multiple protein separation techniques can achieve better resolution for intact protein separation. For example, Puchades et al.50 used solution-phase isoelectric focusing (sIEF) in a Rotofor cell coupled with sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) to isolate intact brain-specific proteins from cerebrospinal fluid.
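As a rough illustration of what a peak capacity above 600 in a 2 h gradient implies, the sketch below uses a common gradient-elution approximation, n_c ≈ 1 + t_g/w, where t_g is the gradient time and w is the average (4σ) peak width. The formula choice and the function names are assumptions introduced here for illustration; the cited work may define peak capacity differently.

```python
def peak_capacity(gradient_time_min: float, avg_peak_width_min: float) -> float:
    """Approximate gradient peak capacity: n_c = 1 + t_g / w.

    Assumes a constant average 4-sigma peak width across the gradient,
    which is an idealization.
    """
    if gradient_time_min <= 0 or avg_peak_width_min <= 0:
        raise ValueError("gradient time and peak width must be positive")
    return 1.0 + gradient_time_min / avg_peak_width_min


def average_peak_width(gradient_time_min: float, capacity: float) -> float:
    """Invert the same approximation: w = t_g / (n_c - 1)."""
    if gradient_time_min <= 0 or capacity <= 1:
        raise ValueError("need a positive gradient time and a capacity above 1")
    return gradient_time_min / (capacity - 1.0)


if __name__ == "__main__":
    # Hypothetical input: a 120 min gradient and a target capacity of 600.
    width = average_peak_width(120.0, 600.0)
    print(f"average 4-sigma peak width: {width * 60:.1f} s")
```

Under this approximation, a peak capacity of about 600 over a 120 min gradient corresponds to an average peak width of roughly 0.2 min, which gives a sense of how narrow intact protein peaks must be on such a column.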
Many other researchers have also used sIEF to resolve proteins within a specific pI range.51–55 In another example, Tran et al.56,57 used gel-eluted liquid fraction entrapment electrophoresis (GELFrEE) for the separation of proteins, achieving an extended mass range from 10 to 250 kDa. As well, using GELFrEE and online nano reversed-phase LC for intact protein separation from yeast and human samples, they readily identified 10–60 proteins per HPLC fraction with molecular weights up to 70–80 kDa. Another combination of techniques was reported by Geng et al., who used hydrophobic interaction chromatography (HIC) and weak-cation exchange chromatography (WCX) in a single column for 2D-LC separation of proteins.58 This single HIC-WCX column can achieve similar resolution and better selectivity compared to commercially available columns and has a hypothetical peak capacity of 329.58 Sharma et al.59 combined weak anion exchange (WAX) fractionation, online reversed-phase liquid chromatography (RPLC) separation, and a 12 T Fourier transform ion cyclotron resonance (FTICR) mass spectrometer to analyze intact proteins from a Shewanella oneidensis MR-1 cell lysate, detecting 715 intact proteins.

Improvements in the field of mass spectrometry have also contributed to top-down proteomics. Two types of “soft” ionization techniques have been employed to ionize intact proteins and transfer them into the mass spectrometer: electrospray ionization (ESI) and matrix-assisted laser desorption/ionization (MALDI). Although no novel ionization techniques are required for top-down proteomics, some improvements to protein ionization have been reported. In

particular, systematically increasing the analyte charge lowers the m/z and enhances the sensitivity. This can be achieved by adding small amounts of low-volatility “supercharging” reagents to the ESI solutions. These reagents can significantly increase ion charge states and lower the mass-to-charge ratio of the proteins to a range readily observable by most mass analyzers (m/z = 400–2000).60–64 Furthermore, in Fourier transform mass analyzers (Orbitrap and ICR), higher charge states improve the signal-to-noise ratio, mass resolution, and mass accuracy.65 m-Nitrobenzyl alcohol (m-NBA) and sulfolane are commonly used “supercharging” reagents. Valejar et al.65 discovered that new “supercharging” reagents such as dimethylformamide (DMF), thiodiglycol, dimethylacetamide, dimethylsulfoxide, and N-methylpyrrolidone can significantly increase the charge states of proteins up to 78 kDa. For some proteins, labile groups or partners can be lost during the electrospray process. Chen et al.66 developed a high-pressure electrospray ionization mass spectrometer that can operate at pressures higher than 1 atm. Interestingly, optimizing the operating pressure of the ion source can reduce fragmentation of labile compounds during the ionization process. For example, myoglobin contains a noncovalently bound iron-containing porphyrin (heme), which is easily detached by the excess energy deposited into the protein during desolvation, making holomyoglobin thermally labile. When the operating pressure of the ion source was equal to or greater than 4 bar, the obtained mass spectra were dominated by the heme-bound holomyoglobin owing to the gentler desolvation at higher pressure.66 Sampson et al.67 demonstrated the implementation of an infrared laser on a matrix-assisted laser desorption electrospray ionization (MALDESI) source with ESI postionization.
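The relation between charge state and m/z that underlies supercharging can be sketched numerically. The example below assumes positive-mode ESI in which every charge is carried by a proton, so that m/z = (M + z·m_p)/z. The function names, the idealized 78 kDa neutral mass, and the default window of m/z 400 to 2000 are illustrative assumptions, not values or methods taken from the cited studies.

```python
import math

PROTON_MASS_DA = 1.007276  # mass of a proton in daltons


def mz(neutral_mass_da: float, charge: int) -> float:
    """m/z of an [M + zH]z+ ion, assuming all charges come from protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass_da + charge * PROTON_MASS_DA) / charge


def charge_states_in_window(neutral_mass_da: float,
                            mz_min: float = 400.0,
                            mz_max: float = 2000.0) -> tuple[int, int]:
    """Lowest and highest charge state whose m/z lies inside [mz_min, mz_max].

    Derived from m/z = M/z + m_p: the ion is in range when
    M / (mz_max - m_p) <= z <= M / (mz_min - m_p).
    """
    z_low = max(1, math.ceil(neutral_mass_da / (mz_max - PROTON_MASS_DA)))
    z_high = math.floor(neutral_mass_da / (mz_min - PROTON_MASS_DA))
    return z_low, z_high


if __name__ == "__main__":
    mass = 78_000.0  # hypothetical neutral mass, Da
    z_low, z_high = charge_states_in_window(mass)
    print(f"charge states {z_low}+ to {z_high}+ fall within m/z 400 to 2000")
    for z in (20, 40, 60):
        print(f"z = {z:3d}  m/z = {mz(mass, z):9.3f}")
```

In this idealized picture a 78 kDa protein needs at least about 40 charges before its signal drops below m/z 2000, which is why raising the charge state brings large proteins into the working range of common analyzers.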
This IR-MALDESI technique yielded multiply charged peptides and proteins and was demonstrated for intact and top-down analyses of equine myoglobin (17 kDa) by FTICR mass spectrometry. Takats et al.68 demonstrated desorption electrospray ionization (DESI), in which electrosprayed charged droplets and solvent ions are directed onto the sample surface, enabling ambient ionization of trace samples at atmospheric pressure with little sample preparation. Ferguson et al.69 added supercharging reagents to the DESI spray solvent, which can increase the charge on protein complexes without dissociating them. They demonstrated the analysis of proteins with masses as high as 150 kDa, including immunoglobulin G (IgG).

Improvements in the mass analyzer have also benefited top-down proteomics. The FTICR mass spectrometer is currently the highest resolution instrument commercially available for top-down proteomics. Unfortunately, issues of robustness and ease of use largely limit FTICR to mass spectrometry laboratories, and very few FTICR instruments are in place in biological laboratories. Interestingly, new commercial hybrid mass spectrometers have been developed that provide sufficient resolution and high mass accuracy and are relatively easy to use; these hybrid instruments can also be used for top-down proteomics. For example, the TripleTOF, a hybrid quadrupole time-of-flight mass spectrometer, can provide high resolving power (30k) and mass accuracy (