Anal. Chem. 2007, 79, 1002-1009
Membrane Proteome Analysis of Microdissected Ovarian Tumor Tissues Using Capillary Isoelectric Focusing/Reversed-Phase Liquid Chromatography-Tandem MS
Weijie Wang,† Tong Guo,‡ Paul A. Rudnick,† Tao Song,† Jie Li,§ Zhengping Zhuang,§ Wenxin Zheng,⊥ Don L. DeVoe,| Cheng S. Lee,‡ and Brian M. Balgley*,†
Calibrant Biosystems, 910 Clopper Road, Suite 220N, Gaithersburg, Maryland 20878, Department of Chemistry and Biochemistry, University of Maryland, College Park, Maryland 20742, Molecular Pathogenesis Unit, Surgical Neurology Branch, National Institute of Neurological Disorders and Stroke, Bethesda, Maryland 20892, Department of Pathology, University of Arizona, Tucson, Arizona 85724, and Department of Mechanical Engineering and Bioengineering Program, University of Maryland, College Park, Maryland 20742
* To whom all correspondence should be addressed. Phone: (301) 977-7900, ext. 14. Fax: (301) 977-7981. E-mail: [email protected].
† Calibrant Biosystems.
‡ Department of Chemistry and Biochemistry, University of Maryland.
§ National Institute of Neurological Disorders and Stroke.
⊥ University of Arizona.
| Department of Mechanical Engineering and Bioengineering Program, University of Maryland.
This work expands our tissue proteome capabilities from the analysis of soluble proteins in previous studies to the examination of membrane proteins within the pellets of enriched and selectively isolated tumor cells procured from microdissected tissue specimens. The pellets of targeted ovarian tumor cells are treated by two different membrane protein extraction methods, one based on detergent and the other on organic solvent. The detergent-based membrane protein preparation protocol not only extracts proteins effectively from cell pellets but also is compatible with subsequent proteome analysis using combined capillary isoelectric focusing/nano reversed-phase liquid chromatography separations coupled with nano electrospray ionization mass spectrometry. Among proteins identified from an amount of pellet equivalent to 20 000 cells, 773 proteins are predicted to contain one or more transmembrane domains, corresponding to 22% membrane proteome coverage within the SwissProt Human protein sequence entries.
Although proteomic technologies have made significant progress in the analysis of soluble proteins in recent years, membrane proteins have lagged behind and are typically under-represented in datasets. Even though contemporary genomic analyses indicate that 20-30% of all open reading frames encode for integral membrane proteins,1 the representation of membrane proteins reported in existing analyses is much lower.2 However, the importance of membrane proteins in drug discovery cannot be overemphasized: membrane proteins currently account for ∼70% of all known pharmaceutical drug targets.3 Furthermore, there is growing interest in the use of therapeutic antibodies to target cell surface proteins uniquely expressed in diseased cells or tissues. For example, assay of HER-2/neu is now mandatory in deciding which patients with metastatic breast cancer should receive treatment with the antibody Herceptin.4 Additional targets may also be identified by screening for cell surface proteins whose expression is selectively induced in tumor cells exposed to chemotherapy. Initial results have suggested that immunotherapeutic targeting of antigens induced by chemotherapy provides a positive therapeutic index in a combined drug treatment.5
Even though two-dimensional polyacrylamide gel electrophoresis (2-D PAGE) coupled with highly solubilizing reagents has been employed for the identification of membrane proteins,6 most recent accomplishments in membrane proteomics have been achieved using shotgun-based chromatography/MS approaches.7-13 This is because the first separation dimension of 2-D PAGE, isoelectric focusing of proteins, is inherently difficult with membrane proteins owing to their hydrophobic nature and their tendency to precipitate at their isoelectric point. Additionally, extraction of tryptic peptides resulting from in-gel digestion of a membrane protein from a 2-D gel for MS-based identification is difficult because the peptides are hydrophobic, which leads to low recovery and low sequence coverage. Three methods, including the use of detergents,13-16 organic solvents,11,17-19 and organic acids,7,8 are generally utilized to solubilize enriched membrane fractions for shotgun-based analyses. These solubilization methods are compatible with subsequent proteolytic digestion and chemical cleavage, chromatographic separation, and MS analysis.
Disease relevant tissue proteomic data can be generated only if the samples investigated consist of homogeneous cell populations, in which no unwanted cells of different types or developmental stages obscure the results. Our research efforts have, therefore, centered on the development and evaluation of a novel biomarker discovery paradigm based on performing comparative analysis of protein expression profiles within homogeneous subpopulations of microdissection-derived cells from normal and diseased tissue biopsies. However, current proteome separation platforms, including 2-D PAGE and multidimensional liquid chromatography systems,7,8 require large cellular samples that are generally incompatible with the protein quantities extracted from microdissection-procured tissue specimens. As demonstrated in our previous work,20-23 the key to enabling sensitive tissue proteome analysis is the high analyte concentration in small peak volumes prior to MS measurement, which results from electrokinetic focusing and high resolving power in the capillary isoelectric focusing (CIEF)-based multidimensional separation platform.
(1) Wallin, E.; Von-Heijne, G. Protein Sci. 1998, 7, 1029-1038.
(2) Wu, C. C.; Yates, J. R., III. Nat. Biotechnol. 2003, 21, 262-267.
(3) Hopkins, A. L.; Groom, C. R. Nat. Rev. Drug Discovery 2002, 1, 727-730.
(4) Bast, R. C.; Radvin, P.; Hayes, D. F.; Bates, S.; Fritsche, H.; Jessup, J. M.; Kemeny, N.; Locker, G. Y.; Mennel, R. G.; Somerfield, M. R. J. Clin. Oncol. 2001, 19, 1865-1878.
(5) Rubinfeld, B.; Upadhyay, A.; Clark, S. L.; Fong, S. E.; Smith, V.; Koeppen, H.; Ross, S.; Polakis, P. Nat. Biotechnol. 2006, 24, 205-209.
(6) Pedersen, S. K.; Harry, J. L.; Sebastian, L.; Baker, J.; Traini, M. D.; McCarthy, J. T.; Manoharan, A.; Wilkins, M. R.; Gooley, A. A.; Righetti, P. G.; Packer, N. H.; Williams, K. L.; Herbert, B. R. J. Proteome Res. 2003, 2, 303-311.
(7) Wolters, D. A.; Washburn, M. P.; Yates, J. R., III. Anal. Chem. 2001, 73, 5683-5690.
(8) Washburn, M. P.; Wolters, D.; Yates, J. R., III. Nat. Biotechnol. 2001, 19, 242-247.
(9) Schirmer, E. C.; Florens, L.; Tinglu G.; Yates, J. R., III; Gerace, L. Science 2003, 301, 1380-1382.
(10) Wu, C. C.; MacCoss, M. J.; Howell, K. E.; Yates, J. R., III. Nat. Biotechnol. 2003, 21, 532-538.
(11) Blonder, J.; Hale, M. L.; Lucas, D. A.; Schaefer, C. F.; Yu, L.-R.; Conrads, T. P.; Issaq, H. J.; Stiles, B. G.; Veenstra, T. D. Electrophoresis 2004, 25, 1307-1318.
(12) Nielsen, P. A.; Olsen, J. V.; Podtelejnikov, A. V.; Andersen, J. R.; Mann, M.; Wisniewski, J. R. Mol. Cell. Proteomics 2005, 4, 402-408.
(13) Wei, J.; Sun, J.; Yu, W.; Jones, A.; Oeller, P.; Keller, M.; Woodnutt, G.; Short, J. M. J. Proteome Res. 2005, 4, 801-808.
10.1021/ac061613i CCC: $37.00 © 2007 American Chemical Society. Published on Web 12/23/2006
1002 Analytical Chemistry, Vol. 79, No. 3, February 1, 2007
In addition to the analysis of soluble proteins within microdissected tumor cells, the tissue proteome capabilities of combined CIEF/nano reversed-phase liquid chromatography (nanoRPLC) separations are further expanded for the identification of membrane proteins extracted from cell pellets using sodium dodecyl sulfate (SDS) detergent. We present a method for completely solubilizing the pellets of microdissected tissues, preparing the resulting solution to ensure compatibility with the downstream separations, and conducting the CIEF separation without the problems encountered in IEF separations of membrane proteins. Epithelial ovarian carcinoma tissue specimens containing the serous cell type, which is by far the most common ovarian cancer, are microdissected, processed, and employed for the membrane proteome analysis reported in this study.
(14) Han, D. K.; Eng, J.; Zhou, H.; Aebersold, R. Nat. Biotechnol. 2001, 19, 946-951.
(15) Hixson, K. K.; Rodriguez, N.; Camp, D. G.; Strittmatter, E. F.; Lipton, M. S.; Smith, R. D. Electrophoresis 2002, 23, 3224-3232.
(16) Ruth, M. C.; Old, W. M.; Emrick, M. A.; Meyer-Arendt, K.; Aveline-Wolf, L. D.; Pierce, K. G.; Mendoza, A. M.; Sevinsky, J. R.; Hamady, M.; Knight, R. D.; Resing, K. A.; Ahn, N. G. J. Proteome Res. 2006, 5, 709-719.
(17) Blonder, J.; Goshe, M. B.; Moore, R. J.; Pasa-Tolic, L.; Masseion, C. D.; Lipton, M. S.; Smith, R. D. J. Proteome Res. 2002, 1, 351-360.
(18) Goshe, M. B.; Blonder, J.; Smith, R. D. J. Proteome Res. 2003, 2, 153-161.
(19) Wang, H.; Qian, W.-J.; Mottaz, H. M.; Clauss, T. R. W.; Anderson, D. J.; Moore, R. J.; Camp, D. G., II; Khan, A. H.; Sforza, D. M.; Pallavicini, M.; Smith, D. J.; Smith, R. D. J. Proteome Res. 2005, 4, 2397-2403.
(20) Chen, J.; Balgley, B. M.; DeVoe, D. L.; Lee, C. S. Anal. Chem. 2003, 75, 3145-3152.
(21) Wang, Y.; Balgley, B. M.; Rudnick, P. A.; Evans, E. L.; DeVoe, D. L.; Lee, C. S. J. Proteome Res. 2005, 4, 36-42.
(22) Wang, Y.; Balgley, B. M.; Lee, C. S. Expert Rev. Proteomics 2005, 2, 659-667.
EXPERIMENTAL SECTION
Materials and Reagents. Fused-silica capillaries (50-µm i.d./375-µm o.d. and 100-µm i.d./375-µm o.d.) were acquired from Polymicro Technologies (Phoenix, AZ). Acetic acid, ammonium acetate, ammonium hydroxide, ampholyte 3-10, dithiothreitol (DTT), formic acid, and iodoacetamide (IAM) were obtained from Sigma (St. Louis, MO). Acetonitrile, hydroxypropyl cellulose (HPC, average MW 100 000), SDS, tris(hydroxymethyl)aminomethane (Tris), and urea were purchased from Fisher Scientific (Pittsburgh, PA). All chemicals used were A.C.S. grade or higher, and all solvents used were HPLC grade or higher. Sequencing grade trypsin was obtained from Promega (Madison, WI). All solutions were prepared using water purified by a Nanopure II system (Dubuque, IA) and further filtered with a 0.22-µm mixed cellulose esters membrane (Millipore, Billerica, MA).
Tissue Microdissection and Protein Sample Preparation. Tumor tissues of epithelial ovarian carcinoma were collected at the time of surgery. These tissues were completely covered with a solution mixture of poly(ethylene glycol) and poly(vinyl alcohol) under the trade name of Optimal Cutting Temperature Medium (Tissue-Tek, Sakura Finetek, Torrance, CA), immediately frozen in liquid nitrogen, and stored at -80 °C. Epithelial ovarian carcinoma cells have several morphological features that can be distinguished under the microscope. These features are used to classify epithelial ovarian carcinomas into clear cell, endometrioid, mucinous, serous, and other types. The samples of ovarian serous carcinoma, which were stained with hematoxylin and eosin, were histologically examined by an experienced gynecologic pathologist and confirmed, on the basis of the World Health Organization classification, to represent the pure form of serous cells without other mixed components. The tissue was microdissected by following the procedures described in our previous work24 to gather ∼100 000 tumor cells from each sample.
The microdissected cells were placed directly into a microcentrifuge tube containing 8 M urea and 20 mM Tris-HCl at pH 8.0. The soluble proteins were collected in the supernatant by centrifugation at 20000g for 30 min at 4 °C. Protein concentration was measured using a mini-Bradford assay (Pierce, Rockford, IL) on a UV/vis spectrophotometer (Nanodrop Technologies, Wilmington, DE). The soluble fraction yielded ∼50 µg of protein from 100 000 cells. This yield is consistent with recently reported protein recoveries from microdissected tissue sections.25 Proteins in the supernatant were reduced and alkylated by sequentially adding DTT and IAM with final concentrations of 10 and 20 mg/mL, respectively. The solution was incubated at 37 °C for 1 h in the dark and then diluted 8-fold with 100 mM ammonium acetate at pH 8.0. Trypsin was added at a 1:40 (w/w) enzyme-to-substrate ratio, and the solution was incubated at 37 °C overnight. Tryptic digests were desalted using a Peptide MacroTrap column (Michrom Bioresources, Auburn, CA), lyophilized to dryness using a SpeedVac (Thermo, San Jose, CA), and then stored at -80 °C.
(23) Wang, Y.; Rudnick, P. A.; Evans, E. L.; Zhuang, Z.; Li, J.; DeVoe, D. L.; Lee, C. S.; Balgley, B. M. Anal. Chem. 2005, 77, 6549-6556.
(24) Furuta, M.; Weil, R. J.; Vortmeyer, A. O.; Huang, S.; Lei, J.; Huang, T.-N.; Lee, Y.-S.; Bhowmick, D. A.; Lubensky, I. A.; Oldfield, E. H.; Zhuang, Z. Oncogene 2004, 23, 6806-6814.
(25) Rahimi, F.; Sheperd, C. E.; Halliday, G. M.; Gezcy, C. L.; Raftery, M. J. Anal. Chem. 2006, 78, 7216-7221.
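As a rough check on the digestion conditions described above, the following Python sketch works through the dilution and enzyme arithmetic. The function names are hypothetical, and the calculation assumes that the 8-fold dilution acts on the nominal 8 M urea buffer and that the whole ∼50 µg soluble fraction is digested; the text does not state either point explicitly.

```python
def diluted_concentration(initial, fold):
    """Concentration after an n-fold dilution (same units as `initial`)."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    return initial / fold


def enzyme_mass_ug(substrate_ug, ratio_denominator):
    """Enzyme mass (ug) for a 1:ratio_denominator (w/w) enzyme-to-substrate ratio."""
    if ratio_denominator <= 0:
        raise ValueError("ratio_denominator must be positive")
    return substrate_ug / ratio_denominator


if __name__ == "__main__":
    # Assumption: nominal 8 M urea, ignoring the small volumes of DTT and IAM added.
    urea_molar = diluted_concentration(8.0, 8)
    # Assumption: the full ~50 ug soluble fraction is digested at 1:40 (w/w).
    trypsin_ug = enzyme_mass_ug(50.0, 40)
    print(f"urea after 8-fold dilution: {urea_molar:.1f} M")
    print(f"trypsin for 50 ug of protein: {trypsin_ug:.2f} ug")
```

Under these assumptions the urea concentration falls to about 1 M before trypsin is added, and about 1.25 µg of trypsin would be required.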
In addition to acquiring the soluble protein fraction from microdissected serous cells, cell pellets were treated with a 1% SDS solution13,14 containing 20 mM Tris-HCl at pH 8.0, followed by centrifugation at 20000g for 30 min at room temperature. The supernatant containing extracted proteins was placed in a dialysis cup (Pierce) and dialyzed overnight at 4 °C against 100 mM Tris-HCl at pH 8.2. The insoluble fraction yielded ∼50 µg of protein from 100 000 cells. The extracted and dialyzed proteins were denatured, reduced, alkylated, digested, desalted, and lyophilized using the same sample preparation protocol as applied to the soluble protein fraction described previously.
By following the procedures described in the work of Wang and co-workers,19 cell pellets were also resuspended in a 20 mM Tris buffer (pH 8.0) containing 50% trifluoroethanol, followed by centrifugation at 20000g for 30 min. To be compatible with subsequent proteolytic digestion, the final trifluoroethanol concentration in the supernatant containing extracted proteins was reduced to 10% (v/v) by diluting 5-fold with 100 mM ammonium acetate at pH 8.0. The extracted and diluted proteins were denatured, reduced, alkylated, digested, desalted, and lyophilized using the same sample preparation protocol as applied to the soluble protein fraction described previously.
Integrated CIEF/NanoRPLC Multidimensional Peptide Separations. On-line integration of CIEF with nanoRPLC as a multidimensional peptide and protein separation platform has been described in detail in previous work20-23 and was employed for systematically resolving peptide digests on the basis of their differences in isoelectric point (pI) and hydrophobicity. Briefly, an 80-cm-long CIEF capillary (100-µm i.d./365-µm o.d.) coated with hydroxypropyl cellulose was initially filled with a solution containing 2% ampholyte 3-10 and 1.5 mg/mL tryptic peptides.
Peptide focusing was performed by applying an electric field strength of 300 V/cm and using solutions of 0.1 M acetic acid and 0.5% ammonium hydroxide as the anolyte and the catholyte, respectively. The current decreased continuously as the result of peptide focusing. Once the current reached ∼10% of the original value, usually within 30 min, the focusing was considered complete. Focused peptides were sequentially fractionated by hydrodynamic loading into individual trap columns (3 cm × 200-µm i.d. × 365-µm o.d.) packed with 5-µm porous C18 reversed-phase particles. A constant electric field of 300 V/cm was applied across the CIEF capillary for maintaining analyte band focusing in the capillary throughout the loading procedure. Each peptide fraction was subsequently analyzed by nanoRPLC equipped with an Ultimate dual-quaternary pump (Dionex, Sunnyvale, CA) and a dual nanoflow splitter connected to two pulled-tip, fused-silica capillaries (50-µm i.d. × 365-µm o.d.). These two 15-cm-long capillaries were packed with 3-µm Zorbax Stable Bond (Agilent, Palo Alto, CA) C18 particles. NanoRPLC separations were performed in parallel in which a dual-quaternary pump delivered two identical 2-h organic solvent gradients with an offset of 1 h. Peptides were eluted at a flow rate of 200 nL/min using a 5-45% linear acetonitrile gradient (containing 0.02% formic acid) over 100 min with the remaining 20 min for column regeneration and equilibration. Full scans were collected from 400 to 1400 m/z using a linear ion-trap mass spectrometer (LTQ, ThermoFinnigan, San Jose, CA), and five data-dependent MS/MS scans were gathered with dynamic exclusion set to 18 s. A moving stage housing two nanoRPLC columns was employed to provide electrical contacts for applying electrospray voltages and, most importantly, to position the columns in-line with the orifice of the heated metal capillary in the nanoelectrospray ionization (ESI) source at the start of each chromatography separation and data acquisition cycle. The data presented are the result of a single CIEF/LC-MS/MS run each of the soluble and insoluble fractions.
Figure 1. Comparisons of protein solubilization and digestion efficiency using different sample preparation protocols. Lane 1, proteins extracted by SDS; lane 2, subsequent tryptic digestion of SDS-extracted proteins; lane 3, proteins extracted by trifluoroethanol; lane 4, subsequent tryptic digestion of trifluoroethanol-extracted proteins.
Figure 2. Overlaid plots containing the CIEF-UV trace monitored at 280 nm, the number of distinct peptides identified in each of the CIEF fractions, and the distribution of the peptides’ mean pI values over the entire CIEF separation.
Data Analysis. The Open Mass Spectrometry Search Algorithm (OMSSA) developed at the National Center for Biotechnology Information26 was used to search the peak list files against a decoyed SwissProt human database. SwissProt was chosen because it is the most highly annotated and least redundant protein sequence database available. This consequently provides a minimally redundant count of protein identifications without resorting to follow-on protein minimization data processing, for which no standard exists. This decoyed database was constructed by reversing all 12 484 real sequences and appending them to the end of the sequence library. Searches were performed using the following parameters: 1.5-Da precursor ion mass tolerance, 0.4-Da fragment ion mass tolerance, 1 missed cleavage, alkylated Cys as a fixed modification, and variable modification of Met oxidation. Searches were run in parallel on a 12-node, 24-CPU Linux cluster (Linux Networx, Bluffdale, UT). False positive rates were determined using the method of Elias and co-workers.27 Briefly, false positive rates were calculated by multiplying the number of false positive identifications (hits to the reversed sequences scoring below a given threshold) by 2 and dividing by the number of total identifications. Peptides occurring as matches to the forward sequences were not counted as false positives. A curve was then generated by plotting E-value versus false positive rate, and an E-value threshold corresponding to a 1% false positive rate was used as the cutoff in this analysis. An OMSSA E-value is the expectation value that a tandem MS event represents the predicted peptide given a number of factors, including the quality of the match between the experimental and theoretical fragment ion spectra and the search space as determined by the sequence library, the precursor and fragment ion mass accuracy settings, enzyme specificity, missed cleavages, and modifications.26 As demonstrated in our previous studies,28 this decoyed database search approach not only accurately reduces false negative identifications but also controls the degree of false positives while improving the predictive power of a typical search engine. Because of the differences in the complexity of proteome samples, the sample preparation procedures, the proteome measurements, and the search parameters, it is recommended to perform a decoyed database search to determine specific threshold scores used for peptide and protein identifications for each experiment.
(26) Geer, L. Y.; Markey, S. P.; Kowalak, J. A.; Wagner, L.; Xu, M.; Maynard, D. M.; Yang, X.; Shi, W.; Bryant, S. H. J. Proteome Res. 2004, 3, 958-964.
(27) Elias, J. E.; Haas, W.; Faherty, B. K.; Gygi, S. P. Nat. Methods 2005, 2, 667-675.
After generation of search data, the result files were parsed and loaded into a custom MySQL database for visualization and reporting using in-house software. Peptide isoelectric points were calculated using iep, an EMBOSS package. For the purposes of this paper, a peptide hit is defined as an MS/MS event resulting in an identification meeting the stated criteria. A distinct peptide is defined as a discrete, nonredundant peptide sequence within the set of peptides sequenced by the experiment.
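The decoy-based estimate described in the Data Analysis paragraph can be sketched in Python as follows. This is an illustrative sketch only, not the in-house software: the function names and the (E-value, is_reversed) input format are assumptions, and ties between E-values are handled naively. The final check uses the counts reported in the Results (112 reversed protein sequences among 3303 protein identifications).

```python
def make_decoy_library(sequences):
    """Append the reversed copy of every sequence to the end of the library."""
    return list(sequences) + [seq[::-1] for seq in sequences]


def false_positive_rate(n_reversed_hits, n_total_hits):
    """Estimated rate = 2 x (hits to reversed sequences) / (all accepted hits)."""
    if n_total_hits == 0:
        return 0.0
    return 2.0 * n_reversed_hits / n_total_hits


def evalue_cutoff(hits, max_rate=0.01):
    """Largest E-value at which the cumulative estimated rate is <= max_rate.

    `hits` is an iterable of (e_value, is_reversed) pairs, one per accepted
    MS/MS identification (hypothetical input format). Returns None when no
    threshold satisfies the limit.
    """
    best = None
    n_total = 0
    n_reversed = 0
    for e_value, is_reversed in sorted(hits):
        n_total += 1
        n_reversed += int(is_reversed)
        if false_positive_rate(n_reversed, n_total) <= max_rate:
            best = e_value
    return best


if __name__ == "__main__":
    # Protein-level check with the counts reported in the Results.
    print(f"protein false positive rate: {false_positive_rate(112, 3303):.1%}")
```

The factor of 2 reflects the assumption that a random match is equally likely to fall on a forward or a reversed sequence, so each reversed hit stands for one undetected false hit among the forward sequences.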
A unique peptide is defined as a peptide that uniquely identifies a protein within the protein sequence library searched, that is, the peptide is found only within a single protein. A distinct protein is defined as a discrete protein sequence entry in the sequence library searched, in this case SwissProt.
(28) Rudnick, P. A.; Wang, Y.; Evans, E. L.; Lee, C. S.; Balgley, B. M. J. Proteome Res. 2005, 4, 1353-1360.
RESULTS AND DISCUSSION
As reported previously,11,17-19 the use of organic solvent for membrane sample preparation not only improves protein solubility, but also assists protein denaturation. The organic solvents also readily evaporate during the lyophilization process without the need of any additional cleanup step prior to the proteome analysis. In addition to the use of organic solvent, several research groups have employed a variety of detergents13-16 to enrich the membrane protein fractions from cell lysates. For preparing membrane protein samples from microdissected tissue specimens, we therefore compared the use of SDS detergent and trifluoroethanol (TFE) solvent with respect to their effectiveness on protein solubilization and subsequent proteolytic digestion. As evaluated by SDS-PAGE (Figure 1), the quality of proteins extracted from the pellets of microdissected ovarian tumor cells was significantly better with SDS than with trifluoroethanol. The SDS-PAGE further indicated that there were no negative effects from the SDS detergent-based sample preparation protocol on the extent of subsequent tryptic digestion. On the other hand, TFE substantially interfered with the subsequent trypsin digestion to the point that analysis of the samples was deemed not worthwhile. Our comparative results between detergent- and organic solvent-based extraction methods are generally in agreement with those recently reported by Ruth and co-workers.16
Selective enrichment of ∼100 000 tumor cells was obtained from the microdissection of 5-10 consecutive sections in each tissue sample of epithelial ovarian carcinoma containing the serous cell type. An 80-cm-long CIEF capillary with a 100-µm i.d. providing ∼6.5 µL of solution loading volume was employed in this study.
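The loading volume can be checked from the capillary geometry. The Python sketch below is a back-of-the-envelope calculation; the helper names are hypothetical, and it assumes an ideal cylindrical bore, the 1.5 mg/mL peptide concentration given in the Experimental Section, and the pellet yield of ∼50 µg per 100 000 cells.

```python
import math


def capillary_volume_ul(length_cm, inner_diameter_um):
    """Internal volume (uL) of a cylindrical capillary."""
    radius_cm = inner_diameter_um * 1e-4 / 2.0      # 1 um = 1e-4 cm
    volume_cm3 = math.pi * radius_cm ** 2 * length_cm
    return volume_cm3 * 1000.0                      # 1 cm^3 = 1000 uL


def loaded_mass_ug(volume_ul, concentration_ug_per_ul):
    """Peptide mass (ug) contained in the filled capillary."""
    return volume_ul * concentration_ug_per_ul


def cell_equivalents(mass_ug, yield_ug, cells_for_yield):
    """Number of cells whose pellet would supply `mass_ug` of protein."""
    return mass_ug * cells_for_yield / yield_ug


if __name__ == "__main__":
    volume = capillary_volume_ul(80, 100)           # 80 cm long, 100-um i.d.
    mass = loaded_mass_ug(volume, 1.5)              # 1.5 mg/mL = 1.5 ug/uL
    cells = cell_equivalents(mass, 50.0, 100_000)   # ~50 ug per 100 000 cells
    print(f"geometric volume: {volume:.2f} uL")
    print(f"peptide load:     {mass:.1f} ug")
    print(f"cell equivalents: {cells:.0f}")
```

The geometric volume comes out slightly below the quoted ∼6.5 µL, which is within the tolerance expected for a nominal inner diameter.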
For each tissue proteome analysis, typically 10 µg of protein digests at a concentration of 1.5 µg/µL was loaded into the CIEF capillary, corresponding to proteins extracted from the pellets of 20 000 targeted tumor cells using the SDS-based sample preparation protocol.
Figure 3. Plots of the false positive rates and the numbers of total peptide, distinct peptide, and distinct protein identifications versus the E-value obtained from the search of the peak list files against a decoyed SwissProt human database using OMSSA.
Table 1. Proteins Identified from Cell Pellets and Predicted To Contain Transmembrane Domains
no. of transmembrane helices    no. of proteins identified    ratio of identified to total no. of predicted proteins (%)
≥1                              773                           22
≥2                              315                           16
≥3                              234                           14
≥4                              203                           14
≥5                              152                           12
≥6                              133                           11
≥7                              108                           11
≥8                              73                            21
≥9                              50                            19
≥10                             36                            16
≥11                             25                            14
≥12                             15                            13
≥13                             4                             9
≥14                             3                             8
≥15                             2                             8
≥16                             1                             5
≥17                             1                             5
≥18                             1                             6
≥19                             1                             6
As shown in Figure 2, the CIEF separation performance of tryptic peptides obtained from the membrane protein fraction was evaluated by hydrodynamically mobilizing focused peptides past a UV detector placed near the cathodic end of the capillary. The entire content of focused peptides in the CIEF capillary was split into 18 individual fractions. All CIEF fractions parked in separate trap columns20-23 were further resolved by nanoRPLC, and the eluants were analyzed using nanoESI-LTQ-MS/MS. The number of distinct peptides identified in each of the 18 CIEF fractions is summarized in Figure 2, together with the distribution of the peptides’ mean calculated pI values over the entire CIEF separation. A total of 18 861 distinct peptides were identified using a 1% false positive rate, leading to the identification of 3303 proteins from the SwissProt human database containing
12 484 protein entries. Among the identified proteins, 2967 contain at least one peptide sequence that is unique in the protein sequence database to that protein; that is, it does not occur in any other protein. The large number of distinct peptide identifications measured from each CIEF fraction is attributed to the use of completely orthogonal resolving mechanisms in CIEF/nanoRPLC separations and rapid scanning LTQ-MS. In comparison with other IEF techniques, including immobilized pH gradient gels29,30 and gel-free approaches31-34 such as chromatofocusing, immobilized pH membranes, Rotofor, and free-flow electrophoresis, the ultrahigh resolving power of CIEF is evidenced by the large number of distinct peptide identifications measured from each CIEF fraction with little overlap of peptides between fractions. In this case, 75% of distinct peptides were identified in only a single fraction. Of the remaining peptides, 16% were identified in two fractions, 4% in three fractions, 2% in four fractions, and 1% in five or six fractions. Furthermore, these other IEF techniques are all operated at the preparative scale and are incompatible with minute proteome samples in the range of 0.1-10 µg typically obtained from microdissection-procured tissue specimens.
Figure 4. The overlap in membrane proteins (predicted to contain one or more transmembrane domains) identified from the soluble and pellet fractions of targeted ovarian tumor cells.
Figure 5. Distribution of PSLT-predicted subcellular location of proteins identified from the cell pellet fraction of a microdissected ovarian tumor specimen.
Figure 6. Peptide coverage of representative transmembrane proteins, such as (A) CD81 and (B) ST14, and tandem MS spectra of unique peptides leading to their identifications.
(29) Cargile, B. J.; Talley, D. L.; Stephenson, J. L., Jr. Electrophoresis 2004, 25, 936-945.
(30) Essader, A. S.; Cargile, B. J.; Bundy, J. L.; Stephenson, J. L., Jr. Proteomics 2005, 5, 24-34.
(31) Yan, F.; Subramanian, B.; Nakeff, A.; Barder, T. J.; Parus, S. J.; Lubman, D. M. Anal. Chem. 2003, 75, 2299-2308.
(32) Zuo, X.; Echan, L.; Hembach, P.; Tang, H. Y.; Speicher, K. D.; Santoli, D.; Speicher, D. W. Electrophoresis 2001, 22, 1603-1615.
(33) Wall, D. B.; Kachman, M. T.; Gong, S.; Hinderer, R.; Parus, S.; Misek, D. E.; Hanash, S. M.; Lubman, D. M. Anal. Chem. 2000, 72, 1099-1111.
(34) Moritz, R. L.; Ji, H.; Schutz, F.; Connolly, L. M.; Kapp, E. A.; Speed, T. P.; Simpson, R. J. Anal. Chem. 2004, 76, 4811-4824.
As shown in Figure 3, the peptide and protein false positive rates and the numbers of total peptides, distinct peptides, and protein identifications were plotted as functions of the E-value of a typical OMSSA search. An E-value threshold of 0.05, corresponding to 1% false positive of total peptide identifications, was chosen as the cutoff in this study. This cutoff led to a protein false positive rate of 6.8% as indicated by the detection of peptides from 112 distinct reversed protein sequences in the decoy section of the search database. The first reversed protein was detected at an E-value of 1 × 10⁻⁵. At this threshold score, a total of 13 109 distinct peptides were identified, leading to the identification of 2775 nonredundant proteins. By tolerating a 1% false positive rate of total peptide identifications (E-value threshold of 0.05), an additional 5752 distinct peptides and 528 distinct proteins were measured at a cost of 131 and 112 predicted false identifications of distinct peptides and proteins, respectively. As shown in Figure 3, the protein false positive rate increased dramatically with increasing E-value above the threshold score of 0.05 allowed in this study. For example, at an E-value of 1.0, the false positive rates were 2.1 and 17.1% for total peptide and protein identifications, respectively. To better illustrate the impact of protein false positive rate on protein identification, it should be emphasized that new distinct proteins were added to the search results at a ratio of ∼50:1 relative to reversed distinct proteins at an E-value of 1 × 10⁻⁵. At an E-value of 0.05, corresponding to a protein false positive rate of 6.8%, this ratio had decreased to 2:1. This ratio was further reduced to 1:1 at an E-value of 1.0, meaning that new forward protein sequences were added at a rate equal to that of reversed protein sequences. The implication is that all new forward protein sequences were likely false positives at an E-value of 1.0 which
corresponded to the false positive rates of 2.1 and 17.1% for total peptide and protein identifications, respectively. The observed rapid rise in the number of false protein assignments is unsurprising in that false positive MS/MS hit assignments by definition will occur randomly across the searched library of 12 484 protein sequences, whereas the true peptide hits will cluster among the subset of ∼3000 proteins detectable in the sample. Therefore, accepting new MS/MS hits at increasing E-values comes at a cost of adding false positive protein identifications at an increasing rate. This has been previously observed in studies using both Sequest35 and Peptide Prophet,36 and these results are consistent with those studies.
Figure 7. Representative network analysis of plasma membrane proteins identified from the cell pellet fraction of targeted ovarian tumor cells using the Ingenuity System.
In this study 2363 proteins (71.5%) were identified by 3 or more distinct peptides, 389 (11.7%) by 2 and 551 (16.7%) by 1 distinct peptide. Of the proteins identified by 1 distinct peptide, 314 were identified by peptides with E-values