Improved Membrane Proteomics Coverage of Human Embryonic Stem Cells by Peptide IPG-IEF

Leon R. McQuade,† Uli Schmidt,‡ Dana Pascovici,† Tomas Stojanov,‡ and Mark S. Baker*,†

Australian Proteome Analysis Facility, Faculty of Science, Macquarie University, NSW 2109, Australia, and Sydney IVF Stem Cells, Sydney, NSW 2000, Australia

Received July 7, 2009

Abstract: Protein biomarkers are fundamental tools for the characterization of stem cells and for tracking their differentiation and maturation down developmental lineages. Technology development allowing increased coverage of difficult cellular proteomes should allow for the discovery of novel membrane protein biomarkers for use by the stem cell research community. The amphipathic and highly hydrophobic nature and relatively low abundance of many membrane proteins present significant analytical challenges. These difficulties are amplified when the source material (tissue or cells) is only available in limited quantities (e.g., embryonic stem cells). Recent advances in enrichment for purer membrane fractions, the enzymatic and chemical digestion of membrane proteins in the presence of solvents or chaotropes, and the use of “shotgun” proteomics methodologies have gradually resulted in increased membrane proteome coverage, with numbers of predicted integral membrane proteins in excess of 1000 now being routinely reported. We have recently demonstrated the advantages of using peptide isoelectric focusing in the first dimension on immobilized pH gradients (peptide IPG-IEF) followed by reversed phase chromatography and tandem MS to increase membrane proteome coverage. This study looked at achieving a similar level of membrane proteome coverage using modifications to reported methodologies while restricting the number of characterized human embryonic stem cells to 10⁷ cells. Two thousand two hundred and ninety-two (2292) nonredundant proteins were identified with two or more high-accuracy peptide matches from 260 µg of a human embryonic stem cell membrane enriched fraction with a false discovery rate of 0.32%. Gene Ontology (GO) mapping predicted 1279 (44.9%) of this list to be membrane proteins, of which 395 proteins were predicted to be derived from the plasma membrane compartment.
The TMHMM algorithm predicted 904 integral membrane proteins with up to 16 transmembrane helices. Collectively, we assert that the substantial membrane proteome coverage achieved using these procedures will enable rapid advances in the identification and quantitation of novel membrane proteins as markers of differentiation status and/or genetic mutation from relatively low numbers of cultured embryonic stem cells.

Keywords: membrane proteomics • human embryonic stem cells • conductivity • isoelectric focusing • IPG-IEF • LC-MS/MS

* To whom correspondence should be addressed. Professor Mark S. Baker, Director, Australian Proteome Analysis Facility, Faculty of Science, Macquarie University, NSW 2109, Australia. Telephone: +61-2-9850-8211. Fax: +61-2-9850-6200. E-mail: [email protected].
† Australian Proteome Analysis Facility, Macquarie University.
‡ Sydney IVF Stem Cells.

Journal of Proteome Research 2009, 8, 5642–5649. Published on Web 11/09/2009

1. Introduction

Comprehensive quantitative analysis of the proteome of embryonic and somatic stem cells could significantly advance understanding of the nature of “stemness”, pluripotency and differentiation.1,2 The behavior of stem cells is tightly controlled by both intrinsic and extrinsic factors, with “communication” conducted through proteins either integral or peripheral to the plasma membrane. Many factors and pathways are involved in maintaining the pluripotent state of human embryonic stem cells (hESC), including the regulation of self-renewal and differentiation. Tracking specific differentiation of cells down the germ lineages (ectoderm, mesoderm and endoderm) is attracting global attention, with efforts to understand more comprehensively the genomic and proteomic processes of cellular development in order to reach the envisaged goals of cellular replacement therapies. Characterizing and tracking cellular changes is routinely undertaken with a spectrum of antibodies to cell surface, cytosolic or nuclear antigens. As cellular processes are characterized further, the need to identify novel biomarkers increases. Analysis of membrane proteins is an important field in proteomics on two fronts: it is estimated that 30-35% of eukaryotic genomes encode membrane proteins,3 and approximately 70% of human pharmaceuticals currently on the market target membrane-bound proteins in some way.4 Despite the obvious need to better analyze membrane proteomes, this task remains challenging because of poor extraction techniques and/or poor solubility of membrane proteins during sample preparation and/or protein separation. This difficulty is reflected in an underrepresentation of membrane proteins in most proteomic studies.
10.1021/pr900597s CCC: $40.75 © 2009 American Chemical Society

The amphipathic nature of integral membrane proteins (IMPs), in which the hydrophilic regions (ectodomains) protrude from the lipid bilayer into the cytosol or extracellular environment while the hydrophobic peptides are embedded in the bilayer, contributes to the difficulty of MS analysis of IMPs.5,6 There have been recent improvements in the coverage and annotation of membrane proteins through the use of solvents for increased protein solubility, changes to protein and peptide fractionation and extended liquid chromatography separation in front of tandem MS analysis. Trypsin retains enzymatic activity in various solvents,7 with 60% (v/v) methanol recently showing considerable utility for membrane protein digestion.8,9 Increasing protein solubility using alternative enzymes, combinations of enzymes, and acid labile salts has also been investigated.6,10 Membrane protein coverage is also being enhanced via protein and peptide fractionation, including 1D SDS-PAGE,11 both offline and online strong cation exchange (SCX),8,12-17 capillary isoelectric focusing (cIEF)18,19 and immobilized pH gradient isoelectric focusing (IPG-IEF).9,20 Peptide IPG-IEF is reported to give higher confidence in the proteins identified than 1D SDS-PAGE21 or separation by SCX,22 while increasing gradient times during liquid chromatography separation have resulted in higher numbers of IMPs being reported.9,12,16,18-20 Combinations of these methodologies are now being employed to study the membrane proteome of embryonic stem cells.23-27 In those studies, membrane enriched fractions were most commonly prepared by sucrose density centrifugation and separated by SCX in front of tandem MS. The number of cells used for analysis ranged from 5 × 10⁵ (ref 23) to approximately 4 × 10⁹ (ref 27), and membrane protein identifications ranged from 235 (ref 27) to 1077 (ref 23). The aim of this study was to modify current front end techniques to improve the coverage of proteins identified from membrane enriched fractions of hESC.
Based on the success achieved to date by researchers in our laboratory on improving membrane coverage with increased confidence,9,20 immobilized pH gradient isoelectric focusing (IPG-IEF) was chosen as the method for peptide fractionation in front of reversed phase nano LC-tandem MS. Two key starting points were considered: the number of cells to be used for the analysis and the characterization of those cells prior to analysis. The maintenance and propagation of hESC in a pluripotent state still remains technically challenging, given their inherent programming to differentiate, thereby restricting the number of undifferentiated cells that may be available for proteomic analysis. Concomitant with this is the need to characterize hESC populations immediately prior to proteomics analysis to provide greater strength to the interpretation of the proteomic data that is generated. To date, some publications on stem cell proteomics have characterized the cells using antibody markers of pluripotency by Western blotting. At best, this can only determine the relative presence of these markers, not the percentage of cells in any population that are actually expressing them.

2. Experimental Section

The experimental procedure was worked up and modified using a colonic epithelial cell line (HCT116), which enabled rapid production of large cell numbers, before the finalized protocol was applied to the hESC line. The protocol was applied three times during development using 10⁷ HCT116 cells for each analysis. A “work flow” diagram for the optimized proteomic analysis is shown in Figure 1. Protein estimations taken during the procedure are shown on this figure.

2.1. Materials and Equipment. Protease inhibitor tablets were purchased from Roche Applied Science (www.roche.com). MS grade trypsin was purchased from Promega (www.promega.com). IPG strips (pH 3-10; 18 cm), wicks and high grade mineral oil used for peptide focusing on an IPGphor II were all purchased from GE Healthcare Life Sciences (www1.gelifesciences.com). A Branson Sonifier 450 (www.bransonultrasonics.com) was used for probe sonication, while ultracentrifugation was done in a Sorvall Discovery M120 SE (Thermo Scientific; www.thermo.com) using an S80AT3 rotor. Omix C18 pipet tips (100 µL capacity; Varian; www.varianinc.com) were used for desalting, and glass screw neck vials (12 × 32 mm; P/N 186000384c; Waters Corporation; www.waters.com) were used for holding peptide solutions in the LC autosampler.

2.2. Human Embryonic Stem Cells. The human embryonic stem cell (hESC) line SIVF001 was established under NH&MRC license #309703 following informed voluntary consent. The cell line, analyzed as karyotypically normal (2n = 46, XX), has been maintained up to 90 passages in culture and has been shown capable of forming in vivo teratomas containing all three germ lineages (data not shown). The line was initially established on mitotically inactivated human fetal fibroblasts before being transferred to Matrigel (Becton Dickinson; www.bd.com), where it was cultured “feeder free” using media conditioned by human fetal fibroblasts. The line was harvested at passage 43 for proteomic analysis. Cells were gently scraped from the tissue culture flask and washed three times in PBS to minimize carryover of proteins from fetal bovine serum. Following the final wash, the PBS was aspirated and the cells were snap frozen in liquid nitrogen and stored at -80 °C.
In parallel to growing cells in flasks, cultures were set up on Matrigel-coated 96-well optical bottom microtiter plates, which were stained at harvest for markers of hESC pluripotency [Nanog (nuclear) and Tra-1-60 (cell surface)] using immunofluorescence and an IN Cell Analyzer 1000 (GE Healthcare) high content analysis system to determine the percentage of cells stained positive for both protein markers. Total cell number was determined with a Nucleocounter (Chemometec; www.chemometec.com).

2.3. Protein Estimations. The amount of protein present in samples throughout the extraction procedure was determined at three points: (i) after initial cell lysis, (ii) following the initial low speed centrifugation step and (iii) following buffer exchange and the second ultracentrifugation step in ammonium bicarbonate. Protein was estimated using 20 µL of sample and a Pierce BCA Protein Assay Kit (Thermo Scientific; www.piercenet.com), following the manufacturer's protocol. A Fluostar Optima microplate reader (BMG Labtech; www.bmglabtech.com) was used for colorimetric detection and protein estimation. Results for protein estimation are shown in Figure 1.

2.4. Cell Lysis, Membrane Enrichment and Buffer Exchange. A hypotonic lysis buffer (10 mM HEPES, 150 mM NaCl, 1 mM EDTA; pH 7.4) was prepared and stored at 4 °C. Prior to use, 10 mL was filtered through a 0.22 µm filter. Cells frozen at -80 °C in a 15 mL centrifuge tube were thawed to 4 °C on ice. Filtered lysis buffer (1.5 mL) containing protease inhibitors was added to the cells and gently pipetted up and down, before returning to ice for 10 min. Cells were sonicated on ice using a probe sonicator (Branson Sonifier 450; 10 bursts; return to ice for 3 min; repeat 2-3 times) until the lysis buffer appeared clear (Protein estimation 1; Figure 1). Lysed cells were centrifuged at 1500 rpm for 10 min at 4 °C (Protein estimation 2; Figure 1). The supernatant was transferred to a 15 mL centrifuge tube and cold 0.1 M Na2CO3 (pH 11.0) was added to a final volume of 5 mL. The centrifuge tube was gently mixed for 60 min at 4 °C. The mixture was transferred to a 6 mL polycarbonate ultracentrifuge tube (Sorvall) and spun at 120 000g for 60 min at 4 °C. The membrane enriched pellet was gently washed with 400 µL of cold 20 mM NH4HCO3 (pH 7.8), which was then aspirated leaving the pellet in place. One milliliter of cold 20 mM NH4HCO3 was added and the pellet resuspended, before adding an additional 4.0 mL of 20 mM NH4HCO3 and pipetting gently. Following ultracentrifugation (as described above), the supernatant was gently discarded so as not to dislodge the pellet. Two hundred microliters of fresh 20 mM NH4HCO3 was added and the pellet was resuspended and then transferred to a 1.5 mL microcentrifuge tube on ice. The ultracentrifuge tube was washed with an additional 120 µL of 20 mM NH4HCO3, which was also transferred to the microcentrifuge tube. The solution was sonicated (10 bursts; return to ice for 3 min; repeat 1-2 times) until no clumps were apparent (Protein estimation 3; Figure 1).

Figure 1. “Work flow” diagram of the proteomic analysis. Protein estimates were conducted during front end processing of ∼1 × 10⁷ cells of the hESC line SIVF001 using a Pierce BCA Protein Assay Kit. (a) Conducted after initial cell lysis and sonication. (b) Conducted on supernatant after initial low speed centrifugation of lysed cells. (c) Conducted after buffer exchange and ultracentrifuge wash in 20 mM NH4HCO3 (pH 7.8). Importantly, 95.3% of the starting protein amount was retained after the initial low speed centrifugation and 48 307 peptides were recorded in the nonredundant list from 260 µg of protein in the membrane enriched pellet. Twenty microliters of sample was taken at each time point for analysis.

2.5. Reduction, Alkylation and Digestion of Membrane Enriched Proteins. After sonication, 40 µL of 100 mM DTT in 20 mM NH4HCO3 (pH 7.8) was added to the solution, which was vortexed briefly and incubated for 60 min at 37 °C. Fresh 200 mM IAA was prepared in 20 mM NH4HCO3. Forty microliters was added, and the solution was vortexed gently and incubated in the dark for 30 min. (The preparation can be stored at -80 °C after alkylation; thaw the preparation to room temperature before proceeding with tryptic digestion.) Six hundred microliters of 100% LC grade methanol was added to the solution,9 which was vortexed briefly. Ten microliters (3 µg) of MS grade trypsin was added, and the solution was vortexed briefly and sonicated in a water bath for 20 min. Following incubation at 37 °C for 2 h, an additional 10 µL (3 µg) of trypsin was added, the solution was vortexed and the incubation continued at 37 °C for 5 h.
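The volumes given in Sections 2.4 and 2.5 can be checked against the 60% (v/v) methanol digestion condition cited in the Introduction. The short Python sketch below is only a back-of-the-envelope check: it assumes that the 20 µL aliquot removed for protein estimation and the two 10 µL trypsin additions can be neglected, and it uses the 260 µg protein amount reported for the membrane enriched pellet. The function and variable names are illustrative and do not come from the original protocol.

```python
# Volumes (µL) and stock concentrations quoted in Sections 2.4 and 2.5.
# Assumption: the 20 µL protein-estimation aliquot and the 2 x 10 µL of
# trypsin are ignored, so all results are approximate.

def final_conc_mM(stock_mM, added_uL, volume_before_uL):
    """Concentration of a reagent just after it is added to an existing volume."""
    return stock_mM * added_uL / (volume_before_uL + added_uL)

pellet_suspension_uL = 200 + 120          # resuspension plus tube rinse in NH4HCO3
dtt_mM = final_conc_mM(100, 40, pellet_suspension_uL)        # 40 µL of 100 mM DTT
iaa_mM = final_conc_mM(200, 40, pellet_suspension_uL + 40)   # 40 µL of 200 mM IAA

aqueous_uL = pellet_suspension_uL + 40 + 40   # total aqueous volume before methanol
methanol_uL = 600
methanol_fraction = methanol_uL / (methanol_uL + aqueous_uL)

protein_ug = 260        # membrane enriched protein reported in Figure 1
trypsin_ug = 3 + 3      # two additions of 3 µg
protein_per_trypsin = protein_ug / trypsin_ug

print(f"DTT at reduction:  {dtt_mM:.1f} mM")
print(f"IAA at alkylation: {iaa_mM:.1f} mM")
print(f"Methanol:          {methanol_fraction:.0%} (v/v)")
print(f"Protein:trypsin:   {protein_per_trypsin:.1f}:1 (w/w)")
```

Under these assumptions the aqueous volume is 400 µL, so adding 600 µL of methanol gives the 60% (v/v) condition, and the total enzyme load corresponds to roughly one part trypsin to 43 parts protein by weight.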

Table 1. Conductivity Tableᵃ

| reagent/protein preparation | conductivity (µS/cm) |
|---|---|
| 0.1 M Na2CO3 (pH 11.0) | 12 700 |
| 20 mM NH4HCO3 (pH 7.8) | 1670 |
| 60% MeOH/NH4HCO3 | 340 |
| 8 M urea (prepared fresh and filtered) | 9 |
| Protein preparation following reduction and alkylation | 3500 |
| Preparation following reduction, alkylation and tryptic digestion | 670 |
| Peptide preparation following suspension in 8 M urea | 420 |

ᵃ The conductivity of some of the reagents used in the procedure and the conductivity of some of the protein/peptide preparations during the procedure are shown.
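The Table 1 readings can be compared with the 300-400 µS/cm window that Section 2.6 gives as optimal for focusing. The Python sketch below simply tabulates that comparison; the dictionary keys are shortened labels chosen here for illustration, and treating the window as a strict inclusive range is an assumption made for this check rather than a rule stated by the authors.

```python
# Conductivity readings (µS/cm) transcribed from Table 1.
OPTIMAL_RANGE = (300, 400)   # window quoted in Section 2.6

table1 = {
    "0.1 M Na2CO3 (pH 11.0)": 12700,
    "20 mM NH4HCO3 (pH 7.8)": 1670,
    "60% MeOH/NH4HCO3": 340,
    "8 M urea": 9,
    "protein prep after reduction/alkylation": 3500,
    "prep after reduction/alkylation/digestion": 670,
    "peptide prep in 8 M urea": 420,
}

def within(value, low, high):
    """True when a reading lies inside the inclusive window."""
    return low <= value <= high

in_range = [name for name, value in table1.items()
            if within(value, *OPTIMAL_RANGE)]

for name, value in table1.items():
    ratio = value / OPTIMAL_RANGE[1]   # how far the reading sits relative to the upper limit
    flag = "within window" if name in in_range else f"{ratio:.2f}x upper limit"
    print(f"{name:45s} {value:6d}  {flag}")
```

Read this way, only the 60% MeOH/NH4HCO3 reagent falls inside the window, and the final peptide preparation in 8 M urea (420 µS/cm) sits just above it, far closer than the carbonate or bicarbonate buffers.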

The solution was vortexed briefly each hour during incubation and stored at 4 °C overnight.

2.6. Conductivity. The presence of high salt concentrations results in poor peptide focusing on IPG strips and in failure of the program on the IPGphor II to accumulate 100 kV hours, with possible scorching of the IPG strip. Conductivity within the range 300-400 µS/cm is required for optimal focusing. During the procedure, conductivity was measured using a Twin Cond conductivity meter (Horiba; www.horiba.com); results are shown in Table 1.

2.7. Peptide Focusing on Immobilized pH Gradient, Peptide Retrieval and Desalting. Immobilized pH gradient (IPG) strips were rehydrated for a minimum of 6 h before focusing. Tryptic digested proteins were dried down in a vacuum centrifuge, with care taken not to dry to completeness. Two hundred and fifty microliters of freshly prepared and filtered 8 M urea was added and the peptides were resuspended by pipetting up and down. One microliter of concentrated bromophenol blue in 8 M urea was added and the solution vortexed briefly. IPG strips (pH 3-10; 18 cm) stored at -80 °C were thawed to room temperature before the gel side of the strip was overlaid onto the peptide solution spotted into the well of a rehydration tray. The tray was covered and left for 6 h for complete rehydration. Focusing was conducted overnight using an IPGphor II following the manufacturer's recommendations and employing a program previously reported9 to a maximum of 100 kV hours. Following focusing, the peptides were recovered by cutting the strip into 24 equal sized pieces and eluting the peptides from each piece (three times) into 100 µL of freshly prepared 0.1% formic acid. The eluates from each piece were pooled and desalted using Omix C18 pipet tips following the manufacturer's recommended protocol, after which the peptides were dried down in a vacuum centrifuge, ensuring that overdrying did not occur.
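To make the fractionation geometry concrete, the sketch below computes the length of each strip piece and an approximate pH window per fraction. It assumes a linear pH 3-10 gradient along the 18 cm strip and numbers the fractions from the acidic end; neither detail is stated in the text, so the pH windows are indicative only. All names are hypothetical.

```python
# Strip geometry from Section 2.7: 18 cm, pH 3-10, cut into 24 equal pieces.
# Assumption: the gradient is linear and fraction 1 is at the acidic end.
STRIP_LENGTH_CM = 18.0
PH_MIN, PH_MAX = 3.0, 10.0
N_FRACTIONS = 24

piece_length_cm = STRIP_LENGTH_CM / N_FRACTIONS

def fraction_ph_range(index):
    """Approximate (low, high) pH covered by fraction `index` (1-based)."""
    if not 1 <= index <= N_FRACTIONS:
        raise ValueError(f"fraction index must be 1..{N_FRACTIONS}")
    width = (PH_MAX - PH_MIN) / N_FRACTIONS
    return PH_MIN + (index - 1) * width, PH_MIN + index * width

print(f"Each piece is {piece_length_cm} cm long")
for i in (1, 12, 24):
    low, high = fraction_ph_range(i)
    print(f"fraction {i:2d}: pH {low:.2f} to {high:.2f}")
```

Under the linearity assumption each 0.75 cm piece spans a little under 0.3 pH units, which gives a sense of the first-dimension resolution delivered to each LC-MS/MS run.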
Ten microliters of fresh 0.1% formic acid was added to each tube, and the peptides were resuspended by pipetting and transferred into 12 × 32 mm glass screw neck vials (Waters Corporation). Vials can be stored at -80 °C prior to LC-MS/MS.

2.8. Reversed Phase LC-Tandem MS. Each of the 24 fractions was analyzed by reversed phase nano LC-MS/MS as previously described9 with the following modifications: an LTQ linear ion-trap mass spectrometer (Thermo Finnigan; www.thermo.com) was used. Samples from each fraction were separated over 90 min gradients using a Tempo nanoLC system (Applied Biosystems; www.appliedbiosystems.com). Ten microliters of fraction sample was injected onto a peptide Captrap (Michrom; www.michrom.com) for preconcentration and desalting with 0.1% formic acid, 2.5% MeCN, at 5 µL/min. The peptide trap was then switched on line with the analytical reversed phase column. Peptides were eluted from the column using 10-35% of a buffer [95% (v/v) ACN, 0.1% (v/v) formic acid] for 58 min, 35-95% of the buffer for 5 min, a hold at 95% for 5 min, 95-5% for 10 min and a hold at 5% for 12 min, with a flow rate of 700 nL/min across the gradient. The column eluate was directed into a nanospray ionization source of the mass spectrometer. A 1.8 kV electrospray voltage was applied via a liquid junction upstream of the column. Spectra were scanned over the range 400-1500 amu. Automated peak recognition, dynamic exclusion, and tandem MS of the top six most intense precursor ions at 40% normalized collision energy were performed using Xcalibur software (Thermo Finnigan).

Figure 2. Numbers of proteins with transmembrane helices predicted by the trans membrane hidden Markov model (TMHMM) algorithm. The predicted number of proteins with transmembrane helices is 898, representing 31.49% of the nonredundant protein list. Proteins with up to 16 transmembrane helices were identified.

2.9. Data Analysis. Raw files from each of the 24 fractions were converted to mzXML format and processed sequentially through the Global Proteome Machine (GPM) software (www.thegpm.org).28,29 A merged, nonredundant output file was generated for protein identifications with log(e) values less than -1. Peptide identification was determined using a 0.4 Da fragment ion tolerance. Carbamidomethyl was considered as a complete modification, and partial modifications were also considered, which included oxidation of methionine and threonine and deamidation of asparagine and glutamine. MS/MS spectra were searched against the Homo sapiens database (derived from SwissProt, Ensembl and NCBI), and reverse database searches were used in the estimation of false discovery rates.30 Protein annotation for the identified proteins was downloaded using the BioMart (http://www.ensembl.org/info/data/biomart.html) data mining tool provided by Ensembl, database version release 52.31 External gene ontology (GO)32 information for cellular component categories was downloaded from the same source and matched to the local list of identified proteins.
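The acceptance criteria described here (a log(e) cutoff of -1, and the requirement of two or more peptides used for the reported protein counts) together with a reverse-database false discovery estimate can be sketched in a few lines of Python. This is not the GPM implementation: the record layout, the accession strings and the toy values are hypothetical, and the FDR is computed with one common convention (reverse hits divided by forward hits), which may differ from the exact formula behind the reported 0.32%.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ProteinHit:
    accession: str      # hypothetical identifier, not a real database entry
    log_e: float        # log10 of the expectation value; more negative is better
    n_peptides: int     # number of matched peptides
    is_reverse: bool    # True when the match is to a reversed (decoy) sequence

def passes_filters(hit, max_log_e=-1.0, min_peptides=2):
    """Keep hits with log(e) below the cutoff and enough peptide matches."""
    return hit.log_e < max_log_e and hit.n_peptides >= min_peptides

def decoy_fdr(hits):
    """Estimate FDR as reverse hits / forward hits (one common convention)."""
    forward = sum(1 for h in hits if not h.is_reverse)
    reverse = sum(1 for h in hits if h.is_reverse)
    return reverse / forward if forward else 0.0

# Toy data for illustration only; these are not results from the study.
toy_hits = [
    ProteinHit("FWD_1", -35.2, 12, False),
    ProteinHit("FWD_2", -8.4, 3, False),
    ProteinHit("FWD_3", -1.6, 1, False),   # single peptide: removed
    ProteinHit("FWD_4", -0.7, 2, False),   # log(e) above cutoff: removed
    ProteinHit("REV_1", -1.2, 2, True),    # decoy hit that survives filtering
]

accepted = [h for h in toy_hits if passes_filters(h)]
fdr = decoy_fdr(accepted)
print([h.accession for h in accepted], f"FDR = {fdr:.2f}")
```

With so few toy records the resulting rate is meaninglessly high; the point is only to show where the two thresholds and the decoy count enter the calculation.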
The main focus of extracting the GO annotation was to identify membrane proteins and membrane protein subcategories. For those proteins for which GO information was not available, the FASTA sequences were extracted and subcellular localization was predicted using the pTARGET33 prediction server (http://bioapps.rit.albany.edu/pTARGET/). In addition, proteins with transmembrane segments were predicted using the Trans Membrane Hidden Markov Model (TMHMM: http://www.cbs.dtu.dk/services/TMHMM2.0/).34 The available GO information for the identified proteins was summarized and submitted to the web gene ontology annotation plotting tool WEGO35 (http://wego.genomics.org.cn/cgi-bin/wego/index.pl) to generate summaries for cellular component ontology, and higher level detail for the membrane subcategories present in the sample. These summaries rely specifically on the GO identifiers and their relationship, and do not include the predicted subcellular localization.

Table 2. List of Proteins Identified with ≥10 Transmembrane Segments Derived by the TMHMM Algorithmᵃ

| predicted no. of helices | brief description | Swiss-Prot accession no. |
|---|---|---|
| 16 | Multidrug resistance-associated protein 1 | P33527 |
| 15 | NADH-ubiquinone oxidoreductase chain 5 | P03915 |
| 15 | Zinc transporter 5 | Q8TAD4 |
| 14 | E3 ubiquitin-protein ligase MARCH6 | O60337 |
| 14 | Cationic amino acid transporter 3 | Q8WY07 |
| 14 | Sodium-dependent multivitamin transporter | Q9Y289 |
| 14 | Transmembrane protein 15 | Q9UPQ8 |
| 14 | Transmembrane protein C9orf5 | Q9H330 |
| 14 | GPI ethanolamine phosphate transferase 3 | Q8TEQ8 |
| 14 | High affinity cationic amino acid transporter 1 | P30825 |
| 14 | GPI ethanolamine phosphate transferase 1 | O95427 |
| 13 | Solute carrier family 15 member 4 | Q8N697 |
| 13 | Multidrug resistance-associated protein 7 | Q5T3U5 |
| 12 | Protein unc-93 homologue B1 | Q9H1C4 |
| 12 | Sodium/hydrogen exchanger 1 | P19634 |
| 12 | NAD(P) transhydrogenase, mitochondrial Precursor | Q13423 |
| 12 | Niemann-Pick C1 protein Precursor | O15118 |
| 12 | Cystine/glutamate transporter | Q9UPY5 |
| 12 | Solute carrier family 12 member 4 | Q9UP95 |
| 12 | Sodium/hydrogen exchanger 7 | Q96T83 |
| 12 | Solute carrier family 12 member 2 | P55011 |
| 12 | Sodium- and chloride-dependent taurine transporter | P31641 |
| 12 | Cytochrome c oxidase subunit 1 | P00395 |
| 12 | Synaptic vesicle glycoprotein 2A | Q7L0J3 |
| 12 | Sodium/hydrogen exchanger 6 | Q92581 |
| 12 | Monocarboxylate transporter 4 | O15427 |
| 12 | Two pore calcium channel protein 1 | Q9ULQ1 |
| 12 | Dolichyl-diphosphooligosaccharide-protein glycosyltransferase subunit STT3A | P46977 |
| 12 | Solute carrier family 2, facilitated glucose transporter member 1 | P11166 |
| 11 | Uncharacterized protein C20orf54 Precursor | Q9NQ40 |
| 11 | Serine incorporator 3 | Q13530 |
| 11 | Large neutral amino acids transporter small subunit 1 | Q01650 |
| 11 | Chloride channel protein 7 | P51798 |
| 11 | Major facilitator superfamily domain-containing protein 1 | Q9H3U5 |
| 11 | Solute carrier family 12 member 7 | Q9Y666 |
| 11 | Alpha-1,2-glucosyltransferase ALG10-A | Q5BKT4 |
| 11 | Anion exchange protein 3 | P48751 |
| 11 | Two pore calcium channel protein 2 | Q8NHX9 |
| 11 | Sodium bicarbonate cotransporter 3 | Q9Y6M7 |
| 11 | Probable cation-transporting ATPase 13A2 | Q9NQ11 |
| 11 | Dolichyl-P-Man:Man(7)GlcNAc(2)-PP-dolichyl-alpha-1,6-mannosyltransferase | Q9BV10 |
| 11 | Sodium bicarbonate cotransporter 3 | Q9Y6M7 |
| 11 | Protein dpy-19 homologue 3 | Q6ZPD9 |
| 11 | Glucose-6-phosphate translocase | O43826 |
| 11 | Adenylate cyclase type 6 | O43306 |
| 11 | Electroneutral sodium bicarbonate exchanger 1 | Q2Y0W8 |
| 11 | Acetyl-coenzyme A transporter 1 | O00400 |
| 11 | Chloride channel protein 3 | P51790 |
| 11 | Transmembrane protein 63A | O94886 |
| 11 | Monocarboxylate transporter 1 | P53985 |
| 11 | Sodium-coupled neutral amino acid transporter 5 | Q8WUX1 |
| 11 | Multidrug resistance-associated protein 4 | O15439 |
| 11 | Choline transporter-like protein 2 | Q8IWA5 |
| 11 | Equilibrative nucleoside transporter 1 | Q99808 |
| 11 | Putative sodium-coupled neutral amino acid transporter 9 | Q8NBW4 |
| 10 | Solute carrier family 2, facilitated glucose transporter member 3 | P11169 |
| 10 | Protein transport protein Sec61 subunit alpha isoform 1 | P61619 |
| 10 | Probable cation-transporting ATPase 13A3 | Q9H7F0 |
| 10 | Protein O-mannosyl-transferase 2 | Q9UKY4 |
| 10 | Solute carrier family 12 member 9 | Q9BXP2 |
| 10 | Sodium/potassium-transporting ATPase subunit alpha-1 Precursor | P05023 |
| 10 | Dolichyl-diphosphooligosaccharide-protein glycosyltransferase subunit STT3B | Q8TCJ2 |
| 10 | Protein dpy-19 homologue 1 | Q2PZI1 |
| 10 | Major facilitator superfamily domain-containing protein 10 | Q14728 |
| 10 | Transmembrane protein C2orf18 Precursor | Q8N357 |
| 10 | UDP-N-acetylglucosamine--dolichyl-phosphate N-acetylglucosaminephosphotransferase | Q9H3H5 |
| 10 | NADH-ubiquinone oxidoreductase chain 4 | P03905 |
| 10 | Solute carrier family 41 member 1 | Q8IVJ1 |
| 10 | Transmembrane channel-like protein 6 | Q7Z403 |
| 10 | Transmembrane 9 superfamily member 4 Precursor | Q92544 |
| 10 | Ethanolaminephosphotransferase 1 | Q9C0D9 |
| 10 | Dolichyl pyrophosphate Man9GlcNAc2 alpha-1,3-glucosyltransferase | Q9Y672 |

ᵃ Included in this list are two GPI-anchored proteins (GPI ethanolamine phosphate transferases 1 and 3). These “extrinsic” membrane associated proteins are not commonly recovered in membrane preparations.

3. Results and Discussion

At time of cell harvest, 87 ± 4% of the SIVF001 hESC line were positive for Nanog and 88 ± 2% were positive for Tra-1-60, with a total of 1.35 × 10⁷ cells used for proteomic analysis. BCA protein estimates conducted at three time points during the front end procedure are shown in Figure 1. Importantly, 95.3% (4.12 mg) of the initial protein amount was retained after lysis and the first centrifugation. Following membrane stripping by sodium carbonate,36,37 ultracentrifugation, buffer exchange and reultracentrifugation, 260 µg of protein from the membrane enriched fraction was digested with trypsin and loaded onto the IPG strip. Sodium carbonate (pH 11.0) was chosen because the basic pH deprotonates proteins, minimizes the formation of micelles and initiates linearization of proteins. Ammonium bicarbonate (pH 7.8) was chosen for the buffer exchange and membrane washing because it is a volatile salt that is suitable for reduction, alkylation and tryptic digestion and, in the presence of 60% methanol, has the low conductivity required for optimizing peptide focusing on IPG strips (see Table 1). Additionally, peptide IPG-IEF has previously been reported to be a high-capacity and high resolution analytical tool which provides the fractionation required for the first dimension of “shotgun” proteomics analysis.9 The nonredundant list generated by GPM contained 48 307 peptides representing 2851 proteins with log(e) values less than -1 and a false discovery rate of 0.32%. Of these, 2293 proteins (80.43%) were identified from ≥2 peptides. The list was annotated as described in Section 2.9 (Data Analysis). It is salient to bear in mind that the procedure enriches for membrane bound proteins potentially derived from many subcellular organelles such as mitochondria, ribosomes and any remnant nuclei not precipitated in the first centrifugation step. The procedure does not result in a composition homogeneous for membrane bound proteins, with nonmembranous proteins also being present. The effectiveness of the protocol in improving coverage of membrane proteomes was analyzed. Of the identified proteins, 898 (31.49%) were predicted to have transmembrane segments via the TMHMM algorithm.
A histogram of the generated data is presented in Figure 2 and shows that proteins with up to 16 transmembrane helices were retrieved using the extraction procedure employed and peptide IPG-IEF. Those proteins with ≥10 transmembrane segments are listed in Table 2, with the full list available as supplementary data. In addition, one should remember that there are other categories of membrane protein not represented in this analysis. For example, extrinsic proteins that are peripheral to but do not interact with the hydrophobic core of the phospholipid bilayer are absent here, as are other lipid-anchored (e.g., farnesylated, prenylated and guanylated) membrane proteins. While GPI-anchored proteins would also not be expected, 10 were identified in the analysis; 2 of these (GPI ethanolamine phosphate transferases 1 and 3) appear in Table 2. To predict cellular components, summarized gene ontology (GO) terms derived via pTARGET were submitted to WEGO, as described in Section 2.9 (Data Analysis). The cellular components in Table 3 show that 1279 proteins were predicted to be membrane proteins, representing 44.9% of the nonredundant protein list. This analysis also indicated that 395 of the identified proteins mapped to the plasma membrane. These results compare favorably with or exceed those recently reported from other embryonic stem cell lines. Table 5 highlights two key points: there has been a marked increase in the number of identified membrane proteins resulting from lower numbers of cells, and less manipulation during the “front-end” procedure results in increased identifications. Prokhorova et al.24 reported 811 membrane proteins from an unreported number of SILAC labeled HUES-9 hESC, while Dormeyer et al.23 reported 1077 membrane proteins from 5 × 10⁵ HUES-7 hESC.
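The yield figures quoted in this section can be cross-checked with a few lines of arithmetic. The Python sketch below derives quantities that are not stated explicitly in the text (starting protein amount, fraction of protein reaching the membrane enriched pellet, protein per million cells, peptides per protein) from the numbers that are. The derived values are therefore estimates computed here, not results reported by the authors, and the variable names are illustrative.

```python
# Values quoted in the Results text and Figure 1.
cells = 1.35e7                 # cells used for proteomic analysis
retained_mg = 4.12             # protein retained after low speed centrifugation
retained_fraction = 0.953      # stated as 95.3% of the initial amount
pellet_ug = 260                # protein in the membrane enriched pellet
total_proteins = 2851          # nonredundant protein list
two_plus_peptides = 2293       # proteins identified from two or more peptides
peptides = 48307               # peptides in the nonredundant list

# Derived quantities (estimates, not stated in the source).
starting_mg = retained_mg / retained_fraction
pellet_pct = 100 * (pellet_ug / 1000) / starting_mg
ug_per_million_cells = pellet_ug / (cells / 1e6)
pct_two_plus = 100 * two_plus_peptides / total_proteins
peptides_per_protein = peptides / total_proteins

print(f"Estimated starting protein: {starting_mg:.2f} mg")
print(f"Pellet as share of starting protein: {pellet_pct:.1f}%")
print(f"Pellet protein per 10^6 cells: {ug_per_million_cells:.1f} µg")
print(f"Proteins with >=2 peptides: {pct_two_plus:.2f}%")
print(f"Mean peptides per protein: {peptides_per_protein:.1f}")
```

The recomputed 80.43% agrees with the value given in the text, and the pellet works out to roughly 6% of the estimated starting protein.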

Table 3. Cellular Component Analysis of SIVF001 hESCᵃ

| no. proteins | percentage | GO term | cellular component |
|---|---|---|---|
| 1279 | 44.9 | GO:0016020 | membrane |
| 395 | 13.9 | GO:0005886 | plasma membrane |
| 42 | 1.5 | GO:0019867 | outer membrane |
| 552 | 19.4 | GO:0031090 | organelle membrane |
| 700 | 24.5 | GO:0005634 | nucleus |
| 205 | 7.2 | GO:0042175 | nuclear envelope-ER network |
| 2 | 0.1 | GO:0042734 | presynaptic membrane |
| 1034 | 36.3 | GO:0044425 | membrane part |
| 2 | 0.1 | GO:0045211 | postsynaptic membrane |
| 25 | 0.9 | GO:0048475 | coated membrane |

ᵃ The cellular component list was generated via WEGO based on GO terms. This list reflects the efficiency of the technique in retrieving membrane proteins (44.9%) and allows assignment of 395 proteins to the plasma bilipid membrane via GO mapping. The percentages do not total 100 as proteins can map with multiple GO terms.
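The footnote's point that the percentages need not total 100 can be illustrated by recomputing them from the protein counts. The sketch below assumes that every percentage in Table 3 uses the 2851-protein nonredundant list as its denominator, which the table does not state explicitly; the row tuples are transcribed from the table and the names are illustrative.

```python
TOTAL_PROTEINS = 2851   # assumed denominator: the nonredundant protein list

# (count, reported percentage, cellular component) transcribed from Table 3.
table3 = [
    (1279, 44.9, "membrane"),
    (395, 13.9, "plasma membrane"),
    (42, 1.5, "outer membrane"),
    (552, 19.4, "organelle membrane"),
    (700, 24.5, "nucleus"),
    (205, 7.2, "nuclear envelope-ER network"),
    (2, 0.1, "presynaptic membrane"),
    (1034, 36.3, "membrane part"),
    (2, 0.1, "postsynaptic membrane"),
    (25, 0.9, "coated membrane"),
]

# Largest difference between a reported percentage and the recomputed one.
max_gap = max(abs(100 * count / TOTAL_PROTEINS - reported)
              for count, reported, _ in table3)
reported_sum = sum(reported for _, reported, _ in table3)

for count, reported, name in table3:
    print(f"{name:30s} {100 * count / TOTAL_PROTEINS:5.1f} (reported {reported})")
print(f"Largest gap: {max_gap:.2f} percentage points")
print(f"Sum of reported percentages: {reported_sum:.1f}")
```

Under that assumption every reported value lies within 0.1 percentage point of the recomputed one, and the column sums to well over 100 because a single protein can carry several cellular component terms (for example both "membrane" and "plasma membrane").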

Table 4. Cellular Component Analysis of Three HCT116 Cell Linesᵃ

| no. proteins | percentage | GO term | cellular component |
|---|---|---|---|
| 1069:1043:1664 | 47.5:49.8:41.5 | GO:0016020 | membrane |
| 346:350:498 | 15.4:16.7:12.4 | GO:0005886 | plasma membrane |
| 43:42:58 | 1.4:2.0:1.9 | GO:0019867 | outer membrane |
| 478:459:683 | 21.2:21.9:17.0 | GO:0031090 | organelle membrane |
| 174:161:252 | 7.7:7.7:6.3 | GO:0042175 | nuclear envelope-ER network |
| 1:1:2 | 0.0:0.0:0.0 | GO:0042734 | presynaptic membrane |
| 869:839:1326 | 38.6:40.1:33.1 | GO:0044425 | membrane part |
| 1:1:4 | 0.0:0.0:0.1 | GO:0045211 | postsynaptic membrane |
| 19:18:31 | 0.8:0.9:0.8 | GO:0048475 | coated membrane |

ᵃ The tumorigenic cell line HCT116 was used for analysis during the development of the method. Approximately 10⁷ cells were used for each analysis. The higher protein numbers derived from the last analysis indicate improvement in the efficiency of the protocol in retrieving proteins. Importantly, when compared with Table 3 generated from the hESC line, there is a high consistency in the percentage of proteins belonging to each “Cellular Component” category, reflecting reproducibility of the protocol. The cellular component list was generated via WEGO based on GO terms. The percentages do not total 100 as proteins can map with multiple GO terms.

These authors also applied cellular component GO mapping and reported 237 specific plasma membrane proteins from that particular cell line. While none of these authors reported on the amount of isolated protein used for membrane protein analysis, it is relevant to bear in mind that the current protocol utilized only 260 µg of protein for the generation of 1,279 membrane proteins. Additionally, our current findings exceed those from a recent analysis of the membrane proteome of metastatic cancer cells which employed stable isotope labeling and isolation via sucrose density centrifugation. From a nonreported starting number of cells, a list of 1919 proteins was identified, containing 622 membrane proteins.38 Our nonredundant protein list was further searched for an arbitrarily derived list of 63 proteins commonly reported to be associated with hESC lines. Fifty-eight (92.1%) of these were found including seventy-two CD antigen markers, both HLA Class I and II antigens, fibroblast growth factors (including FGF2), pluripotency markers SSEA, Tra-1, Lin 28, Nanog and Rex. The three nuclear associated proteins (Lin 28, Nanog and Rex) are from a group of 700 nuclear proteins annotated by GO (Table 3). Nestin, described as an early marker of ectodermal differentiation,39 was also identified. An average of 89% of the SIVF001 cells were found by immunofluorescence to be positive

Journal of Proteome Research • Vol. 8, No. 12, 2009 5647

technical notes

McQuade et al.

Table 5. Comparison of Recent Membrane Proteomic Analyses of Embryonic Stem Cells^a

Authors                  No. of cells analyzed  Technique                                                           LC-MS/MS                                      No. of membrane proteins identified
Nunomura, K et al.27     4 × 10^9               Biotin/sucrose density gradient/avidin affinity chromatography/SCX  LC-Q-Tof-2 (170 min gradient)                 324
Dormeyer, W et al.23     5 × 10^5               Sucrose density gradient/SCX                                        LC-LTQ-Orbitrap (120 min gradient)            1077 (237 plasma membrane)
Prokhorova, TA et al.24  Not stated             SILAC/sucrose/sodium carbonate                                      LC-LTQ-Orbitrap (gradient time not reported)  752
Current                  1 × 10^7               Sodium carbonate/peptide IPG-IEF                                    LC-LTQ-Linear ion trap (90 min gradient)      1279 (395 plasma membrane)

a This table shows that over the period 2005 to current, there has been an increase in the numbers of membrane proteins identified from embryonic stem cells and that the numbers of cells required for this increased identification has been decreasing. While the improvement in identification can in part be attributed to advances in mass spectrometry technology and bioinformatic analyses, alteration to LC gradient times and lessening the amount of handling in the front end of the procedure appear to favor higher returns.

for the two pluripotency markers (Nanog and Tra-1-60) at time of cell harvest, suggesting that the remaining 11% may have commenced differentiation. Since the ectodermal lineage is the first to derive from pluripotent cells and is routinely characterized by positive nestin antibody staining, the reporting of this protein in the nonredundant list is not surprising. Results obtained from analyses of the three HCT116 lines indicate strong reproducibility of the technique employed. An average of 2640 ± 766 proteins were identified, with an average of 2085 ± 617 proteins identified from ≥2 peptides. When the data derived from the three cell lines was annotated via WEGO to predict cellular components (Table 4) the percentage of proteins for each GO category shows high concordance between the three analyses. This concordance is maintained between the cellular component categories from HCT116 and those from the hESC line.
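The concordance described above can be checked directly from the tabulated percentages. The following is a minimal sketch, not the authors' analysis: it uses only the percentages printed in Tables 3 and 4, omits the nucleus row (which appears only in Table 3), and the function names `summarize` and `largest_gap` are hypothetical.

```python
# Percentages (%) transcribed from Table 3 (hESC) and Table 4 (three HCT116
# analyses). The "nucleus" row is left out because Table 4 does not list it.
HESC = {
    "membrane": 44.9,
    "plasma membrane": 13.9,
    "outer membrane": 1.5,
    "organelle membrane": 19.4,
    "nuclear envelope-ER network": 7.2,
    "presynaptic membrane": 0.1,
    "membrane part": 36.3,
    "postsynaptic membrane": 0.1,
    "coated membrane": 0.9,
}
HCT116 = {
    "membrane": (47.5, 49.8, 41.5),
    "plasma membrane": (15.4, 16.7, 12.4),
    "outer membrane": (1.4, 2.0, 1.9),
    "organelle membrane": (21.2, 21.9, 17.0),
    "nuclear envelope-ER network": (7.7, 7.7, 6.3),
    "presynaptic membrane": (0.0, 0.0, 0.0),
    "membrane part": (38.6, 40.1, 33.1),
    "postsynaptic membrane": (0.0, 0.0, 0.1),
    "coated membrane": (0.8, 0.9, 0.8),
}


def summarize(hesc, hct116):
    """Return {term: (HCT116 mean, HCT116 range, hESC minus HCT116 mean)}."""
    summary = {}
    for term, runs in hct116.items():
        mean = sum(runs) / len(runs)
        spread = max(runs) - min(runs)
        summary[term] = (mean, spread, hesc[term] - mean)
    return summary


def largest_gap(summary):
    """Return the GO term whose hESC value differs most from the HCT116 mean."""
    return max(summary, key=lambda term: abs(summary[term][2]))


if __name__ == "__main__":
    result = summarize(HESC, HCT116)
    for term, (mean, spread, gap) in result.items():
        print(f"{term:30s} mean={mean:5.1f} range={spread:4.1f} hESC-mean={gap:+5.1f}")
    print("largest hESC vs HCT116 gap:", largest_gap(result))
```

On these values, every hESC percentage lies within about 1.5 percentage points of the corresponding HCT116 mean, with the largest gap at the top-level membrane term.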

4. Conclusions

This study has shown that modifications to reported preparation procedures for the enrichment of membranes, in conjunction with “shotgun” proteomic analysis by available technology, have resulted in improved membrane coverage from 260 µg of protein derived from approximately 10 million hESC. The coverage was enhanced by avoiding protein loss during cellular preparation, applying modified membrane stripping procedures, using a solvent during tryptic digestion, monitoring conductivity before peptide focusing, and increasing the time of the liquid chromatography gradient. The reproducibility seen in the development and final application of this technique will make it easily amenable to “discovery” proteomics. The first step for finding “plausible” protein biomarkers is dependent upon a global proteomic analysis of cells with variations in genotype/phenotype. This analysis will be enhanced by the ability to mine deeper into the proteome to detect the presence of less abundant proteins showing expression variation. The application of peptide IPG-IEF to this endeavor recently revealed that greater coverage of the membrane proteome is achievable by the use of IPG strips with various pH ranges.20,40 This highlights two of the advantages of IPG-IEF: that theoretical and observed peptide pI from IPG-IEF have a high concordance, and that IPG strips with restricted pH ranges can increase protein identifications by more than 50% over those identified from normal broad range (pH 3-10) strips. As a “front end” procedure IPG-IEF can be used for detecting and quantitating protein changes using current


methodologies such as “label-free”9 or iTRAQ40,41 and will enhance the discovery of new or novel membrane biomarkers of stem cells in key research areas such as lineage differentiation or disease. Application of recent biostatistical tools has enabled the data set to be analyzed in more detail, resulting in greater resolution of membrane subcategories, in particular the plasma membrane category.
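The comparison of theoretical and observed peptide pI noted in the conclusions presupposes a way to compute a theoretical pI from sequence. The sketch below illustrates one common approach: summing Henderson-Hasselbalch charges over the ionizable groups and finding the pH of zero net charge by bisection. It is an illustration under stated assumptions, not the method used in this study. The pKa values are round numbers chosen for illustration, and the names `net_charge`, `theoretical_pi` and `in_strip_range` are hypothetical.

```python
# Illustrative pKa values (assumption; the study does not state a pKa set).
PKA_POSITIVE = {"K": 10.8, "R": 12.5, "H": 6.5}             # basic side chains
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}    # acidic side chains
PKA_N_TERM = 8.6
PKA_C_TERM = 3.6


def net_charge(peptide, ph):
    """Net charge of a peptide at a given pH (Henderson-Hasselbalch)."""
    # Protonated fraction of a basic group: 1 / (1 + 10^(pH - pKa)).
    charge = 1.0 / (1.0 + 10 ** (ph - PKA_N_TERM))
    # Deprotonated fraction of an acidic group: 1 / (1 + 10^(pKa - pH)).
    charge -= 1.0 / (1.0 + 10 ** (PKA_C_TERM - ph))
    for residue in peptide:
        if residue in PKA_POSITIVE:
            charge += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[residue]))
        elif residue in PKA_NEGATIVE:
            charge -= 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[residue] - ph))
    return charge


def theoretical_pi(peptide, low=0.0, high=14.0, tol=1e-4):
    """pH at which the net charge is zero, found by bisection.

    Net charge falls monotonically as pH rises, so the sign of the charge
    at the midpoint tells us which half of the interval holds the pI.
    """
    while high - low > tol:
        mid = (low + high) / 2.0
        if net_charge(peptide, mid) > 0:
            low = mid
        else:
            high = mid
    return round((low + high) / 2.0, 2)


def in_strip_range(pi, ph_min, ph_max):
    """True if a peptide with this pI would focus within the strip's pH range."""
    return ph_min <= pi <= ph_max


if __name__ == "__main__":
    for sequence in ("AAAA", "DEDE", "KRKR"):
        pi = theoretical_pi(sequence)
        print(sequence, pi, "in pH 3-10 strip:", in_strip_range(pi, 3.0, 10.0))
```

A peptide whose observed focusing fraction lies far from its computed pI could then be flagged for closer inspection. How large a deviation matters depends on the pKa set and the strip used, neither of which is fixed here.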

Acknowledgment. We thank Iveta Slapetova for input and technical assistance in culturing of the HCT116 cell line and modifications to the protocol and Dylan Xavier (Scientific Officer, APAF) for technical involvement in the running of the nano LC-MS/MS.

Supporting Information Available: A table of nonredundant proteins containing sum of raw spectrum intensities, number of peptides found, pI, identifications and descriptions. The complete table of the proteins predicted to have transmembrane helices by the transmembrane hidden Markov model (TMHMM). A table of the arbitrarily derived proteins found that are commonly reported from human embryonic stem cell lines including both gene and protein identifiers and normalized spectral abundance frequencies (NSAF). This material is available free of charge via the Internet at http://pubs.acs.org.

References

(1) Baharvand, H.; Fathi, A.; van Hoof, D.; Salekdeh, G. H. Concise review: trends in stem cell proteomics. Stem Cells 2007, 25 (8), 1888–903.
(2) Krijgsveld, J.; Whetton, A. D.; Lee, B.; Lemischka, I.; Oh, S.; Pera, M.; Mummery, C.; Heck, A. J. Proteome biology of stem cells: a new joint HUPO and ISSCR initiative. Mol. Cell. Proteomics 2008, 7 (1), 204–5.
(3) Ahram, M.; Litou, Z. I.; Fang, R.; Al-Tawallbeh, G. Estimation of membrane proteins in the human proteome. In Silico Biol. 2006, 6 (5), 379–86.
(4) Hopkins, A. L.; Groom, C. R. The druggable genome. Nat. Rev. Drug Discovery 2002, 1 (9), 727–30.
(5) Marmagne, A.; Salvi, D.; Rolland, N.; Ephritikhine, G.; Joyard, J.; Barbier-Brygoo, H. Purification and fractionation of membranes for proteomic analyses. Methods Mol. Biol. 2006, 323, 403–20.
(6) Wu, C. C.; Yates, J. R. The application of mass spectrometry to membrane proteomics. Nat. Biotechnol. 2003, 21 (3), 262–7.
(7) Russell, W. K.; Park, Z. Y.; Russell, D. H. Proteolysis in mixed organic-aqueous solvent systems: applications for peptide mass mapping using mass spectrometry. Anal. Chem. 2001, 73 (11), 2682–5.
(8) Blonder, J.; Goshe, M. B.; Moore, R. J.; Pasa-Tolic, L.; Masselon, C. D.; Lipton, M. S.; Smith, R. D. Enrichment of integral membrane proteins for proteomic analysis using liquid chromatography-tandem mass spectrometry. J. Proteome Res. 2002, 1 (4), 351–60.
(9) Chick, J. M.; Haynes, P. A.; Molloy, M. P.; Bjellqvist, B.; Baker, M. S.; Len, A. C. Characterization of the Rat Liver Membrane Proteome Using Peptide Immobilized pH Gradient Isoelectric Focusing. J. Proteome Res. 2008, 7 (3), 1036–45.
(10) Ruth, M. C.; Old, W. M.; Emrick, M. A.; Meyer-Arendt, K.; Aveline-Wolf, L. D.; Pierce, K. G.; Mendoza, A. M.; Sevinsky, J. R.; Hamady, M.; Knight, R. D.; Resing, K. A.; Ahn, N. G. Analysis of membrane proteins from human chronic myelogenous leukemia cells: comparison of extraction methods for multidimensional LC-MS/MS. J. Proteome Res. 2006, 5 (3), 709–19.
(11) Andon, N. L.; Hollingworth, S.; Koller, A.; Greenland, A. J.; Yates, J. R., 3rd; Haynes, P. A. Proteomic characterization of wheat amyloplasts using identification of proteins by tandem mass spectrometry. Proteomics 2002, 2 (9), 1156–68.
(12) Blonder, J.; Rodriguez-Galan, M. C.; Chan, K. C.; Lucas, D. A.; Yu, L. R.; Conrads, T. P.; Issaq, H. J.; Young, H. A.; Veenstra, T. D. Analysis of murine natural killer cell microsomal proteins using two-dimensional liquid chromatography coupled to tandem electrospray ionization mass spectrometry. J. Proteome Res. 2004, 3 (4), 862–70.
(13) Blonder, J.; Chan, K. C.; Issaq, H. J.; Veenstra, T. D. Identification of membrane proteins from mammalian cell/tissue using methanol-facilitated solubilization and tryptic digestion coupled with 2D-LC-MS/MS. Nat. Protoc. 2006, 1 (6), 2784–90.
(14) Blonder, J.; Goshe, M. B.; Xiao, W.; Camp, D. G., 2nd; Wingerd, M.; Davis, R. W.; Smith, R. D. Global analysis of the membrane subproteome of Pseudomonas aeruginosa using liquid chromatography-tandem mass spectrometry. J. Proteome Res. 2004, 3 (3), 434–44.
(15) Blonder, J.; Terunuma, A.; Conrads, T. P.; Chan, K. C.; Yee, C.; Lucas, D. A.; Schaefer, C. F.; Yu, L. R.; Issaq, H. J.; Veenstra, T. D.; Vogel, J. C. A proteomic characterization of the plasma membrane of human epidermis by high-throughput mass spectrometry. J. Invest. Dermatol. 2004, 123 (4), 691–9.
(16) Wang, H.; Qian, W. J.; Chin, M. H.; Petyuk, V. A.; Barry, R. C.; Liu, T.; Gritsenko, M. A.; Mottaz, H. M.; Moore, R. J.; Camp, D. G., 2nd; Khan, A. H.; Smith, D. J.; Smith, R. D. Characterization of the mouse brain proteome using global proteomic analysis complemented with cysteinyl-peptide enrichment. J. Proteome Res. 2006, 5 (2), 361–9.
(17) Wu, C. C.; MacCoss, M. J.; Howell, K. E.; Yates, J. R. A method for the comprehensive proteomic analysis of membrane proteins. Nat. Biotechnol. 2003, 21 (5), 532–8.
(18) Wang, W.; Guo, T.; Rudnick, P. A.; Song, T.; Li, J.; Zhuang, Z.; Zheng, W.; Devoe, D. L.; Lee, C. S.; Balgley, B. M. Membrane proteome analysis of microdissected ovarian tumor tissues using capillary isoelectric focusing/reversed-phase liquid chromatography-tandem MS. Anal. Chem. 2007, 79 (3), 1002–9.
(19) Wang, W.; Guo, T.; Song, T.; Lee, C. S.; Balgley, B. M. Comprehensive yeast proteome analysis using a capillary isoelectric focusing-based multidimensional separation platform coupled with ESI-MS/MS. Proteomics 2007, 7 (8), 1178–87.
(20) Chick, J. M.; Haynes, P. A.; Bjellqvist, B.; Baker, M. S. A combination of immobilised pH gradients improves membrane proteomics. J. Proteome Res. 2008, 7 (11), 4974–81.
(21) Krijgsveld, J.; Gauci, S.; Dormeyer, W.; Heck, A. J. In-gel isoelectric focusing of peptides as a tool for improved protein identification. J. Proteome Res. 2006, 5 (7), 1721–30.
(22) Slebos, R. J.; Brock, J. W.; Winters, N. F.; Stuart, S. R.; Martinez, M. A.; Li, M.; Chambers, M. C.; Zimmerman, L. J.; Ham, A. J.; Tabb, D. L.; Liebler, D. C. Evaluation of Strong Cation Exchange versus Isoelectric Focusing of Peptides for Multidimensional Liquid Chromatography-Tandem Mass Spectrometry. J. Proteome Res. 2008, 7 (12), 5286–94.
(23) Dormeyer, W.; van Hoof, D.; Braam, S. R.; Heck, A. J.; Mummery, C. L.; Krijgsveld, J. Plasma Membrane Proteomics of Human Embryonic Stem Cells and Human Embryonal Carcinoma Cells. J. Proteome Res. 2008, 7 (7), 2936–51.
(24) Prokhorova, T. A.; Rigbolt, K. T.; Johansen, P. T.; Henningsen, J.; Kratchmarova, I.; Kassem, M.; Blagoev, B. SILAC-labeling and quantitative comparison of the membrane proteomes of self-renewing and differentiating human embryonic stem cells. Mol. Cell. Proteomics 2009.
(25) Hayman, M. W.; Przyborski, S. A. Proteomic identification of biomarkers expressed by human pluripotent stem cells. Biochem. Biophys. Res. Commun. 2004, 316 (3), 918–23.
(26) Nagano, K.; Yoshida, Y.; Isobe, T. Cell surface biomarkers of embryonic stem cells. Proteomics 2008, 8 (19), 4025–35.
(27) Nunomura, K.; Nagano, K.; Itagaki, C.; Taoka, M.; Okamura, N.; Yamauchi, Y.; Sugano, S.; Takahashi, N.; Izumi, T.; Isobe, T. Cell surface labeling and mass spectrometry reveal diversity of cell surface markers and signaling molecules expressed in undifferentiated mouse embryonic stem cells. Mol. Cell. Proteomics 2005, 4 (12), 1968–76.
(28) Craig, R.; Beavis, R. C. A method for reducing the time required to match protein sequences with tandem mass spectra. Rapid Commun. Mass Spectrom. 2003, 17 (20), 2310–6.
(29) Craig, R.; Beavis, R. C. TANDEM: matching proteins with tandem mass spectra. Bioinformatics 2004, 20 (9), 1466–7.
(30) Peng, J.; Elias, J. E.; Thoreen, C. C.; Licklider, L. J.; Gygi, S. P. Evaluation of multidimensional chromatography coupled with tandem mass spectrometry (LC/LC-MS/MS) for large-scale protein analysis: the yeast proteome. J. Proteome Res. 2003, 2 (1), 43–50.
(31) Flicek, P.; Aken, B. L.; Beal, K.; Ballester, B.; Caccamo, M.; Chen, Y.; Clarke, L.; Coates, G.; Cunningham, F.; Cutts, T.; Down, T.; Dyer, S. C.; Eyre, T.; Fitzgerald, S.; Fernandez-Banet, J.; Graf, S.; Haider, S.; Hammond, M.; Holland, R.; Howe, K. L.; Howe, K.; Johnson, N.; Jenkinson, A.; Kahari, A.; Keefe, D.; Kokocinski, F.; Kulesha, E.; Lawson, D.; Longden, I.; Megy, K.; Meidl, P.; Overduin, B.; Parker, A.; Pritchard, B.; Prlic, A.; Rice, S.; Rios, D.; Schuster, M.; Sealy, I.; Slater, G.; Smedley, D.; Spudich, G.; Trevanion, S.; Vilella, A. J.; Vogel, J.; White, S.; Wood, M.; Birney, E.; Cox, T.; Curwen, V.; Durbin, R.; Fernandez-Suarez, X. M.; Herrero, J.; Hubbard, T. J.; Kasprzyk, A.; Proctor, G.; Smith, J.; Ureta-Vidal, A.; Searle, S. Ensembl 2008. Nucleic Acids Res. 2008, 36 (Database issue), D707–14.
(32) Ashburner, M.; Ball, C. A.; Blake, J. A.; Botstein, D.; Butler, H.; Cherry, J. M.; Davis, A. P.; Dolinski, K.; Dwight, S. S.; Eppig, J. T.; Harris, M. A.; Hill, D. P.; Issel-Tarver, L.; Kasarskis, A.; Lewis, S.; Matese, J. C.; Richardson, J. E.; Ringwald, M.; Rubin, G. M.; Sherlock, G. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat. Genet. 2000, 25 (1), 25–9.
(33) Guda, C.; Subramaniam, S. pTARGET [corrected] a new method for predicting protein subcellular localization in eukaryotes. Bioinformatics 2005, 21 (21), 3963–9.
(34) Krogh, A.; Larsson, B.; von Heijne, G.; Sonnhammer, E. L. Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J. Mol. Biol. 2001, 305 (3), 567–80.
(35) Ye, J.; Fang, L.; Zheng, H.; Zhang, Y.; Chen, J.; Zhang, Z.; Wang, J.; Li, S.; Li, R.; Bolund, L.; Wang, J. WEGO: a web tool for plotting GO annotations. Nucleic Acids Res. 2006, 34 (Web Server issue), W293–7.
(36) Fujiki, Y.; Hubbard, A. L.; Fowler, S.; Lazarow, P. B. Isolation of intracellular membranes by means of sodium carbonate treatment: application to endoplasmic reticulum. J. Cell Biol. 1982, 93 (1), 97–102.
(37) Molloy, M. P.; Herbert, B. R.; Slade, M. B.; Rabilloud, T.; Nouwens, A. S.; Williams, K. L.; Gooley, A. A. Proteomic analysis of the Escherichia coli outer membrane. Eur. J. Biochem. 2000, 267 (10), 2871–81.
(38) Lund, R.; Leth-Larsen, R.; Jensen, O. N.; Ditzel, H. J. Efficient isolation and quantitative proteomic analysis of cancer cell plasma membrane proteins for identification of metastasis-associated cell surface markers. J. Proteome Res. 2009, 8 (6), 3078–90.
(39) Khoo, M. L.; McQuade, L. R.; Smith, M. S.; Lees, J. G.; Sidhu, K. S.; Tuch, B. E. Growth and differentiation of embryoid bodies derived from human embryonic stem cells: effect of glucose and basic fibroblast growth factor. Biol. Reprod. 2005, 73 (6), 1147–56.
(40) Eriksson, H.; Lengqvist, J.; Hedlund, J.; Uhlen, K.; Orre, L. M.; Bjellqvist, B.; Persson, B.; Lehtio, J.; Jakobsson, P. J. Quantitative membrane proteomics applying narrow range peptide isoelectric focusing for studies of small cell lung cancer resistance mechanisms. Proteomics 2008, 8 (15), 3008–18.
(41) Lengqvist, J.; Uhlen, K.; Lehtio, J. iTRAQ compatibility of peptide immobilized pH gradient isoelectric focusing. Proteomics 2007, 7 (11), 1746–52.

PR900597S
