
Efficient Isolation and Quantitative Proteomic Analysis of Cancer Cell Plasma Membrane Proteins for Identification of Metastasis-Associated Cell Surface Markers

Rikke Lund,†,‡ Rikke Leth-Larsen,†,‡ Ole N. Jensen,§ and Henrik J. Ditzel*,‡,|

Medical Biotechnology Center, Institute of Medical Biology, University of Southern Denmark, Winsloewparken 25. 3, DK-5000 Odense C, Denmark, Department of Biochemistry and Molecular Biology, University of Southern Denmark, Campusvej 55, DK-5230 Odense M, Denmark, and Department of Oncology, Odense University Hospital, Soendre Boulevard 29, DK-5000 Odense C, Denmark

Received December 19, 2008

Cell surface membrane proteins are involved in central processes such as cell signaling, cell-cell interactions, and ion and solute transport, and they seem to play a pivotal role in several steps of the metastatic process of cancer cells. The low abundance and hydrophobic nature of cell surface membrane proteins complicate their purification and identification by MS. We used two isogenic cell lines with opposite metastatic capabilities in nude mice to optimize cell surface membrane protein purification and to identify potential novel markers of metastatic cancer. The cell surface membrane proteins were isolated by centrifugation/ultracentrifugation steps, followed by membrane separation using a Percoll/sucrose density gradient. The gradient fractions containing the cell surface membrane proteins were identified by enzymatic assays. Stable isotope labeling of the proteome of the metastatic cell line by SILAC followed by mass spectrometry analysis enabled identification and quantification of proteins that were differentially expressed in the two cell lines. Dual stable isotopic labels (13C-arginine and 13C-lysine) instead of a single label (13C-arginine) increased the percentage of proteins that could be quantified from 40 to 93%. Repeated LC-MS/MS analyses (3-4 times) of each sample increased the number of identified proteins by 60%. The use of Percoll/sucrose density separation allowed subfractionation of membranes, leading to enrichment of membrane proteins (66%) and reduction from 33% to only 16% of protein from other membranous organelles such as endoplasmic reticulum, Golgi, and mitochondria. In total, our optimized methods resulted in 1919 protein identifications (corresponding to 826 at similarity level 80% (SL80)); 1145 (509 at SL80) were identified by two or more peptides, of which 622 (300 at SL80) were membrane proteins. The quantitative proteomic analysis identified 16 cell surface proteins as potential markers of the ability of breast cancer cells to form distant metastases.

Keywords: plasma membrane proteins • cell surface proteins • SILAC • protein purification • marker identification

* To whom correspondence should be addressed. Telephone: +4565503781. Fax: +4565503922. E-mail: [email protected].
† These authors contributed equally to this work.
‡ Medical Biotechnology Center, Institute of Medical Biology, University of Southern Denmark.
§ Department of Biochemistry and Molecular Biology, University of Southern Denmark.
| Department of Oncology, Odense University Hospital.

Journal of Proteome Research 2009, 8, 3078–3090. Published on Web 04/02/2009. DOI: 10.1021/pr801091k. © 2009 American Chemical Society.

Introduction

Membranes have a critical role in cell structure by providing a physical barrier between the cell, the environment and the various subcellular compartments. The cell surface membrane, or plasma membrane, encloses the cell and maintains the essential boundaries between the cytosol and the extracellular environment. The cell surface membrane contains proteins mediating most functions of the membrane, acting as sensors for external signals, transporters of specific molecules and the connection point of the membrane to the cytoskeleton, the extracellular matrix and adjacent cells. The proteins constitute approximately 50% (by mass) of the cell surface membrane.1 Defining the membrane proteome, especially that limited to the cell surface membrane, is of great interest due to the fundamental role of membrane proteins. Moreover, profiling cell surface markers in specific cell types and at a specific differentiation or disease stage has great potential for identifying novel molecular markers and subsequent therapeutic targets that may be recognized by, for example, monoclonal antibodies and small biological molecules.2,3

Proteomic analysis usually consists of protein separation followed by protein identification, which can be performed either by two-dimensional gel electrophoresis followed by MALDI-TOF-MS or by LC-MS/MS. In the classical proteomic approach that combines two-dimensional gel electrophoresis and MALDI-TOF-MS, the proteins are separated according to isoelectric point and molecular mass prior to mass spectrometry. This technique is useful for protein quantification and detection of post-translational modifications4 but has limited use in identification of membrane proteins, including cell surface membrane proteins. Membrane proteins are hydrophobic and therefore only soluble in detergent-containing buffers, and they tend to precipitate at their isoelectric point in the gels. This group of proteins is furthermore lower in abundance, making subcellular fractionation and directed biochemical enrichments important. For LC-MS/MS, the proteins are digested prior to the reverse phase peptide separation and sprayed directly into the mass spectrometer, eliminating the problem of keeping hydrophobic proteins soluble.

Protein levels are of major importance when performing a proteomic analysis, as larger amounts of some proteins may hinder the detection of less abundant proteins, for example, cell surface membrane proteins. Fractionation of crude extracts increases the likelihood of detecting proteins of lower abundance. This can be accomplished by separating the mixture into subcellular fractions and organelles, or enriching from larger volumes by selective fractionation, immunoprecipitation, chromatographic or electrophoretic methods.2,4 Several strategies have been employed for enrichment of cell surface membranes.
In general, most membrane preparation procedures are initiated by a step where the cells or tissue are incubated in hypotonic buffer followed by mechanical homogenization and removal of nuclei and cell debris by centrifugation at low speeds.3,5-12 Alternative purification approaches include lysing cells directly in the cell culture flask and sequentially recovering the basolateral cell membranes.13 The membranes in the postnuclear supernatant can, due to different lipid-to-protein ratios in the different cellular membranes, be separated either in a discontinuous sucrose gradient7 or on a 35% sucrose cushion.6,12 Alternatively, a crude membrane fraction containing all membrane types can be obtained by sedimenting the membranes by ultracentrifugation.3,5,8,9,11 Further purification can be achieved by combining sedimentation with a discontinuous sucrose gradient10 or Percoll/sucrose density separation,5,11 both separating the membranes of different organelles according to their different densities.14

Since most disease-associated markers are not exclusively expressed in either the disease state or the “healthy state”, a procedure for quantification of potential protein expression differences needs to be included in marker identification strategies. One approach is stable isotope labeling by amino acids in cell culture (SILAC), which can be combined with LC-MS/MS.15 In short, two cell populations are propagated under similar growth conditions with the exception that one medium contains one or several “heavy” essential amino acids, such as 13C-arginine or 13C-lysine,16,17 while the other medium contains normal (or “light”) amino acids. The incorporation of the heavy amino acid into a peptide leads to a distinct increase in mass compared to the peptide containing the normal amino acid, but no other changes in the chemical properties of the peptides, the biological function of the protein or the appearance of the cells.
After a low number of cell doublings, complete labeling is achieved even for proteins with no significant turnover, and the cells can be mixed in a 1:1 ratio and their (sub)-proteomes can be extracted, quantified and compared.18-20

The development of metastasis is a complex, multistep process involving cell-cell and cell-matrix adhesion, which is responsible for dissemination of cells from a primary tumor and interaction with tissue at the distant site, as well as degradation of extracellular matrix by proteinases and the initiation and maintenance of early growth at new sites.21,22 Many of these initial steps in the development of metastasis cannot be detected using patient material or simple in vitro assays. However, an in vivo model based on transfer of cell lines to mice may allow the study of the metastatic process and enable comparative molecular screening and functional evaluation of candidate metastasis-related genes and proteins.23 This model is based on the isogenic cell lines NM-2C5 and M-4A4, which were selected from a panel of 80 tumor subclones derived by serial dilution of the metastatic breast carcinoma cell line MDA-MB-435. These subclones were systematically tested for metastatic behavior in nude mice, and while the M-4A4 and NM-2C5 cell lines were found to be equally tumorigenic, M-4A4 cells formed metastases in the lung and lymph nodes, whereas NM-2C5 cells, although capable of spreading to the lungs, remained dormant and did not form metastases. Thus the model recapitulates the very early steps that allow cancer cells to start proliferation at distant sites, circumvents the problems of variance of genetic background between tissue samples, and abrogates the difficulties in identifying those cells in a tumor mass that are capable of metastasizing.23,24

In this study, we used the NM-2C5/M-4A4 cell system to optimize a protocol for efficient isolation of SILAC-labeled cell surface membrane proteins, thereby allowing identification of metastasis-associated surface markers by quantitative and comparative mass spectrometry.
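As a rough illustration of the SILAC principle outlined in the Introduction, the sketch below estimates the m/z spacing between the light and heavy forms of a peptide when each labeled residue carries six 13C atoms. The function name, the example peptide sequence, and the isotope mass constant are illustrative assumptions and are not taken from this study.

```python
# Mass difference between one 13C and one 12C atom, in daltons
# (standard isotope masses; assumed here, not quoted from the article).
C13_MINUS_C12 = 13.0033548 - 12.0


def silac_shift(peptide: str, charge: int, labeled=("R", "K"), carbons: int = 6) -> float:
    """Return the m/z spacing between the light and heavy peptide forms."""
    n_labeled = sum(peptide.count(aa) for aa in labeled)
    mass_shift = n_labeled * carbons * C13_MINUS_C12
    return mass_shift / charge


# Hypothetical tryptic peptide ending in lysine, observed as a 2+ ion.
peptide = "LVNELTEFAK"
print(silac_shift(peptide, 2))                  # dual label: K carries 13C6
print(silac_shift(peptide, 2, labeled=("R",)))  # arginine-only label: no shift
```

A peptide that contains lysine but no arginine shows no mass shift under arginine-only labeling, so it cannot contribute a heavy/light pair; this is one plausible reason why adding a second labeled amino acid raises the share of quantifiable proteins.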

Materials and Methods

Cell Culture and SILAC Reagents. The human breast carcinoma cell lines, NM-2C5 and M-4A4, kindly provided by Dr. David Tarin, UCSD Cancer Center, CA, were propagated in custom-made Dulbecco’s Modified Eagle’s Medium (DMEM) lacking L-arginine, L-lysine, and L-glutamine (JRH Bioscience, Lenexa, KS) supplemented with 28 mg/L L-arginine, 75 mg/L L-lysine, 580 mg/L L-glutamine (Sigma-Aldrich, St. Louis, MO), 10% triple 0.1 µm dialyzed FCS (HyClone, Logan, UT), and 1% penicillin/streptomycin (Invitrogen, Carlsbad, CA). The L-arginine concentration was reduced to a third of the concentration in normal DMEM to avoid arginine-to-proline conversion artifacts. The medium of the NM-2C5 cell line was supplemented with 12C6-L-arginine (Sigma-Aldrich) and 12C6-L-lysine (Sigma-Aldrich) for all strategies, whereas the medium of the M-4A4 cell line was supplemented with 13C6-L-arginine and 13C6-L-lysine (Cambridge Isotope Laboratories, Andover, MA) or either of the two 13C6-amino acids and one 12C6-amino acid. To ensure complete incorporation of the stable isotopes, the cells were grown for at least five passages, corresponding to more than nine doublings, in the custom-made media before purifying cell surface membrane proteins. The incorporation of stable isotopes was monitored by MS, as described below. The cells were maintained at 37 °C in a humidified atmosphere of 95% ambient air and 5% CO2 and passaged every second day at a ratio of 1:3. Proteome analysis was performed on cultures passaged no more than ten times from frozen stock vials designated passage 1 at the time of in vivo inoculation, thereby ensuring genetic stability.

Cell Surface Membrane Protein Purification. All steps were performed at 4 °C or on ice and all buffers contained Complete Protease Inhibitor (Roche Diagnostics, Mannheim, Germany).

Membrane Purification Procedure 1.
Cells were rinsed in PBS, removed from the flasks using a cell scraper and the two cell lines were counted and mixed in a 1:1 ratio (a total of ∼2 × 10^7 cells), followed by two rounds of washing in PBS. Cells were lysed by incubation in a hypotonic buffer (10 mM Tris-base, 1.5 mM MgCl2, 10 mM NaCl, pH 6.8) for 5 min followed by sedimentation by centrifugation (300 × g, 5 min), resuspended in gradient buffer (0.25 M Sucrose, 10 mM HEPES, 100 mM Succinic acid, 1 mM EDTA, 2 mM CaCl2, 2 mM MgCl2, pH 7.4) and homogenized by 50 strokes (1500 rev/min) using a motor-driven Potter homogenizer (B. Braun Biotech, Allentown, PA). The homogenate was centrifuged at 1,000 × g for 10 min and the supernatant, free of cell debris and nuclei, was collected. An aliquot (homogenate (H)-fraction) was collected for later analysis by enzymatic assays (as described below). Subsequently, the supernatant was centrifuged at 100 000 × g for 30 min in a M150GX microultracentrifuge (Sorvall, Asheville, NC) using a S55-S swing-out rotor (Sorvall), acceleration/deceleration: 9/7. The pellet, containing crude membranes, was resuspended in gradient buffer and sedimented twice using centrifugation conditions as described above. An aliquot (membrane (M)-fraction) was collected for later analysis.

Percoll/Sucrose Density Membrane Separation. The purified and washed membranes were resuspended in 2 mL gradient buffer by manual homogenization (5 strokes) and mixed with 1.9 mL Percoll (Amersham Biosciences, Uppsala, Sweden) containing 10% PBS and 0.19 mL 2.5 M sucrose in an Easy-Seal tube (polyallomer, 5 mL, Sorvall). The tube was filled with gradient buffer, capped and centrifuged at 120 000 × g for 15 min in a M150GX microultracentrifuge using a fixed-angle rotor S100-AT6 (acceleration/deceleration: 9/1). The gradient was fractionated from the top into ten fractions of 235 µL by displacing it from the bottom with 2 M sucrose.

Membrane Purification Procedure 2. Cells were harvested, mixed and incubated in a hypotonic buffer as described in membrane purification procedure 1.
The cells were resuspended in lysis buffer (255 mM Sucrose, 20 mM HEPES, 1 mM EDTA, pH 7.4) and homogenized by 50 strokes (1500 rev/min) using a motor-driven Potter homogenizer. The homogenate was centrifuged at 20 000 × g for 10 min, and the supernatant was collected. An aliquot (H-fraction) was collected for later analysis. The supernatant was centrifuged at 240 000 × g for 2 h in a M150GX microultracentrifuge using a S100-AT6 fixed-angle rotor (Sorvall), acceleration/deceleration: 9/7. The pellet containing crude membranes was resuspended in PBS.

Assays. The protein concentration and the activities of γ-glutamyl transpeptidase (GGTP, a cell surface marker) and succinate dehydrogenase (SDH, a mitochondrial marker) were determined in the individual fractions of the Percoll/sucrose density gradient.

Gamma-Glutamyl Transpeptidase-Activity. Aliquots of each fraction (50 µL) were transferred to a microtiter plate and mixed with 150 µL substrate solution (1 mM L-gamma-glutamyl-p-nitroanilide and 20 mM glycylglycine in 0.1 M Tris-HCl pH 7.6) and the relative concentration of 4-nitroaniline was measured 10 times at 405 nm over 15 min at 37 °C using a Victor3 Multilabel Plate Reader (PerkinElmer, Shelton, CT). The change in absorbance/min, that is, the relative GGTP-activity, was calculated on a linear segment of the measurements.

Succinate Dehydrogenase-Activity. Aliquots of each fraction (20 µL) were transferred to a microtiter plate and mixed with 80 µL Milli-Q H2O and 100 µL stock solution (0.1 M phosphate buffer (Na2HPO4 and KH2PO4), 0.1 M sodium succinate, 0.05 M sucrose, pH 7.4) containing 2 mg/mL p-iodonitrotetrazolium (Sigma-Aldrich). The relative concentration of INT formazan was measured 10 times at 490 nm over 5 min at 37 °C (Victor3 Multilabel Plate Reader) and the change in absorbance/min, that is, the relative SDH-activity, was calculated on a linear segment of the measurements.

Percoll Removal and Protein Concentration Measurement. To avoid polymeric background peaks from Percoll in the mass spectra and interference with the protein concentration measurement assay, Percoll was removed by centrifugation at 900 000 × g for 15 min at 4 °C in a M150GX microultracentrifuge using a fixed-angle rotor S150-AT (acceleration/deceleration: 9/7). The protein concentration was determined in triplicate by using a colorimetric, detergent-compatible, Lowry-based assay (DC protein assay, Bio-Rad, Hercules, CA) in accordance with the manufacturer’s protocol using a BSA preparation as standard (Pierce, Rockford, IL).

Enzymatic Digestion. The fractions enriched for cell surface proteins and with minimal mitochondrial contamination were either pooled or processed individually. As little as 5 µg digested protein was used per LC-MS/MS analysis, and each analysis was repeated up to four times per sample. The proteins were digested as described by Nielsen and colleagues11 with minor modifications. In brief, the cell surface membranes, containing integral and associated membrane proteins, were sedimented by centrifugation at 900 000 × g for 15 min at 4 °C and resuspended in a carbonate treatment buffer (100 mM Na2CO3, 10 mM DTT, pH 11.5). After 30 min of incubation at room temperature, the membranes were sedimented by centrifugation as described above and resuspended in wash/reduction buffer (0.2 M NaBr, 0.2 M KCl, 50 mM Tris-HCl, 10 mM DTT, pH 8.0). The incubation and sedimentation procedure was repeated once. The membranes were solubilized and incubated in a urea buffer containing 6 M Urea, 2 M Thiourea, 100 mM Tris-HCl, 10 mM DTT, pH 8.0 for 10 min followed by addition of IAA at a final concentration of 100 mM and incubation in the dark for 30 min at room temperature.
Subsequently, the proteins were digested by lysyl endopeptidase (Lys-C) for 3 h (2 µg enzyme/100 µg protein, Achromobacter lyticus, Wako Pure Chemicals, Osaka, Japan). The Lys-C digests were diluted 7-fold with digestion buffer (50 mM Tris-HCl, 1 mM CaCl2, pH 7.6) and further digested with trypsin (0.5 µg enzyme/100 µg protein, sequence grade, Promega, Madison, WI) overnight at 37 °C. Membranes and undigested proteins were sedimented by centrifugation as described above. The peptides present in the supernatant were recovered and vacuum-dried to a volume of 20-50 µL. The peptides were acidified by adding 1 µL 10% TFA.

Sample Desalting and Up Concentration. The tryptic peptides were desalted and concentrated on Empore C18 extraction disk (3M, St. Paul, MN) and reverse-phase POROS R3 (Perseptive Biosystems, Foster City, CA) packed in GELoader tips (Eppendorf, Hamburg, Germany) using a modified version of the method reported by Rappsilber and colleagues.25,26 In brief, extraction disks were placed in the GELoader tips serving as both frit and column material and the reverse-phase POROS R3 was loaded above, forming a larger column. The column material was washed in 70% ACN and equilibrated in 0.1% TFA prior to loading the peptides. The peptides were washed on the column with 0.1% TFA, eluted with 50% ACN and dried by vacuum centrifugation.

Liquid Chromatography-Tandem Mass Spectrometry. Samples were analyzed by nanoflow LC-MS/MS. Peptide separation was achieved by using an LC-Packings Ultimate 3000 nanoflow system (LC Packings, Amsterdam, The Netherlands). Peptides were loaded with a flow rate of 3 µL/min onto a custom-made 1 cm precolumn (75 µm inner diameter) of fused silica with kasil-frits retaining the Reprosil C18, 3.5 µm reversed-phase particles (Dr. Maisch GmbH, Ammerbuch-Entringen, Germany). Nanoflow reversed-phase HPLC was then performed with a flow of 0.1 µL/min through a custom-made 8 cm analytical column (50 µm inner diameter), packed with Reprosil C18, 3.5 µm reversed-phase particles. Peptides were eluted directly into the ESI source of a Q-TOF Micro tandem mass spectrometer (Waters/Micromass, Manchester, UK) using a stepped linear gradient (solvent A: 0.5% CH3COOH, solvent B: 80% ACN, 0.5% CH3COOH): 0-10% B for 5 min, 10-50% B for 85 min, and 50-100% B for 5 min. Mass- and charge-dependent collision energies were used for peptide fragmentation.

Biological and Technical Replicates. A total of 16 membrane preparations were included in this study. A single isotopic label strategy (strategy A) using 13C-arginine was performed along with membrane purification procedure 1 in three replicates (sample A1-3). The effect of minor variations of strategy A (see Results section) was analyzed in two unrepeated experiments (Aa and Ab). A dual isotopic label strategy (strategy B) using 13C-arginine and 13C-lysine was performed along with membrane purification procedure 1 in three replicates (sample B1-3). Repeated LC-MS/MS analysis (technical replicates, strategy C) was performed in combination with a dual isotopic label strategy and membrane purification procedure 1 in three replicates (sample C1-3). One sample was prepared as described for strategy C, but without the density membrane separation (Ca). Repeated LC-MS/MS analysis of the individual gradient fractions from the density gradient (strategy D) was performed along with a dual isotopic label strategy and membrane purification procedure 1 in two replicates (sample D1-2).
Membrane purification procedure 2 was repeated twice: once using a single isotopic label strategy and once using a dual isotopic label strategy (procedure 2.1 and 2.2). Nine samples were analyzed once by LC-MS/MS (strategy A, B and procedure 2.1), while two samples were analyzed twice by LC-MS/MS (strategy D), one sample was analyzed three times (sample C2) and four samples were analyzed four times (sample C1, C2, Ca and procedure 2.2).

Data Analysis, Quantification and Database Searching. The raw LC-MS/MS data obtained were processed by the MassLynx 4.0 software (Waters, Milford, MA) and retention times were extracted by the VEMS program ExRaw.exe. The processed data were analyzed by the freeware VEMS V3.207.27,28 Data from sequential LC-MS/MS analyses of the technical replicates were analyzed as one batch.

Protein Identification. The data were searched against all human proteins (n = 66 620) in the International Protein Index (IPI) version 3.23.29 Default VEMS-settings were used for identification. Computational methods, scoring function and statistical evaluation are described in detail by Matthiesen and colleagues.28 In brief, the cutoff score for accepting individual MS/MS spectra was set to 10 to initially include as many protein identifications as possible. The MS/MS spectra of all single-peptide-based protein identifications were manually inspected, and protein identifications based on poor MS/MS spectra (e.g., spectra with few matching b- and y-ions or assignment of both heavy and normal arginine and/or lysine in the same peptide) were excluded. The false discovery rate was calculated as the number of proteins identified using a reversed database (false positives) divided by the sum of proteins identified (true positives identified using the IPI and the false positives identified using IPI reversed).30 The false discovery rate was estimated to be 3.9% using a reversed IPI version 3.23 database generated with the Decoy Database Builder software.30 Mass accuracy of precursor and fragment ions was set to ±0.5 Da, and the mass accuracy of the peptides identified was generally below 50 ppm, with an average of approximately 30 ppm. Furthermore, lowering the mass accuracy did not influence the false discovery rate significantly (data not shown). The enzyme specificity setting was cleavage C-terminal to arginine and lysine with one missed cleavage allowed. Method-specific settings were (1) cysteine modification: Carbamidomethylation (CAM), and (2) variable modifications whole database: M_oxidation, R_6 × 13C and K_6 × 13C.

Quantification. Default VEMS-settings were used for quantification. The computational methods, scoring function and statistical evaluation used by this software have been described in detail by Matthiesen and colleagues.28 In brief, the cutoff score of the peptides used for quantification was set to 20, mass accuracy to ±0.3 Da and method-specific settings were “Labeled amino acids”: R_6 × 13C and K_6 × 13C. For a peptide pair to be quantified, at least one peptide had to meet the criteria of the cutoff score, while the other peptide should have a count above 5 (or meet the criteria of the cutoff score). Data points with counts lower than 5 were removed, while no outlier data points were removed. The VEMS peptide quantification values (QV, eqs 1 and 2) were based on calculation of quantification values for each set of the first three peaks in the light and heavy isotopic clusters (eq 1):

$$QV_{\mathrm{peptide}} = \frac{QV_{\text{1st set of peaks}} + QV_{\text{2nd set of peaks}} + QV_{\text{3rd set of peaks}}}{3} \qquad (1)$$

The maximum standard deviation of the peptide quantification value accepted for a protein quantification was 5. The VEMS quantification value for any given set of peaks in the isotopic cluster was calculated as the peak intensity (I, eq 2) of the heavy isotope divided by the total peak intensity of the heavy and the light isotope from multiple scans (the number of scans depending on the peak intensity, eq 2):

$$QV = \frac{I_{\mathrm{Heavy}}}{I_{\mathrm{Heavy}} + I_{\mathrm{Light}}} \qquad (2)$$
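Equations 1 and 2 can be sketched in a few lines of Python. This is a minimal illustration with made-up peak intensities; the function names are hypothetical and the code does not reproduce the VEMS implementation (for example, it ignores multi-scan summation, score cutoffs, and count filters).

```python
def qv(i_heavy: float, i_light: float) -> float:
    """Equation 2: heavy intensity as a fraction of heavy plus light intensity."""
    return i_heavy / (i_heavy + i_light)


def qv_peptide(heavy_peaks, light_peaks) -> float:
    """Equation 1: mean QV over the first three peak pairs of the isotopic clusters."""
    pairs = list(zip(heavy_peaks, light_peaks))[:3]
    if len(pairs) < 3:
        raise ValueError("three heavy/light peak pairs are required")
    return sum(qv(h, l) for h, l in pairs) / 3


def qv_from_ratio(heavy_over_light: float) -> float:
    """Convert a heavy/light abundance ratio into a quantification value."""
    return heavy_over_light / (1.0 + heavy_over_light)


# Made-up intensities for the first three peaks of each isotopic cluster.
heavy = [200.0, 120.0, 40.0]
light = [100.0, 60.0, 20.0]
print(qv_peptide(heavy, light))

# Ratios of 1:1, 2:1 and 1:2 (heavy:light) expressed as quantification values.
for ratio in (1.0, 2.0, 0.5):
    print(ratio, qv_from_ratio(ratio))
```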

This means that a protein expressed in equal amounts by both cell lines has a quantification value of 0.5 or 50%. For each sample, the quantification values of β-Actin (UniProt number P60709) and Na+/K+-ATPase (UniProt number P05023) were examined and the average quantification value was calculated. When the average quantification value deviated by more than ±6 from 50%, quantifications in the given data set were normalized to ensure comparable results between experiments. A quantification value-dependent normalization function, which is deduced in Supplementary Data 1 in the Supporting Information, was calculated for each sample to be normalized. A protein expressed in only NM-2C5 has a quantification value of 0%, and a protein expressed in only M-4A4 has a quantification value of 100%. A cutoff of a 2-fold difference in expression levels, corresponding to quantification values of 33% and 67% for NM-2C5 and M-4A4, respectively, was chosen to ensure that the markers identified had biological relevance. For marker identification, cutoff values of 40% for proteins with lower expression in M-4A4 compared to NM-2C5, and 60% for proteins with higher expression in M-4A4 compared to NM-2C5 were used to allow for a general margin of error since the true signal is often underestimated due to signals from background “noise” or saturation of the detector. For a protein to be considered differentially expressed, it should meet the following criteria: (1) the protein should be identified in two or more samples and be identified by two or more peptides in at least one of these samples, (2) protein quantifications should be based on two or more different peptides, and at least one protein quantification should be above 60% or below 40%. Proteins that did not consistently exhibit differential expression despite fulfilling the above criteria were excluded. All statistical analyses were performed using Intercooled Stata version 8.2 (http://www.stata.com). Differences in protein expression were analyzed with a t-test to determine whether the mean quantification value was different from 50%.

Figure 1. Overview of cell surface membrane purifications based on membrane purification procedures 1 (strategies A-D) and 2. Initially, a single isotope label was tested (strategy A and once using procedure 2), but only a low quantification percentage was obtained and an additional isotope label was introduced (dual isotope labeling, strategies B-D and once using procedure 2). To increase the number of proteins identified, the samples were analyzed repeatedly by LC-MS/MS (C-D and once using procedure 2). To analyze the distribution of the proteins from different cellular compartments in the density gradient, the gradient fractions were analyzed individually (strategy D). Membrane purification procedures 1 and 2 are described in Figure 2.

ProteinCenter. The protein lists generated by VEMS were transferred to ProteinCenter (Professional Edition, version 1.1.2-1.3.4, Proxeon, Odense, Denmark). To compare the different methods, the proteins were sorted in ProteinCenter according to cellular localization (membrane and ER, Golgi, and mitochondria (EGM), based on gene ontologies) and descriptive statistics were obtained. To reduce redundancy, the identified proteins were clustered into groups according to similarity level 95%, 80%, and 65% (SL95, SL80, and SL65, respectively), but all proteins matching the data equally well are reported in Supplementary Data 2 and 3 in the Supporting Information. For marker identification, the proteins were sorted according to the number of samples they were identified in and to the quantification.

Validation of Isotope Incorporation. The incorporation of stable isotopes in M-4A4 using a dual labeling strategy was monitored from passage three through eight. Unlabeled cells were used to check the amount of background noise to be expected. An aliquot of 5 × 10^6 to 10^7 cells was washed twice in ice-cold PBS containing Complete Protease Inhibitor and lysed on ice by homogenization: 20 strokes (1500 rev/min) using a motor-driven Potter homogenizer. The homogenate was sedimented by centrifugation at 900 000 × g for 15 min at 4 °C and the pellet was resuspended in a wash/reduction buffer. After 30 min incubation at 50 °C, the sedimentation was repeated. The pellet was resuspended and incubated for 10 min at room temperature in urea buffer followed by addition of IAA at a final concentration of 100 mM and incubation in the dark for 60 min. The sedimentation was repeated and the pellet was resuspended in digestion buffer with trypsin (0.5 µg trypsin/100 µg protein) and incubated overnight at 37 °C. Membranes and undigested proteins were sedimented by centrifugation, and the peptides were desalted, concentrated, and analyzed by LC-MS/MS, as described above, with the exception that the 10-50% organic solvent gradient was only of 30 min duration. Proteins were identified and quantified by the VEMS 3.0 software.
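The incorporation check described above can be reasoned about with a simple dilution model. The sketch below assumes that pre-existing light protein is diluted only by cell division (no protein turnover and no residual light amino acid in the medium), so the light fraction halves with each doubling. Both the model and the function names are assumptions made for illustration and are not the authors' procedure.

```python
def expected_incorporation(doublings: float) -> float:
    """Heavy fraction expected after a number of doublings under pure dilution."""
    return 1.0 - 0.5 ** doublings


def observed_incorporation(i_heavy: float, i_light: float) -> float:
    """Heavy fraction measured from one peptide's heavy and light peak intensities."""
    return i_heavy / (i_heavy + i_light)


# Expected heavy fraction after a few hypothetical numbers of doublings.
for n in (3, 5, 9):
    print(n, f"{expected_incorporation(n):.4f}")

# Made-up intensities for a single peptide from a labeled, unmixed culture.
print(f"{observed_incorporation(9950.0, 50.0):.4f}")
```

Under this simplified model, nine doublings would leave roughly 0.2% of the original light protein, which is consistent with treating the label as essentially complete at that point.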

Results Several optimization steps were evaluated to develop a method allowing isolation of relatively pure membrane preparations for subsequent identification and quantification of large panels of cell surface membrane proteins by LC-MS/MS. The quantifications, based on SILAC, were used for comparative analysis of the expression levels of proteins in the two isogenic breast cancer cell lines NM-2C5 and M-4A4. As outlined in Figure 1, five different variables for optimization of cell surface membrane protein purification and marker

Cell Surface Protein Isolation and Identification

research articles

Figure 3. Representative example (sample C2 from the repeated LC-MS/MS analysis study) of the result from the γ-glutamyl transpeptidase (GGTP) and succinate dehydrogenase (SDH) assays analyzing the homogenate (H)-fraction, the membrane (M)-fraction and density gradient fractions. Fractions 1-5 exhibited high relative GGTP activity (dark gray), whereas fractions 6-10 exhibited high relative SDH activity (light gray). Fractions 1-5 were therefore selected for further processing. Figure 2. Membrane purification procedures 1 and 2. Both procedures include cell lysis using mechanical homogenization combined with a hypotonic buffer and removal of the nuclei and cell debris by centrifugation. Procedure 1 enables efficient separation of membranes from different cellular compartments in a Percoll/sucrose density gradient, whereas procedure 2 produces a crude membrane extract.

identification were evaluated, and each strategy was tested in its entirety multiple times. The four strategies A to D included SILAC, membrane purification procedure 1, membrane separation, enzymatic digestion and LC-MS/MS analysis. An alternative strategy based on membrane purification procedure 2 was also evaluated. The individual steps of membrane purification procedures 1 and 2 are outlined in Figure 2. All LC-MS/MS data were processed in VEMS. Supporting data, including accession numbers, standard deviation and normalized quantification values, all identified and quantified peptides, sequence coverage, precursor m/z observed, precursor charge observed, scores and E-values for all proteins and peptides, are provided in the Supplementary Data 2 and 3 files (see Supporting Information). Assigned and manually verified spectra for all single-peptide-based protein identifications are provided in Supplementary Data 4 in the Supporting Information. The protein identifications were annotated, and gene ontologies and descriptive statistics were calculated using the ProteinCenter software. The data obtained were processed in two ways: to optimize the cell surface membrane protein purification steps, the strategies were analyzed and compared, whereas the data from the different strategies were analyzed together to identify a panel of novel markers preferentially expressed in one of the two cell lines and thus potentially associated with the ability of cancer cells to metastasize.

Stable Isotope Labeling by 13C6-Arg. Initially, a single isotopic label strategy (strategy A) using 13C-arginine was performed along with membrane purification procedure 1. The fractions with the highest content of cell surface proteins and lowest content of mitochondrial proteins were identified by enzymatic assays for γ-glutamyl transpeptidase (GGTP, a cell surface marker) and succinate dehydrogenase (SDH, a mitochondrial marker) (Figure 3). These fractions were pooled and the membrane-associated structural proteins were removed by carbonate treatment prior to enzymatic digestion. The peptides were analyzed once by LC-MS/MS and the data obtained were processed in VEMS and ProteinCenter. Using this strategy, 534 proteins (corresponding to 200 at similarity level 80% (SL80)) were identified (n = 248 on average per sample, n = 92 at SL80), but only 42% were membrane proteins while 19% were proteins from ER, Golgi and mitochondria (EGM, Table 1). Moreover, only 40% of the proteins could be quantified. For comparison, the pool of fractions with a high content of mitochondrial proteins (fractions 7-8) was also analyzed (Supplementary Table 1, Aa, Supporting Information). The pool contained a lower total number of proteins (n = 193, n = 73 at SL80) and, as expected, a higher percentage of proteins from EGM (29%). Furthermore, the effect of omitting the carbonate treatment, which removes structural proteins and proteins only weakly associated with the membrane, was tested (Supplementary Table 1, Ab, Supporting Information), resulting in a lower number of identified proteins (n = 119, n = 41 at SL80).

Double Isotopic Labeling Using 13C6-Arg and 13C6-Lys. Dual labeling using 13C-arginine and 13C-lysine was introduced in strategy B in an attempt to increase the percentage of quantified proteins. Although only 329 proteins (n = 106 at SL80; n = 138 on average per sample) were identified, 93% of these proteins could be quantified (Table 1). Furthermore, we found that 61% of the identified proteins were membrane proteins, while only 20% were proteins from EGM.

Repeated LC-MS/MS Analyses. In an attempt to increase the number of identified proteins, each sample was analyzed 3-4 times by LC-MS/MS in strategy C.
A higher total number of proteins was identified than in strategy B (n = 525 vs n = 329, an increase of 60%; n = 181 vs n = 106 at SL80), and 84% of the identified proteins could be quantified (Table 1). Furthermore, 66% of these proteins were membrane proteins, while only 16% were proteins from EGM. For comparison, the effect of omitting the Percoll/sucrose density gradient was tested (Supplementary Table 1, Ca, Supporting Information) and resulted in identification of a high number of proteins (n

Journal of Proteome Research • Vol. 8, No. 6, 2009 3083

Lund et al.

Table 1. Descriptive Statistics of the LC-MS/MS Data Obtained from the Membrane Purification Procedure 1 Based Strategies

| | A, average (std)^a | A, total^a | B, average (std) | B, total | C, average (std) | C, total | D^b, average (±dev.) | D^b, total | A-D, total |
|---|---|---|---|---|---|---|---|---|---|
| Identified proteins^c | 247.7 (153.3) | 534 | 138.0 (101.7) | 329 | 255.3 (141.6) | 525 | 331 (±71.5) | 510 | 958 |
| Proteins quantified^d | 91.3 (75.9) | 216 | 130 (91.2) | 306 | 226.0 (99.8) | 439 | 310 (±68) | 479 | 709 |
| Proteins quantified (%)^d | 36.9 (12.8) | 40.4 | 95.5 (4.5) | 93.0 | 92.1 (9.8) | 83.6 | 93.4 (±0.4) | 93.9 | 74.0 |
| Membrane proteins^e | 102.7 (78.7) | 225 | 83.0 (47.1) | 202 | 166.0 (75.3) | 348 | 189.0 (±60.0) | 288 | 512 |
| Membrane proteins (%)^e | 37.0 (11.1) | 42.1 | 66.5 (13.0) | 61.4 | 67.5 (6.6) | 66.3 | 55.7 (±6.1) | 56.5 | 53.5 |
| Proteins from EGM^e | 40.0 (39.5) | 102 | 27.0 (24.3) | 64 | 41.0 (27.0) | 84 | 61.5 (±17.5) | 93 | 154 |
| Proteins from EGM (%)^e | 16.2 (25.8) | 19.1 | 19.6 (23.9) | 19.5 | 16.1 (19.1) | 16.0 | 18.6 (±1.1) | 18.2 | 16.1 |
| Proteins, M-4A4 > NM-2C5^f | 11.0 (14.7) | 33 | 6.7 (5.9) | 20 | 18.0 (16.1) | 49 | 22 (±10) | 41 | 113 |
| Proteins, M-4A4 > NM-2C5 (%)^f | 3.5 (3.2) | 6.2 | 8.3 (10.8) | 6.1 | 6.0 (3.4) | 9.3 | 7.6 (±4.7) | 8.0 | 11.8 |
| Proteins, M-4A4 < NM-2C5^f | 15.0 (14.0) | 35 | 9.7 (9.9) | 22 | 12.3 (9.3) | 27 | 69.5 (±37.5) | 116 | 148 |
| Proteins, M-4A4 < NM-2C5 (%)^f | 5.7 (2.3) | 6.6 | 7.4 (4.6) | 6.7 | 4.7 (1.7) | 5.1 | 19.4 (±7.1) | 22.7 | 15.5 |
| SL95^g | 129 (92.2) | 289 | 65.7 (57.0) | 158 | 119.7 (96.0) | 269 | 168.5 (±24.5) | 257 | 528 |
| SL95, % of identified proteins | 48.7 (8.3) | 54.1 | 48.2 (14.1) | 48.0 | 42.9 (11.0) | 51.2 | 51.6 (±3.7) | 50.4 | 55.1 |
| SL80^g | 92.3 (66) | 200 | 44.3 (44.8) | 106 | 78.0 (75.2) | 181 | 109.5 (±16.5) | 161 | 346 |
| SL80, % of identified proteins | 35.2 (5.1) | 37.5 | 30.8 (12.1) | 32.2 | 26.2 (11.8) | 34.5 | 33.5 (±2.3) | 31.6 | 36.1 |
| SL65^g | 81.3 (61.5) | 172 | 36.7 (40.2) | 87 | 60.3 (64.2) | 145 | 92 (±12) | 139 | 287 |
| SL65, % of identified proteins | 30.1 (6.7) | 32.2 | 25.8 (14.0) | 26.4 | 19.5 (11.3) | 27.6 | 28.3 (±3.7) | 27.3 | 30.0 |

a For each strategy tested by multiple independent samples, the total values (per strategy), the average values (per sample) and the standard deviation (strategies with three independent samples) or the deviation (strategies with two independent samples) are included. b Data from strategy D are based on two membrane purifications, whereas the data from strategies A-C are based on three purifications. c Number of proteins identified by VEMS, including all the different entries identified from the IPI-database; for example, two different isoforms of one protein were included as two different IPI entries even though these might be identified by the same peptides. d Percentage of quantified proteins is calculated from the number of identified proteins. e Number of membrane proteins and proteins from membranous organelles: endoplasmatic reticulum, Golgi and mitochondria (EGM). f Number (and percentage) of proteins with expression levels significantly higher/lower in M-4A4 vs NM-2C5. g Redundancy of the identified proteins is reduced by clustering at similarity level (SL) SL95, SL80, and SL65. These levels give an indication of how many different proteins (not IPI entries) were identified.
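The percentages quoted in the text can be checked against the totals in Table 1. The following is a minimal sketch in Python, not part of the original analysis (which was carried out in VEMS and ProteinCenter); the dictionary layout and the helper name `percent` are illustrative assumptions.

```python
# Totals per strategy copied from Table 1: (identified proteins, quantified proteins).
totals = {
    "A": (534, 216),
    "B": (329, 306),
    "C": (525, 439),
    "D": (510, 479),
    "A-D": (958, 709),
}


def percent(part, whole):
    """Return part as a percentage of whole (hypothetical helper)."""
    return 100.0 * part / whole


# Percentage of identified proteins that could be quantified, per strategy.
percent_quantified = {
    strategy: round(percent(quantified, identified), 1)
    for strategy, (identified, quantified) in totals.items()
}

# Relative gain in identifications from repeated LC-MS/MS runs (strategy C vs B).
gain_c_vs_b = percent(totals["C"][0] - totals["B"][0], totals["B"][0])

for strategy, value in percent_quantified.items():
    print(f"{strategy}: {value}% quantified")
print(f"C vs B: {gain_c_vs_b:.0f}% more identified proteins")
```

With these inputs the quantified percentages agree with the "Proteins quantified (%)" totals in Table 1, and the gain from strategy B to strategy C rounds to the 60% stated in the text.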

= 765, n = 363 at SL80), more than half of which were membrane proteins (61%), but the percentage of proteins from EGM (34%) was also high.

Separate Analysis of the Gradient Fractions. Strategy D evaluated whether analyzing each fraction from the gradient separation individually, instead of the pooled fractions, would further increase the number of identified proteins and the percentage of membrane proteins. Approximately the same overall number of proteins was identified in strategies D and C (n = 510 vs n = 525; n = 161 vs n = 181 at SL80), but whereas the 525 proteins identified using strategy C were obtained by analysis of three samples, a similar number of proteins was identified using strategy D from analysis of only two samples. An average of 331 proteins (n = 110 at SL80) was identified per sample using strategy D compared to only 255 proteins (n = 78 at SL80) per sample in strategy C. Moreover, the number of quantified proteins was slightly higher in strategy D than in strategy C (479 (n = 310 on average per sample) and 439 (n = 226 on average per sample), respectively) (Table 1). Analysis of the subcellular distribution of the proteins in each of the fractions identified by the enzymatic assays as having a high content of plasma membrane proteins demonstrated that while the cell surface-associated proteins are present throughout the centrifugation gradient, the proteins associated with mitochondria, ER, and Golgi exhibit individual patterns of distribution (Figure 4). Proteins associated with mitochondria, which are the greatest source of “contaminating” proteins in the membrane preparations, were identified in the last part of the gradient, especially in fractions 7 and 8. Proteins associated with ER and Golgi, which generally represent minor “contaminants”, peaked in fractions 4-5 and 6, respectively.
Detailed analysis of the proteins identified in each fraction revealed that different proteins clustered in distinct fractions (Table 2), likely due to common biochemical characteristics; for example, the cell surface glycoproteins ICAM and MCAM were identified only in fraction 7, and the ribosomal/ER proteins Ribosomal protein L12 variant and 1-acyl-sn-glycerol-3-phosphate acyltransferase alpha were identified only in fraction 5.


Figure 4. Distribution of proteins from different cellular compartments in the density gradient (sample D1). The number of cell surface proteins (() peaked in fractions 3 and 4, but cell surface proteins were distributed throughout the gradient. The number of proteins from Golgi (∆) peaked in fraction 6. The number of mitochondrial proteins (×) was highest in fraction 8. Proteins from endoplasmatic reticulum (9) were low in number and peaked in fractions 3-4.

Membrane Purification Procedure 2. Membrane purification procedure 2 was tested twice: once with a single isotopic label and once with dual isotopic labels. Membrane purification procedure 2 included sedimentation of the membranes at a higher centrifugation speed and omission of Percoll/sucrose density gradient separation. A high number of proteins were identified by this strategy, ranging from 495 to 975 (n = 186 to n = 449 at SL80) (Table 3). The percentage of identified membrane proteins varied from 47 to 70%, but a high percentage of proteins from EGM was also identified (25%).

Evaluation of the Protein Profiles Identified by the Different Strategies. To estimate the cell surface protein purity, we compared the membrane proteins/EGM protein ratios obtained from the different protein fractionation strategies. This analysis revealed that the highest ratio was observed using strategy C (ratio = 4), which included dual isotopic labels and the membrane purification procedure 1, Percoll/sucrose density gradient, and sequential LC-MS/MS-analysis (Figure 5).


Table 2. Membrane Proteins Identified in the Individual Percoll/Sucrose Gradient Fractions (Sample D1)

The membrane protein/EGM protein ratios for strategies A, B and D were 2.6-3.1, whereas the ratio of the membrane purification procedure 2-based strategy was only 2.1. Comparison of the specific proteins identified by the different strategies revealed that although the overall number of membrane proteins identified using the strategy without the density gradient separation was higher, some cell surface proteins were identified only when the density gradient was included, for example, CD47, CD81, Integrin β3, CD109 and G-protein alpha subunits.

Combined Data Analysis of All the Proteomic Analyses. A total of 1919 different proteins (by IPI-entries, n = 826 at SL80) were identified by combining the data sets obtained from the separate proteomic analyses. Of these, 1145 (n = 509 at SL80) were identified by two or more peptides, 1425 (n = 623 at SL80) could be quantified, 1063 (n = 481 at SL80) were membrane proteins and 932 (n = 369 at SL80) were identified in two or more membrane purifications. A total of 528 (n = 224 at SL80) membrane proteins were identified in two or more samples (Table 4) and 622 (n = 300 at SL80) were membrane proteins identified by two or more peptides.

Similarity Levels. One of the difficulties in determining the number of proteins identified using different proteomic approaches is that one or more identified peptides may match the amino acid sequence of a conserved region present in multiple proteins that may be more or less similar. We included a filter to eliminate proteins with high sequence similarity, thereby clustering proteins (and counting these as one protein) according to similarity levels (SL) with 95% homology (SL95), 80% homology (SL80), and 65% homology

(SL65). Clustering at SL95 reduced the number of proteins to 43-58% of the total number of identified proteins, while clustering at SL80 reduced the number to 26-48% and clustering at SL65 reduced the number to 20-44% of the total. To evaluate which clustering level provided the best estimate of the number of proteins actually identified by mass spectrometry analysis, the “over” and “under” clustering at SL95, SL80 and SL65 was estimated, as depicted in Figure 6. The number of clusters containing more than one gene, representing “over” clustering, and the number of genes in more than one cluster, representing “under” clustering, were counted. The numbers were normalized to the number of identified proteins at the different similarity levels. The lowest rate of “over” clustering, that is, only a few clusters containing proteins from more than one gene, was achieved at SL95. The lowest rate of “under” clustering, that is, only a few examples of proteins from the same gene in more than one cluster, was achieved at SL65. At SL80, the levels of “under” and “over” clustering were comparable; thus, the number of proteins identified at this level approximately corresponds to the “true” number of different proteins identified, given that only one protein is identified per gene.

Evaluation of Isotope Incorporation and Normalization. The isotopic incorporation of the M-4A4 proteome and the quantification profile were examined to ensure correct quantification. The quantification values of proteins from isotope-labeled M-4A4 cells were obtained from passage three through eight of isotopic incorporation. As single peptide identifications and quantifications are more error prone, these were removed


Table 3. Descriptive Statistics of the LC-MS/MS Data Obtained from the Membrane Purification Procedure 2 Based Strategy

| | average (±dev.)^a | total^a |
|---|---|---|
| Identified proteins^b | 735.0 (±240.0) | 1165 |
| Proteins quantified^c | # | # |
| Proteins quantified (%)^c | # | # |
| Membrane proteins^d | 396.5 (±65.5) | 585 |
| Membrane proteins (%)^d | 57.1 (±9.7) | 50.2 |
| Proteins from EGM^d | 185.5 (±32.5) | 279 |
| Proteins from EGM (%)^d | 25.2 (±2.9) | 23.9 |
| Proteins, M-4A4 > NM-2C5^e | 29.5 (±23.5) | 57 |
| Proteins, M-4A4 > NM-2C5 (%)^e | 3.3 (±2.1) | 4.9 |
| Proteins, M-4A4 < NM-2C5^e | 63 (±15) | 114 |
| Proteins, M-4A4 < NM-2C5 (%)^e | 8.8 (±0.8) | 9.8 |
| SL95^f | 411 (±155) | 676 |
| SL95, % of identified proteins | 54.9 (±3.2) | 58.0 |
| SL80^f | 317.5 (±131.5) | 521 |
| SL80, % of identified proteins | 41.8 (±4.2) | 44.7 |
| SL65^f | 279.5 (±121.50) | 460 |
| SL65, % of identified proteins | 36.5 (±4.6) | 39.5 |

a For each strategy tested by multiple independent samples the total values (per strategy), the average values (per sample) and the standard deviation (strategies with three independent samples) or the deviation (strategies with two independent samples) are included. b Number of proteins identified by VEMS, including all the different entries identified from the IPI-database, for example, two different isoforms of one protein were included as two different IPI entries even though these might be identified by the same peptides. c Percentage of quantified proteins is not included because different labeling strategies were used for the two samples prepared by purification procedure 2. d Number of membrane proteins and proteins from membranous organelles: endoplasmatic reticulum, Golgi and mitochondria (EGM). e Number of proteins differentially expressed, higher or lower, respectively, in M-4A4 compared to NM-2C5. f Redundancy of the identified proteins is reduced by clustering at similarity level (SL) SL95, SL80, and SL65. These levels give an indication of how many different proteins (not IPI entries) were identified.

Figure 5. Ratio of membrane proteins to those from ER, Golgi, mitochondria (EGM) in strategies A-D and the procedure 2 based strategy. Strategy C has the highest value.
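The ratios summarized in Figure 5 can be approximated from the tabulated data. The sketch below assumes that the ratios were formed from the per-sample averages of membrane proteins and EGM proteins in Tables 1 and 3; the text does not state this, but these inputs reproduce the reported values. The variable names are illustrative.

```python
# Per-sample averages (membrane proteins, EGM proteins) copied from Tables 1 and 3.
# Assumption: the published ratios were computed from these averages.
averages = {
    "A": (102.7, 40.0),
    "B": (83.0, 27.0),
    "C": (166.0, 41.0),
    "D": (189.0, 61.5),
    "Procedure 2": (396.5, 185.5),
}

# Membrane/EGM ratio as a simple purity indicator for each strategy.
ratios = {name: membrane / egm for name, (membrane, egm) in averages.items()}

# Strategy with the highest membrane/EGM ratio.
best = max(ratios, key=ratios.get)

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}")
print(f"highest ratio: strategy {best}")
```

Under this assumption, strategy C has the highest ratio (about 4), strategies A, B and D fall between 2.6 and 3.1, and the procedure 2 based strategy gives about 2.1.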

and the average quantification values of the identified proteins with two or more peptides were calculated. A quantification value of 100% corresponded to complete incorporation of the stable isotopic labels, but values below 100% should be expected due to the interference of background “noise”. The corrected values varied only from 94.1% to 95.6% and had low standard deviations (data not shown). Likewise, the average quantification value of the proteins identified by two or more peptides from pure unlabeled cells was 6.2% with a standard deviation of 2.8 (data not shown). To evaluate the normalization function, the quantification values of each protein obtained by combining all strategies were plotted in a histogram (Figure 7). If the normalization exhibited a general error, for example, was based on reference


proteins not equally expressed in both cell lines, this would be identified as an overall shift of the center of the values in the histogram. The histogram was, as expected, Gaussian-shaped, with a center of 48% (equal protein amount in M-4A4 and NM-2C5 = 50%), indicating that this normalization procedure did not favor either high or low quantification values. By combining data from all sample analyses (disregarding potential IPI number replicates), 6135 proteins were identified, of which 4466 were quantified. The majority of the quantified proteins (74%) exhibited expression level differences only within the 40-60% range and thus did not reach the cutoff levels. Proteins with 2-fold or higher expression and 2-fold or lower expression in M-4A4 compared to NM-2C5 numbered 421 and 724, respectively.

Marker Identification. An average of 6% (4-8%) of the identified proteins had higher expression levels while 8% (5-19%) had lower expression levels in M-4A4 compared to NM-2C5. In-depth analysis of the combined data set revealed that 43 proteins had higher and 7 had lower expression in M-4A4 than in NM-2C5. An example of the SILAC peaks of a protein (integrin β1) from two separate experiments exhibiting consistently higher expression in M-4A4 vs NM-2C5 is shown in Figure 8. The cellular localization of the 43 proteins with higher and 7 with lower expression in M-4A4 compared to NM-2C5 was examined. Thirteen of the proteins with higher expression and three proteins with lower expression were known or highly likely cell surface membrane proteins (Tables 5 and 6). Validation of the alteration in expression of these proteins and study of their clinical utility as metastasis-associated markers are described elsewhere (Leth-Larsen et al. 2009, ref 31).
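The relation between a quantification value and a fold difference can be illustrated as follows. This is a minimal sketch that assumes the quantification value is the heavy (M-4A4) share of the summed heavy and light signal, which is consistent with 100% meaning complete incorporation and 50% meaning equal amounts in both cell lines. The function names and the 2-fold default are illustrative; the cutoff actually applied in VEMS is not specified here.

```python
def fold_change(value_percent):
    """Convert a quantification value (heavy share, in percent) to a heavy/light ratio.

    Assumption: value = 100 * heavy / (heavy + light).
    """
    if not 0.0 <= value_percent < 100.0:
        raise ValueError("value must be in the range [0, 100)")
    return value_percent / (100.0 - value_percent)


def classify(value_percent, fold_cutoff=2.0):
    """Label a protein by comparing its heavy/light ratio with a symmetric fold cutoff."""
    ratio = fold_change(value_percent)
    if ratio >= fold_cutoff:
        return "higher in M-4A4"
    if ratio <= 1.0 / fold_cutoff:
        return "lower in M-4A4"
    return "unchanged"


for value in (33.3, 48.0, 50.0, 66.7):
    print(value, round(fold_change(value), 2), classify(value))
```

Under this assumption, a value of 50% corresponds to a ratio of 1, and values near 66.7% and 33.3% correspond to a 2-fold higher and a 2-fold lower level in M-4A4, respectively.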

Discussion

Target identification is a major effort in biotech drug discovery and development, and detailed study of these candidate targets may yield insight into the biology of diseases. Since most of these drug targets are proteins, particularly cell surface membrane proteins, proteomic analysis of the cellular plasma membrane proteins promises to be an efficient discovery tool. We aimed at combining and optimizing several technologies for efficient identification of cell surface proteins exhibiting altered expression between disease stages. As a model system, a unique isogenic cancer cell line pair for analysis of proteins involved in metastasis development was investigated.23 Stable isotope labeling by amino acids in cell culture (SILAC), membrane separation by a Percoll/sucrose density gradient monitored by marker enzyme assays and mass spectrometry were optimized and used for protein identification and quantification. Since SILAC can be performed with a number of stable isotope-labeled amino acids and digestive enzymes,20,32 we initially compared the use of a single stable isotope-labeled amino acid with that of two different labeled amino acids. Combination of trypsin digestion with one isotope label (either arginine or lysine) allows labeling of approximately half of the produced tryptic peptides, that is, those that contain the isotope-labeled amino acid of choice. As most proteins are identified by more than one peptide, a majority of proteins would, in theory, be quantifiable. However, only the most intense peptide peaks in the mass spectra are selected for MS/MS sequencing, and as the intensity of peaks from the peptides containing the potentially labeled amino acid, for example, arginine, is divided into two peaks, for example, one 12C-arginine (light) and one


Table 4. Total Number of Identified Proteins Obtained by Combining the Data Sets from All the Various Strategies and Analyzing Them at Similarity Levels 95, 80, and 65%

| number of proteins | total^a | peptides ≥ 2^b | quantified | samples ≥ 2^c | membrane proteins | membrane proteins and peptides ≥ 2^d | membrane proteins and samples ≥ 2^e |
|---|---|---|---|---|---|---|---|
| Identified | 1919 | 1145 | 1425 | 932 | 1063 | 622 | 528 |
| SL95^f | 1122 | 682 | 844 | 506 | 623 | 379 | 295 |
| SL80^g | 826 | 509 | 623 | 369 | 481 | 300 | 224 |
| SL65^h | 724 | 450 | 556 | 316 | 426 | 266 | 195 |

a All identified proteins. b Number of identified proteins identified by two or more peptides. c Number of identified proteins in two or more samples. d Number of membrane proteins identified by two or more peptides. e Number of identified membrane proteins in two or more samples. f Similarity level 95%. g Similarity level 80%. h Similarity level 65%.

Figure 6. Clustering at different similarity levels. All identified proteins were clustered at three different similarity levels: 65, 80, and 95%. For all levels, the number of clusters containing proteins originating from more than one gene was identified and designated “over clustering” ((). Proteins originating from the same gene and present in more than one cluster were counted and designated “under clustering” (∆) to reduce the redundancy of proteins identified by the same peptides without clustering different protein identifications.
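The counting behind Figure 6 can be sketched as follows. This is a hypothetical illustration rather than the ProteinCenter implementation: the input format (protein, gene, cluster) and the function name `clustering_rates` are assumptions, and both counts are normalized to the number of clusters at the similarity level in question.

```python
from collections import defaultdict


def clustering_rates(assignments):
    """Compute ("over", "under") clustering rates for one similarity level.

    assignments: iterable of (protein_id, gene, cluster_id) tuples.
    "Over" clustering counts clusters containing proteins from more than one gene;
    "under" clustering counts genes whose proteins fall into more than one cluster.
    Both counts are divided by the number of clusters.
    """
    genes_per_cluster = defaultdict(set)
    clusters_per_gene = defaultdict(set)
    for _protein, gene, cluster in assignments:
        genes_per_cluster[cluster].add(gene)
        clusters_per_gene[gene].add(cluster)

    n_clusters = len(genes_per_cluster)
    if n_clusters == 0:
        return 0.0, 0.0

    over = sum(1 for genes in genes_per_cluster.values() if len(genes) > 1)
    under = sum(1 for clusters in clusters_per_gene.values() if len(clusters) > 1)
    return over / n_clusters, under / n_clusters


# Toy example with made-up identifiers: cluster 1 mixes two genes ("over"),
# and GENE_C is split across clusters 2 and 3 ("under").
toy = [
    ("P1", "GENE_A", 1),
    ("P2", "GENE_A", 1),
    ("P3", "GENE_B", 1),
    ("P4", "GENE_C", 2),
    ("P5", "GENE_C", 3),
]
over_rate, under_rate = clustering_rates(toy)
print(over_rate, under_rate)
```

Running such a function on the cluster assignments at SL95, SL80 and SL65 would give the two curves compared in the text, with the level at which the two rates are comparable taken as the best estimate of the number of distinct proteins.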

Figure 7. Quantification value profile by number of quantifications in each interval.

13C-arginine (heavy) signal, fewer peptides containing the potentially labeled amino acid, for example, arginine, will be identified as compared to peptides containing the unlabeled amino acid, for example, lysine. Conversely, the peptides containing the unlabeled amino acid, for example, lysine, which are therefore not quantifiable, will more often be identified, and thus a lower percentage of proteins will be quantified. By combining trypsin cleavage with incorporation of both 13C-arginine and 13C-lysine, all peptides except those derived from the protein C-terminus would be quantifiable and equally identifiable. In fact, dual labels, compared to a single label, increased the percentage of quantified proteins from 40 to 93% but lowered the total number of identified proteins, probably as a consequence of the increased number of peptide peaks caused by the use of dual isotope-labeled amino acids, which in turn require more

MS/MS cycles for sequencing. Due to sample heterogeneity, repeating the analysis of each sample 3-4 times by LC-MS/MS made it possible to sequence different, primarily low-abundance peptides, thereby increasing the total number of identified proteins from 329 to 525 (n = 106 to n = 181 at SL80).

Cell surface membrane proteins are generally of low abundance, and additional separation of the complex mixture of membrane proteins may increase the number of identified proteins. Percoll/sucrose density gradients separate cellular membranes (and organelles) according to their density. In most strategies examined in this study, the fractions with the highest activity of a particular cell surface enzyme marker and the lowest activity of a particular mitochondrial enzyme marker were pooled prior to LC-MS/MS analysis. Alternatively, six to eight fractions were analyzed separately, resulting in identification of an increased number of proteins, although the total percentage of membrane proteins was lower. The gradient used in our study enabled separation of cell surface proteins into the fractions of the gradient selected for analysis, with an increased percentage in the early fractions (fractions 3 and 4), while mitochondrial proteins were present in the late fractions. Proteins from other subcellular compartments also showed a distinct distribution pattern, with proteins from ER being present in the middle fractions (fractions 4 and 5) and proteins from Golgi in fraction 6. Some fractions contained proteins from more than one compartment, likely because these proteins are present in several cellular compartments as a result of their route of processing. Many of these proteins are synthesized in the ER, processed in the Golgi and may further be cycled between these organelles and the cell surface.33 An ideal purification of cell surface proteins contains the entire plasma membrane proteome and no contaminating proteins from other organelles.
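The single- versus dual-label argument made earlier in the Discussion can be illustrated with an in-silico tryptic digest. The sketch below is a simplified model, not the authors' workflow: it assumes cleavage after K or R except before P, no missed cleavages, and a made-up 20-residue sequence. The function names are hypothetical, and splitting on an empty regular-expression match requires Python 3.7 or later.

```python
import re


def tryptic_peptides(sequence):
    """Cleave after K or R, except when the next residue is P (no missed cleavages)."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]


def quantifiable_fraction(sequence, labeled_residues):
    """Fraction of tryptic peptides containing at least one isotope-labeled residue."""
    peptides = tryptic_peptides(sequence)
    labeled = [p for p in peptides if any(res in p for res in labeled_residues)]
    return len(labeled) / len(peptides)


# Made-up sequence for illustration only; it is not taken from the data set.
sequence = "ACDEKGHIRLMNKPQRSTVW"

print(tryptic_peptides(sequence))
print("Arg label only:", quantifiable_fraction(sequence, "R"))
print("Arg + Lys labels:", quantifiable_fraction(sequence, "KR"))
```

For this toy sequence, labeling arginine alone leaves the lysine-terminated peptide and the C-terminal peptide without a label, whereas labeling both arginine and lysine leaves only the C-terminal peptide unlabeled, in line with the reasoning in the text.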
The percentage of membrane proteins varied from 42 to 67% of the number of identified proteins, with no clear correlation to membrane purification procedure 1 or 2 or any of the other previously mentioned parameters. However, the percentage of proteins from EGM shows a clear tendency: in the strategies based on membrane purification procedure 1, which includes a Percoll/sucrose density gradient, only 16-20% of the total number of identified proteins were from EGM, while in the strategies based on membrane purification procedure 2 (without gradient centrifugation), between 22 and 34% of the identified proteins were from EGM. On the other hand, the total number of identified proteins was higher in the strategies without the centrifugation gradient than in the strategies including the gradient, likely due to loss of proteins during the extra purification step. Recently, a number of proteomic studies examining cell surface proteins have been reported.3,5-13 The number of identified proteins in these studies ranges from less than 400 to almost 1000, and the percentage of membrane and cell


Figure 8. SILAC peaks of a tryptic peptide from integrin β1 in two different LC-MS/MS analyses (a and b) demonstrating the reproducibility of the method. The three peaks to the left originate from the 12C-arginine peptide (NM-2C5), while the three peaks on the right originate from the 13C-arginine peptide (M-4A4).

Table 5. Membrane Proteins Exhibiting Higher Expression in Metastatic M-4A4-Cells as Compared to Non-Metastatic NM-2C5-Cells

protein name

UniProt accession numbers

identified/quantified in number of samples; number of peptides; sequence coverage; VEMS quantification range

gene

HLA-DRα

P01903 Q30118

HLA-DRβ

P04229 Q30134 P13761 HLA-DRB (1,3,5) P01913 P79483

Sulfate transporter Ecto-5′ nucleotidase Integrin β1 Integrin α6 Annexin A2 ICAM-1 BSCv protein Integrin αV CD74 Protein NDRG1 ADP-ribosylation factor 4

P50443 P21589 P05556 P23229 P07355 P05362 Q9HDC9 P06756 P04233 Q8SNA0 Q92597 P18085

HLA-DRA

SLC26A2 NT5E ITGB1 ITGA6 ANXA2 ICAM1 C20orf3 ITGAV CD74 NDRG1 ARF4

fold difference

P-values

2 fold- only in M-4A4 >2 fold

76.2-81.5 58.2-69.3 59.6-78.2 65-75.3 52-65.6 50.1-62.4 57.8-74.7 55.6-59.7 87.8-94.5 74.1-74.9 66.7

>2 fold ≥2 fold >2 fold >2 fold ≥2 fold 2 fold 2 fold 2 fold only in M-4A4 >2 fold 2 fold

0.06 0.01