Metabolic Labeling and Protein Linearization Technology Allow the Study of Proteins Secreted by Cultured Cells in Serum-Containing Media

M. Colzani,*,† P. Waridel,† J. Laurent,‡ E. Faes,‡ C. Rüegg,‡,§ and M. Quadroni*,†

Protein Analysis Facility, Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland, Division of Experimental Oncology, Multidisciplinary Oncology Center (CePO), Centre Hospitalier Universitaire Vaudois (CHUV), University of Lausanne, Lausanne, Switzerland, and National Center of Competence in Research (NCCR) Molecular Oncology, ISREC, School of Life Sciences, EPFL, Lausanne, Switzerland

Received May 29, 2009

Supernatants from cell cultures (also called conditioned media, CMs) are commonly analyzed to study the pool of secreted proteins (secretome). To reduce the exogenous protein background, serum-free media are often used to obtain CMs. Serum deprivation, however, can severely affect cell viability and phenotype, including protein secretion. We present a strategy to analyze the proteins secreted by cells in fetal bovine serum-containing CMs, which combines the advantages of metabolic labeling and protein concentration linearization techniques. Incubation of CMs with a hexapeptide ligand library was used to reduce the dynamic range of the samples and led to the identification of 3 times more proteins than in untreated CM samples. Labeling with a deuterated amino acid was used to distinguish between cellular proteins and homologous bovine proteins contained in the medium. Application of the strategy to two breast cancer cell lines led to the identification of proteins that were secreted in different amounts and could correlate with the varying degree of aggressiveness of the two lines. Selected reaction monitoring (SRM)-based quantitation of three proteins of interest in the crude samples yielded data in good agreement with the results from concentration-equalized samples.

Keywords: stable isotope labeling • protein concentration linearization • secretome analysis • LC-MS/MS

* To whom correspondence should be addressed. Manfredo Quadroni, phone 41-21-692-56-76, fax 41-21-692-57-05, e-mail manfredo.quadroni@unil.ch. Mara Colzani, phone 41-21-692-56-77, fax 41-21-692-57-05, e-mail [email protected].
† Center for Integrative Genomics, University of Lausanne. ‡ Centre Hospitalier Universitaire Vaudois (CHUV), University of Lausanne. § School of Life Sciences, EPFL.
Journal of Proteome Research 2009, 8, 4779–4788. Published on Web 08/27/2009. 10.1021/pr900476b CCC: $40.75. © 2009 American Chemical Society

Introduction

Proteins actively secreted by cells fulfill a number of biological functions that are essential for interactions with the external environment. Secreted proteins can affect the development (e.g., growth factors), the adhesion, and the migration (e.g., matrix proteins, chemokines) of neighboring cells. They can also influence the body’s metabolism (hormones) and mediate immune functions (e.g., antibodies, cytokines), through autocrine, paracrine and endocrine signals. The study of the pool of secreted proteins (the secretome) can provide a better understanding of the molecular mechanisms underlying these processes in health and in disease conditions.1 Besides actively secreted proteins, other cellular polypeptides can enter the bloodstream, either through passive release by leakage from damaged tissues or through shedding from the cell surface. For all these reasons, numerous studies have focused on the serum and plasma proteomes to discover disease biomarkers and targets for therapeutics.2-4 But the extreme complexity of the blood proteome and the very large dynamic range of protein concentrations make the analysis of serum/plasma samples extremely challenging.5,6 Proteomic workflows based on sample fractionation and enrichment of low-abundance proteins have been applied to achieve a more comprehensive characterization of serum/plasma.7-11 In spite of this, it is the common opinion that the analytical depth available for untargeted proteomics remains insufficient for comprehensively mapping serum/plasma samples.

Only a few studies have characterized the cellular secretome in vivo,12,13 because of the inherent difficulties. As a proxy, a commonly used approach is the analysis of media conditioned by cells in culture (CM).14-16 This model assumes that cells grown in vitro present a secretion phenotype similar to the one in vivo. The obvious advantage of cells in culture is the possibility to study variations of the secretome induced by specific events (e.g., treatment with growth factors, drugs or other stimuli). Despite the complexity reduction and the experimental advantages offered by in vitro model systems, even secretome profiling based on CM analysis presents serious challenges. The first one is the low concentration (as low as ng/mL) at which the proteins are secreted in the CM.17 The second is that the actively secreted proteins are mixed with the ones passively released by cell death and lysis, which “contaminate” the sample. Proteins released by lysed cells can indeed represent an important proportion of the CM proteome and can be difficult to distinguish from the actively secreted ones. This issue has been addressed in some recent studies, which exploited different approaches to recognize the genuine secretory nature of identified proteins: analysis of the available literature,18,19 computational localization of the signal peptide within the protein sequences,20,21 use of specific databases,18,22 and comparison between the CM and the cytosolic extract.17

Another challenge for secretome analysis is represented by the presence of serum in the CM. Serum is generally essential for cell survival and growth and is therefore added to the cell culture medium, typically from an exogenous source (for example, 10% FBS). Cellular proteins secreted in serum-containing CM are more difficult to detect because they are diluted in an already extremely complex mixture of proteins. Moreover, the partial sequence identity with orthologous (i.e., bovine) proteins can generate misleading identifications and prevent the assignment to the correct species. These problems can be partially solved by the use of serum-free medium (SFM) for a defined time frame, during which the secreted proteins can accumulate without mixing with serum proteins. This strategy reduces both the proteome complexity and the risk of misleading identifications, and it has been successfully applied to secretome analysis in a number of studies.21,23,24 Serum deprivation, however, can slow down and even stop cell proliferation25,26 and at the same time increase cell death,27 thus amplifying the passive release of intracellular proteins into the CM. Moreover, the presence or absence of serum in the medium can significantly affect the pattern of secreted proteins.28 To produce informative data, the profiling of serum-containing CM would require the reduction of the dynamic range of protein concentration, which can be provided by fractionation, depletion or enrichment techniques (as for serum/plasma samples).
Among the different available approaches, a strategy based on the “equalization” of protein concentrations has been recently developed and successfully applied to serum studies.29,30 It is based on a library of hexapeptide ligands generated by combinatorial chemistry and immobilized on beads, which are then incubated with complex protein mixtures. By providing a large variety of binding sites, each in limited number, the hexapeptide resin is expected to reduce the dynamic range of protein concentrations, enriching the low-abundance proteins relative to the most abundant ones.

As a second requirement, a successful analysis of serum-containing CM would also require a way to discriminate cellular secreted proteins from serum proteins. This can be achieved by metabolic labeling with heavy amino acids added to the culture medium and thus incorporated into the newly synthesized proteins. After MS analysis, the presence of the heavy amino acids in the peptide sequences proves the cellular origin of the corresponding protein, distinguishing it from the serum ones. Metabolic labeling with isotope-labeled amino acids is now routinely used in quantitative proteomics, but is mostly applied to quantify variations in intracellular proteins.31-33 Only recently has it been employed in the analysis of proteins secreted by pancreatic cancer cells24 and by visceral adipose tissue.34 True proteome quantification has been exploited for secretome analysis in a few recent studies.24,35

Here we present an LC-MS/MS-based strategy to study serum-containing CMs which combines metabolic labeling with protein concentration “equalization”. An aggressive breast cancer cell line (MDA-MB-231) grown in a standard medium with 10% FCS was labeled with an excess of deuterated valine (D8-Val). The obtained CM was treated with the hexapeptide resin and, after protein identification, the presence of the isotope label was used to discriminate true cellular proteins from homologous bovine ones. We show that comparative analysis is possible by applying the same workflow to a second CM conditioned by a less aggressive breast cancer cell line (MCF-7). Relative quantification of validated cellular proteins was done using spectral counting36 and confirmed for three proteins using selected reaction monitoring (SRM) on proteotypic peptides. Among the proteins found to be differentially observed, there were growth factors and components of the extracellular matrix reported to be involved in cell migration or cancer progression.

Journal of Proteome Research • Vol. 8, No. 10, 2009

Materials and Methods

Cell Culture Labeling and Media Collection. Metabolic labeling was performed on two human breast cancer cell lines: MDA-MB-231 and MCF-7 (both obtained from ATCC/LGC Promochem). MDA-MB-231 and MCF-7 cells were grown in standard RPMI-1640 medium supplemented with L-glutamine (Biochrom), 10% FBS (Invitrogen) and penicillin-streptomycin (Invitrogen). Insulin (Actrapid) was added to the medium of MCF-7 cells at a concentration of 0.01 mg/mL. Cells were first grown to confluence to expand the cultures. Cultures were then split and cells were metabolically labeled during their exponential growth phase, using the complete medium described above, supplemented with deuterated L-valine (D8-Val, 98% enrichment, Cambridge Isotope Laboratories). The final concentration of D8-valine in the labeling medium was 100 mg/L, equal to a 5-fold excess over the unlabeled Val present in RPMI-1640 medium according to its standard composition. The media conditioned by MDA-MB-231 or MCF-7 cells were collected after 3 days of labeling, centrifuged at 500 g to remove cells, filtered through a 0.22 µm cartridge, frozen in liquid nitrogen and stored at -80 °C. The protein concentration of MDA-MB-231 and MCF-7 conditioned media (CMs) was determined using the Bradford protein assay (Bio-Rad); the relative protein concentration of samples to be compared was verified by whole-lane densitometry on SDS-PAGE gels after Coomassie staining.

Protein Equalization with Proteominer. MDA-MB-231 and MCF-7 CMs were loaded on Proteominer beads (Bio-Rad) to reduce the protein dynamic range of the samples. A 150 µL volume of bead slurry was used to process 15 mg of protein, according to the protein:resin proportion indicated in the manufacturer’s instructions for human serum samples (500 µL of resin for 50 mg of plasma/serum proteins). The equalization steps were performed as described in the manufacturer’s protocol.
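The two proportions stated above (a 5-fold excess of D8-valine at 100 mg/L, and 150 µL of bead slurry for 15 mg of protein) are simple ratios. The short Python sketch below only makes that arithmetic explicit; the function names are illustrative and are not part of the published protocol. The heavy fraction it returns is a mass-based upper bound that assumes labeled and unlabeled valine are used equally by the cells and ignores any other source of valine (for example, serum or protein turnover).

```python
# Illustrative arithmetic only; function names are hypothetical helpers.

def unlabeled_valine_mg_per_l(heavy_mg_per_l: float, fold_excess: float) -> float:
    """Unlabeled valine concentration implied by a stated fold excess."""
    return heavy_mg_per_l / fold_excess


def max_heavy_fraction(heavy_mg_per_l: float, light_mg_per_l: float) -> float:
    """Upper bound on the labeled fraction of free valine (mass-based)."""
    return heavy_mg_per_l / (heavy_mg_per_l + light_mg_per_l)


def bead_slurry_ul(protein_mg: float,
                   ref_resin_ul: float = 500.0,
                   ref_protein_mg: float = 50.0) -> float:
    """Scale the resin volume linearly from the reference proportion
    quoted in the text (500 uL of resin for 50 mg of protein)."""
    return protein_mg * ref_resin_ul / ref_protein_mg


if __name__ == "__main__":
    light = unlabeled_valine_mg_per_l(100.0, 5.0)
    print(f"implied unlabeled Val: {light:.0f} mg/L")
    print(f"maximum heavy fraction: {max_heavy_fraction(100.0, light):.1%}")
    print(f"bead slurry for 15 mg protein: {bead_slurry_ul(15.0):.0f} uL")
```

Under these assumptions the labeled fraction cannot exceed 5/6 of the free valine pool, which is a useful ceiling to keep in mind when judging measured incorporation levels.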
Briefly, the beads were mixed with the appropriate volume of cell culture media and incubated overnight at 4 °C. After a wash step, the proteins retained by the beads were eluted using the elution buffer provided in the kit, which contained 4 M urea and 1% CHAPS. The eluates were frozen and kept at -20 °C.

SDS-PAGE and Protein Digestion. Proteins in the raw and equalized CMs (both MDA-MB-231 and MCF-7) were separated on a 10% SDS-PAGE minigel. For the raw CMs, we overloaded the gel with 60 µg of proteins, to detect a maximum of proteins other than albumin (see Figure 1 for the MDA-MB-231 CM). For the equalized CM, the Proteominer eluate was mixed 1:1 with 5× concentrated Laemmli sample buffer before SDS-PAGE. The amount of equalized CMs loaded was equal to 20 µg of proteins and corresponded to the maximum volume that could be loaded without band distortion due to the chemical composition of the Proteominer elution buffer. After staining with Coomassie Blue, each lane was cut into ten fractions corresponding to regions of different molecular weights. The excised slices were in-gel digested with sequencing grade trypsin (Promega, Madison, WI). The alkylation of cysteine residues and the proteolysis were automatically performed in a ProGest robotic workstation (Genomic Solutions, Ann Arbor, MI) according to a described protocol.37,38

Figure 1. SDS-PAGE of raw and Proteominer-treated conditioned media. Representative gels for the MDA-MB-231 CM before (left) and after (right) protein concentration equalization. The gel was stained with Coomassie Brilliant Blue.

LC-MS/MS. Data-dependent LC-MS/MS analysis of extracted peptide mixtures after digestion was performed on a hybrid linear trap LTQ-Orbitrap mass spectrometer (Thermo Fisher Scientific) interfaced via a TriVersa Nanomate (Advion Biosciences) to an Agilent 1100 nano HPLC system (Agilent Technologies). Peptides were separated on a ZORBAX 300SB C18 (75 µm ID × 150 mm, 3.5 µm) capillary column (Agilent Technologies) along a 90 min gradient from 5 to 85% acetonitrile in 0.1% formic acid. The mass spectrometer operated in information-dependent acquisition mode to automatically switch between MS and collision-induced dissociation MS/MS for the four most intense ions; singly charged peptides were not selected for MS/MS fragmentation. Previously targeted peptides were dynamically excluded from the analysis for 120 s after one occurrence.

Protein Identification. For each sample, the MS/MS spectra obtained from different fractions of the same gel lane were automatically merged and deisotoped using Mascot Distiller (Matrix Science, London, UK; version 2.1.1), using a minimum signal/noise ratio equal to 5 and no limitations for the peptide charge. The spectra were searched in the mammalian and human subsets of the UniProt database (www.expasy.org) using Mascot (Matrix Science, London, UK; version 2.2.0) with trypsin specificity, one missed cleavage allowed, parent ion tolerance of 10 ppm and fragment ion mass tolerance of 0.5 Da.
The UniProt database release used was 11.0 of January 29, 2008 (5 473 336 total sequences, of which 74 609 were human and 245 323 mammalian). The iodoacetamide derivative of cysteine was specified as a fixed modification and oxidation of methionine as a variable one. D8-Val (+8 Da) was set as a variable modification for searches in the mammalian database or as a fixed modification for searches in the human database. Mascot output was imported in Scaffold for further processing (described in the following section). Mascot was used to compute the false discovery rate (FDR) of protein identification in the mammalian database, by searching against an automatically generated decoy database containing random sequences having the same length and average amino acid composition as the forward “normal” database. The multidimensional protein identification technology (MudPIT) scoring system was applied, with a significance threshold of p = 0.05 and a minimum ion score of 20. The FDRs obtained were 2.74 and 2.84% for the “equalized” MDA-MB-231 and MCF-7 CM, respectively. The FDRs for the raw MDA-MB-231 and MCF-7 CM were 3.67 and 1.80%, respectively. The FDRs of protein identification in the human database are scarcely significant, because the presence of a vast majority of bovine peptides in the samples results in a relatively low number of matched spectra against the human database, which in turn gave abnormally high FDR values. Additional statistical criteria to measure identification certainty in both mammalian and human databases were applied in the subsequent steps of data analysis and are described in the next section.

Data Analysis. To increase the confidence in the identified proteins, X! Tandem (www.thegpm.org; version 2007.01.01.1) was set up to search the subset of the UniProt database composed of proteins previously identified by Mascot. The same settings previously used for the Mascot search were applied, except that the fragment ion mass tolerance was 0.1 Da and two missed cleavages were allowed. The peptide identifications obtained from Mascot or X! Tandem were automatically merged by Scaffold (version Scaffold-01_06_03, Proteome Software Inc.). Scaffold was used for probability-based validation of the identified peptides and proteins (minimum 95% peptide probability, 99% protein probability and 2 identified peptides/protein),39,40 for data set alignment as well as for parsimony analysis to discriminate homologous hits. The complete list of proteins and peptides identified in the raw and equalized samples, together with their accession number, scores, sequence coverage and number of assigned peptides, is reported in Supplementary Table 1 (Supporting Information).
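The search settings above treat D8-Val as a nominal +8 Da modification and accept precursor masses within 10 ppm. As a rough illustration, the Python sketch below computes the expected m/z shift of a fully labeled peptide and the corresponding ppm window. It assumes the standard monoisotopic mass difference between deuterium and protium; the helper names are hypothetical, and the two example sequences are synthetic peptides listed elsewhere in this paper, used here only as convenient inputs.

```python
# Hypothetical helpers; not part of Mascot, X! Tandem, or Scaffold.

# Monoisotopic mass difference between 2H and 1H, in Da.
DEUTERIUM_MINUS_PROTIUM = 2.014102 - 1.007825
# Eight deuterium atoms replace eight hydrogens in D8-valine.
D8_VAL_SHIFT = 8 * DEUTERIUM_MINUS_PROTIUM


def heavy_mz_shift(peptide: str, charge: int) -> float:
    """m/z shift of a peptide in which every valine carries the D8 label."""
    return peptide.count("V") * D8_VAL_SHIFT / charge


def ppm_window(mz: float, ppm: float = 10.0) -> tuple[float, float]:
    """Lower and upper m/z bounds for a given ppm tolerance."""
    delta = mz * ppm * 1e-6
    return mz - delta, mz + delta


if __name__ == "__main__":
    for seq in ("LEGEACGVYTPR", "ELSEALGQIFDSQR"):
        print(seq, f"shift at 2+: {heavy_mz_shift(seq, 2):.4f}")
    print("10 ppm window at m/z 700:", ppm_window(700.0))
```

A peptide without valine shows no shift at all, so it cannot prove or disprove a cellular origin; only valine-containing peptides are informative for this labeling scheme.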
For proteins sharing the same set of peptides, only one representative protein is listed. Proteins with matched peptides in common with other sequences were validated only if matched by at least one unique discriminating peptide. No further attempt was made to discriminate protein isoforms that were indistinguishable with the available mass spectrometry data. Only the proteins showing at least two peptides containing deuterated valine (D8-Val) were validated as cellular proteins. Relative quantification in the two CMs was based on the number of spectra assigned to each validated protein;36 the number of assigned spectra was not normalized between the two CMs, for reasons explained in the Results section. Only proteins identified in one CM with at least 10 matched spectra and showing a difference in the number of assigned spectra higher than 3-fold (decrease or increase) were considered significantly differentially expressed between the MDA-MB-231 and MCF-7 samples.

Protein Precipitation and in Solution Digestion of Raw CMs for SRM Analysis. For each MDA-MB-231 and MCF-7 CM, the volume of sample corresponding to 0.6 mg of proteins was incubated with 10% TCA in 80% acetone (overnight at -20 °C). The samples were centrifuged (18 000 g for 15 min at 4 °C), and the pellet was washed with cold acetone and centrifuged again. The resulting pellets were solubilized in 8 M urea, 100 mM Tris, pH 7.4, reduced with DTT (30 min at 37 °C), alkylated with iodoacetamide (20 min at RT), diluted with 50 mM ammonium bicarbonate buffer and digested with trypsin (overnight at 37 °C). The peptide mixtures were desalted on C18 cartridges (Sep-Pak, Waters), evaporated on a SpeedVac and resuspended in 3% acetonitrile, 0.1% formic acid for LC-MS analysis.

Peptide Synthesis. The six peptides ELSEALGQIFDSQR, YSSDYFQAPSDYR, LEGEACGVYTPR, GDPECHLFYNEQQEAR, YGGDEIPFSPYR and GAGTGGLGLAVEGPSEAK from the human proteins LG3BP, IBP2 and FLNA (see Supplementary Table 7, Supporting Information) were synthesized by Fmoc chemistry. The two peptides containing cysteine residues were incubated with DTT (5-fold molar excess, 45 min at 56 °C) and alkylated with iodoacetamide (5-fold molar excess, 1 h incubation at RT) to obtain the same molecular species detected in the CMs.

LC-SRM. The peptides obtained from the in-solution digestion of the raw CMs were analyzed by selected reaction monitoring (SRM).41 The samples were injected on a reversed-phase C18 column (PepMap100, 3 µm, 100 Å, LC Packings) and separated by nanoflow liquid chromatography (250 nL/min) on an Eksigent Tempo nano LC (Applied Biosystems) system, on line with a Q-TRAP 4000 instrument (Applied Biosystems), using a 56 min gradient from 5% to 85% acetonitrile in 0.1% formic acid. Nine peptides from the human proteins LG3BP, IBP2 and FLNA, 3 peptides from the bovine proteins ALBU, ITIH2 and A2M, 1 peptide from porcine trypsin and the Glu-fibrinopeptide (spiked in the sample before the injection) were monitored simultaneously, for a total of 34 transitions, in alternating scans occurring every 1.8 s. The transitions being monitored, together with the collision energy used and the dwell time dedicated to each transition, are listed in Supplementary Table 7 (Supporting Information). Each peptide was specific for a unique protein in the human or bovine databases (checked using BLAST, http://www.expasy.ch/tools/blast). Typical chromatographic peak widths exceeded 20 s at the base, which ensured the collection of >10 points per chromatographic peak.
The peak areas of the extracted ion chromatograms (XICs) of each transition were automatically computed by Analyst 1.4 software (Applied Biosystems), using a dedicated quantitation method. For the LG3BP, IBP2 and FLNA peptides, we quantified the peak presenting three transitions coeluting at the retention time observed for the corresponding synthetic peptides. For the ALBU, ITIH2 and A2M bovine peptides, we quantified the most intense peak presenting coeluting transitions. The peak areas of the different transitions (2 or 3) deriving from the same peptide mass were automatically summed, to improve peak detection and obtain better quantification. Peak integration was manually checked and corrected if necessary. The values were normalized to the sum of the values obtained for the group of three bovine proteins (ITIH2, ALBU and A2M), expected to be present in the two CMs in the same amount. For each sample, the data were acquired in three independent runs.
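The SRM quantification described above comes down to three steps: summing the transition areas of a peptide, dividing by the summed signal of the three bovine reference proteins, and combining the three independent runs. The Python sketch below illustrates this under simplifying assumptions: one peptide per protein, hypothetical peak areas, and a plain mean across runs (the text does not state how runs were combined). It also checks the sampling rate implied by a 1.8 s cycle and a 20 s peak width. None of the names correspond to Analyst software functions.

```python
# Illustrative sketch with hypothetical peak areas; not Analyst output.

BOVINE_REFERENCE = ("ALBU", "ITIH2", "A2M")


def points_per_peak(peak_width_s: float, cycle_time_s: float) -> float:
    """Number of data points acquired across one chromatographic peak."""
    return peak_width_s / cycle_time_s


def normalized_areas(run: dict[str, list[float]]) -> dict[str, float]:
    """Sum transition areas per protein, then divide each non-reference
    protein by the total area of the bovine reference proteins."""
    summed = {protein: sum(areas) for protein, areas in run.items()}
    reference = sum(summed[protein] for protein in BOVINE_REFERENCE)
    return {protein: area / reference
            for protein, area in summed.items()
            if protein not in BOVINE_REFERENCE}


def mean_over_runs(runs: list[dict[str, list[float]]]) -> dict[str, float]:
    """Average the normalized values over replicate runs (assumed mean)."""
    normalized = [normalized_areas(run) for run in runs]
    return {protein: sum(n[protein] for n in normalized) / len(normalized)
            for protein in normalized[0]}


# Hypothetical transition areas for a single run.
example_run = {
    "LG3BP": [1200.0, 800.0, 500.0],
    "ALBU": [40000.0, 30000.0],
    "ITIH2": [6000.0, 4000.0],
    "A2M": [12000.0, 8000.0],
}

if __name__ == "__main__":
    print(f"points per peak: {points_per_peak(20.0, 1.8):.1f}")
    print(mean_over_runs([example_run, example_run, example_run]))
```

Normalizing to proteins that come from the serum rather than from the cells makes the comparison insensitive to differences in injection amount, provided the serum background really is the same in both media.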

Results

1. Preparation of Conditioned Media. The two breast cancer cell lines MCF-7 (poorly aggressive) and MDA-MB-231 (highly aggressive) were expanded in a standard RPMI-1640 medium supplemented with 10% bovine serum. For the preparation of the conditioned medium (CM), a 5-fold excess of deuterated valine (D8-Val) over the valine normally present in the medium was added. After 3 days of incubation, the CM was collected as indicated in the Materials and Methods section. The incorporation of the heavy amino acid was checked on human proteins identified by a preliminary shotgun LC-MS analysis of the CM; on average, 75% of the valine-containing peptides were found to contain D8-Val (data not shown).

2. Dynamic Range Reduction in a Serum-Containing Conditioned Medium. The efficiency of the protein concentration equalization technology was evaluated on the culture medium conditioned by MDA-MB-231 cells. The volume of medium corresponding to 15 mg of total proteins was incubated with the Proteominer resin and treated according to the manufacturer’s protocol. After incubation and washing, the eluate (the equalized CM) and the starting material (the raw CM) were loaded on SDS-PAGE and stained with Coomassie Blue, to visualize the profile of their protein contents (Figure 1). The lane corresponding to the equalized sample shows a rather uniform pattern of protein bands spanning the whole range of molecular masses. In the raw sample lane, the pattern is dominated by serum albumin (68 kDa) and only a few bands are visible at low molecular masses. This simple comparison of the distribution of protein masses on SDS-PAGE provided initial evidence that the “equalization” treatment enriches the less abundant proteins relative to the most abundant ones.

3. Shotgun Analysis of the MDA-MB-231 Conditioned Medium. To analyze more accurately the changes in the protein profile due to the equalization process, raw and equalized samples were analyzed by geLC-MS/MS. Each sample was separated on a preparative SDS-PAGE, divided into 10 fractions corresponding to different intervals of molecular masses, digested with trypsin and sequentially analyzed by LC-MS/MS on a high resolution instrument (LTQ-Orbitrap). Lists of MS/MS spectra of every lane were pooled before database search.
To reduce any bias in favor of the equalized sample, we loaded (as done in ref 42) a lane with a 3-fold higher amount of raw CM proteins, which corresponded to the maximum amount loadable in the gel without causing major band distortion during migration. For each of the two samples, the spectra collected from the 10 fractions were pooled and first searched against the mammalian or human subset of the UniProt database. The results are reported in Figure 2A. It should be noted that, although the bovine genome has been fully sequenced, its assembly is still ongoing and gene model prediction is incomplete. Therefore the available UniProtKB bovine protein database must be considered as incomplete (http://www.uniprot.org/taxonomy/9913). The number of submitted spectra for the search in the mammalian database was 20% higher for the raw sample (12 983 spectra) than for the equalized one (10 859 spectra). Despite this fact, the number of identified proteins was much higher in the Proteominer-treated sample (443 vs 164 proteins, corresponding to almost a 3-fold difference). Accordingly, the numbers of spectra matching serum albumin, alpha-2-macroglobulin and serotransferrin, very abundant proteins in bovine serum, show a significant decrease after the equalization treatment (Supplementary Table 2, Supporting Information). In the raw sample, 5627 of a total of 12 983 matched spectra (43.3%) were assigned to bovine albumin, 1131 to alpha-2-macroglobulin (8.9%) and 821 (6.3%) to serotransferrin. After the equalization treatment, the values decreased to 969 out of 10 859 matched spectra for albumin (8.9%), 601 spectra for alpha-2-macroglobulin (5.5%) and 31 spectra for serotransferrin (0.3%). These data altogether confirm that the equalization process efficiently reduced the amount of the more abundant proteins, reducing the dynamic range of protein concentration and allowing the detection of a higher number of proteins.

Figure 2. Proteins identified in MDA-MB-231 (A) and MCF-7 (B) conditioned media before and after protein concentration equalization. The raw and Proteominer-treated CMs were analyzed by geLC-MS. The number of protein hits in the mammalian and human databases, together with the number of proteins validated as genuinely human (containing at least 2 D8-labeled peptides) are shown in the graph and reported in the table. Parameters used for identification and validation are described in the Materials and Methods section. Together, the 114 proteins identified in the MDA-MB-231 CM and the 303 proteins identified in the MCF-7 CM form the data set represented in Figure 3 (328 validated proteins).

Figure 3. Comparison of human proteins identified and validated in CMs after concentration equalization. Venn diagram of the number of validated human proteins found in the two linearized CMs: MDA-MB-231 and MCF-7. The two data sets of 114 proteins found in the MDA-MB-231 CM plus the 303 proteins found in the MCF-7 CM were aligned with the software Scaffold. Proteins containing 2 or more labeled peptides in at least one sample were validated as human.

The same two sets of spectra (raw and treated samples) were searched against the human database, as a first step in the identification of secreted cellular (human) proteins. By doing so, we identified 79 proteins in the raw sample and 298 in the equalized one (Supplementary Table 3, Supporting Information). These hits were screened for the presence of the D8-Val tag, to validate their human origin: only those proteins matched by at least two distinct D8-Val-containing peptides were considered reliable human proteins. The application of this constraint more than halved the number of retained proteins for the equalized sample and reduced it by 75% for the raw sample. We believe that most of the discarded matches were bovine proteins erroneously matched to the human database because of extensive sequence homology. Some of these proteins were matched with a large number of spectra, none of which, however, contained valine residues, since D8-Val was set as a fixed modification. For example, thrombin, vitamin D-binding protein, fibulin-1 and albumin were matched in the human database by 106, 24, 81, and 85 spectra, respectively.

Such spectra, however, corresponded to only 3, 4, 9, and 4 unique peptides, which turned out to be identical in the bovine and human sequences and contained no valine residues. Finally, 19 proteins were confidently validated as human in the data from the raw CM, while the number increased to 114 in the equalized sample. The 6-fold increase demonstrates once again the gain in detection of the less abundant proteins brought by the equalization process.

4. Shotgun Analysis of the MCF-7 Conditioned Medium. The approach described above was applied to the differential analysis of cell culture media containing 10% bovine serum conditioned by the two cell lines MDA-MB-231 and MCF-7, derived from breast cancer at different stages of tumor progression. The MCF-7 CM was labeled, equalized and analyzed by shotgun LC-MS as described for the MDA-MB-231 sample (Figure 2B and Supplementary Table 4, Supporting Information). The equalized sample (12 714 spectra) matched 640 proteins with at least 2 peptides in the mammalian database and 560 proteins in the human database, of which 303 were validated as human by the D8-Val tag. Far fewer proteins were identified in the raw nonequalized sample: 178 (13 550 spectra) in the mammalian database and 83 in the human one; 18 of the latter were validated as being of human origin using the D8-Val tag.

5. Differential Analysis of MDA-MB-231 and MCF-7 Conditioned Media. The number of identified human proteins was 2.7-fold higher in the MCF-7 sample than in the MDA-MB-231 one (303 vs 114 proteins), even though the numbers of submitted spectra were quite comparable (10 859 for MDA-MB-231 and 12 714 for MCF-7). In addition to this, many bovine proteins showed very similar spectral counts; albumin, for example, matched 999 spectra in the MCF-7 CM and 969 in the MDA-MB-231 one (Supplementary Table 4, Supporting Information). Taken together, these observations suggested that, even though we analyzed comparable total amounts of protein in the two CMs, the MCF-7 one actually contained a higher number of human proteins, since the bovine protein background was almost identical without need for normalization. We believe these differences were intrinsic to the sample and were not due to sample loading or recovery bias (see the Discussion section for probable causes). For this reason, no spectral count normalization based on total numbers of spectra was applied for the comparison of validated human proteins.

Lists of validated human proteins obtained from the analysis of both the MDA-MB-231 and MCF-7 equalized CMs were imported in Scaffold software, aligned and compared. Each protein identified by at least 2 different D8-Val labeled peptides in at least one of the two CMs was validated as human. Spectral counting was used as a crude quantitative measurement of protein concentration: the relative amount of each protein identified and validated as human in the two samples was estimated using the number of assigned spectra, as computed by the Scaffold software. Taking the two CMs together, a total of 328 proteins were validated as truly human cellular proteins (Supplementary Table 5, Supporting Information). Of these, 8 were found to be exclusively present in the MDA-MB-231 sample and 115 in the MCF-7 sample (Figure 3). The distribution of MDA-MB-231:MCF-7 spectral count ratios for the 205 proteins present in both CMs appeared Gaussian (Supplementary Figure 1, Supporting Information) and spanned from a minimum value of 0.014 to a maximum of 18, with a high number of proteins showing similar ratios close to the median value of 0.33.

Table 1. Proteins More Abundant in MDA-MB-231 Conditioned Medium(a)

Protein | UniProt ID | kDa | MDA-MB-231 spectral counts | MCF-7 spectral counts | MDA-MB-231/MCF-7 ratio
Latent-transforming growth factor β-binding protein 4 | Q8N2S1 | 173 | 29 | 9 | 3.2
Laminin α5 chain | O15230 | 400 | 61 | 15 | 4.1
Laminin γ1 | P11047 | 178 | 34 | 8 | 4.3
Proprotein convertase subtilisin/kexin type 9 | Q8NBP7 | 74 | 26 | 5 | 5.2
Galectin-3-binding protein | Q08380 | 65 | 43 | 8 | 5.4
Basement membrane-specific heparan sulfate proteoglycan core protein | P98160 | 469 | 27 | 3 | 9.0
Agrin | O00468 | 215 | 37 | 3 | 12.3
Insulin-like growth factor-binding protein 2 | P18065 | 35 | 16 | 1 | 16.0
L-lactate dehydrogenase B chain | P07195 | 37 | 18 | 1 | 18.0
Cadherin-related tumor suppressor homologue | Q14517 | 506 | 83 | 0 | only MDA

(a) List of proteins found more abundant in MDA-MB-231 CM. The protein description, UniProt accession code, molecular mass, and spectral counts in MDA-MB-231 and MCF-7 CMs are reported in the table.
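The spectral-count comparison can be reproduced from the counts reported in Table 1. The Python sketch below applies the criteria given in the Materials and Methods (at least 10 matched spectra in one CM and a difference greater than 3-fold, with no normalization between the two CMs) to those ten entries. It is only an illustration of the rule: the function name and the category labels are hypothetical, and the authors performed this comparison on the Scaffold output rather than with a script like this one.

```python
# Spectral counts (MDA-MB-231, MCF-7) copied from Table 1, keyed by UniProt ID.
TABLE1_COUNTS = {
    "Q8N2S1": (29, 9),
    "O15230": (61, 15),
    "P11047": (34, 8),
    "Q8NBP7": (26, 5),
    "Q08380": (43, 8),
    "P98160": (27, 3),
    "O00468": (37, 3),
    "P18065": (16, 1),
    "P07195": (18, 1),
    "Q14517": (83, 0),
}


def classify(mda: int, mcf: int, min_spectra: int = 10, fold: float = 3.0) -> str:
    """Label a protein from its raw (non-normalized) spectral counts."""
    if max(mda, mcf) < min_spectra:
        return "too few spectra"
    if mcf == 0:
        return "only MDA-MB-231"
    if mda == 0:
        return "only MCF-7"
    ratio = mda / mcf
    if ratio > fold:
        return "higher in MDA-MB-231"
    if ratio < 1.0 / fold:
        return "higher in MCF-7"
    return "unchanged"


if __name__ == "__main__":
    for accession, (mda, mcf) in TABLE1_COUNTS.items():
        ratio = f"{mda / mcf:.1f}" if mcf else "n/a"
        print(accession, ratio, classify(mda, mcf))
```

Because the counts are small integers, ratios such as 16/1 or 18/1 rest on a single spectrum in the denominator, which is one reason spectral counting is described in the text as a crude measurement.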
To highlight the proteins reliably and significantly more abundant in one of the two CMs, we retained only proteins matched by at least 10 spectra and considered as significantly varying only proteins with ratios higher than 3.0 or lower than 0.33. After applying these criteria, we found 9 proteins significantly more abundant in the MDA-MB-231 medium and 70 in the MCF-7 medium. In addition, one protein was detected exclusively in the MDA-MB-231 medium and 29 proteins only in the MCF-7 medium (Supplementary Table 6, Supporting Information). Among the proteins found to be differentially present in the two CMs were several growth factors and components of the extracellular matrix, proteins known to be secreted and to affect cell behaviors such as growth, motility, and invasiveness. However, typically intracellular proteins such as heat shock proteins, proteasome subunits, nuclear ribonucleoproteins, and metabolic enzymes were also found in significantly different amounts. These were generally more abundant in the MCF-7 sample, and we therefore speculate that they could have been passively released after cell death. The 10 proteins found more abundant in the MDA-MB-231 CM are reported in Table 1. These proteins are of higher biological interest because (i) owing to the lower “background” level in the MDA-MB-231 secretome, we can more reliably conclude that these proteins are highly secreted by this cell line and (ii) we are interested in factors specific for the more aggressive cells (MDA-MB-231).

Journal of Proteome Research • Vol. 8, No. 10, 2009

6. Verification of the Quantification. Due to the nature of the sample treatment, the suitability of “equalized” samples for differential analysis remains to be established. We explored the possibility of confirming, by Western blot on the raw CMs, the protein quantification based on spectral counts obtained from “equalized” samples. Unfortunately, it proved very difficult to find antibodies of confirmed specificity for the human proteins of interest that did not cross-react with the bovine homologues, owing to the extensive homology between human and bovine sequences. One of the few suitable antibodies was against laminin alpha 5, a protein found 4-fold more abundant in the MDA-MB-231 CM and not present in the bovine database. The blot was performed on MCF-7 and MDA-MB-231 crude CMs and is shown in Supplementary Figure 2 (Supporting Information). Densitometry analysis showed a 4.5-fold increase in the MDA-MB-231 CM, in good agreement with the quantification obtained by spectral counting. We found no other suitable antibody against differentially secreted proteins. We thus decided to use an SRM (selected reaction monitoring)41 approach targeting human-specific peptides to quantify human proteins without interference from the bovine homologues.
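Targeting human-specific peptides means, at a minimum, that a candidate sequence must not occur in the bovine homologue. The sketch below illustrates that check in Python, together with two further selection rules described in the following paragraph (no methionine or tryptophan; the three most intense transitions per peptide). The function names are hypothetical, the sequences and intensities are toy values rather than real LG3BP, IBP2, or FLNA data, and a plain substring match is a simplification of a proper sequence comparison.

```python
from typing import Dict, List


def is_human_specific(peptide: str, bovine_sequence: str) -> bool:
    """True if the peptide sequence does not occur in the bovine homologue."""
    return peptide not in bovine_sequence


def lacks_residues(peptide: str, excluded: str = "MW") -> bool:
    """True if the peptide contains none of the excluded residues (Met, Trp)."""
    return not any(residue in peptide for residue in excluded)


def top_transitions(intensities: Dict[str, float], n: int = 3) -> List[str]:
    """Return the n fragment labels with the highest observed intensity."""
    return sorted(intensities, key=intensities.get, reverse=True)[:n]


# Toy data only: invented sequences and intensities, not taken from the study.
bovine_toy = "MKAAGTLEKPSVDR"
candidates = ["AAGSLEK", "AAGTLEK", "LLWDEK"]

selected = [p for p in candidates
            if is_human_specific(p, bovine_toy) and lacks_residues(p)]
print(selected)  # ['AAGSLEK']

toy_intensities = {"y4": 120.0, "y5": 900.0, "y6": 450.0, "b3": 60.0}
print(top_transitions(toy_intensities))  # ['y5', 'y6', 'y4']
```

Here "AAGTLEK" is rejected because it also occurs in the toy bovine sequence, and "LLWDEK" because it contains tryptophan.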
Among the human proteins found to be differentially present in the two CMs (Supplementary Table 6, Supporting Information), we selected two proteins found by spectral counting to be more abundant in MDA-MB-231 (LG3BP and IBP2) and one more abundant in MCF-7 (FLNA). For each protein we selected two putative proteotypic peptides43 particularly suitable for SRM quantification. In particular, the 6 selected peptides had been detected during the previous LC-MS/MS analyses and provided intense MS/MS spectra; moreover, their sequences contained neither methionine nor tryptophan residues (Supplementary Table 6, Supporting Information). For each peptide, we selected the three strongest precursor/fragment ion transitions based on the previous LC-MS/MS data. The sequences of the LG3BP and IBP2 peptides were present only in the human proteins and not in the bovine orthologs. Conversely,


Figure 4. SRM quantification of 5 peptides representing three proteins in raw CMs obtained from MDA-MB-231 and MCF-7 cells. For each peptide, the mean and the standard deviation derived from 3 independent measurements are reported; the values are normalized against an internal standard calculated using 8 peptides derived from 3 bovine proteins. The relative abundance of each peptide in the two CMs is reported on top of the bars. The p-values from the Welch two-sample t test demonstrated *