Temporal Quantitative Proteomics by iTRAQ 2D-LC-MS/MS and Corresponding mRNA Expression Analysis Identify Post-Transcriptional Modulation of Actin-Cytoskeleton Regulators During TGF-β-Induced Epithelial-Mesenchymal Transition

Venkateshwar G. Keshamouni,*,† Pratik Jagtap,§ George Michailidis,| John R. Strahler,§ Rork Kuick,⊥ Ajaya Kumar Reka,† Panagiotis Papoulias,§ Rashmi Krishnapuram,† Anjaiah Srirangam,¶ Theodore J. Standiford,† Philip C. Andrews,§ and Gilbert S. Omenn‡,#,∇

Divisions of Pulmonary and Critical Care Medicine and of Molecular Medical Genetics, Department of Internal Medicine, Michigan Proteome Consortium, National Resource for Proteomics and Pathways, and Department of Biological Chemistry, Department of Statistics, Biostatistics Core, Comprehensive Cancer Center, Department of Human Genetics, and Center for Computational Medicine and Biology, University of Michigan, Ann Arbor, Michigan 48109, and Department of Medicine, Indiana University, Indianapolis, Indiana 46202

Received August 15, 2008

To gain insights into how TGF-β regulates epithelial-mesenchymal transition (EMT), we assessed the time course of proteins and mRNAs during EMT by multiplex iTRAQ labeling and 2D-LC-MS/MS, and by hybridization, respectively. Temporal iTRAQ analysis identified 66 proteins as differentially expressed during EMT, including the newly associated proteins calpain, fascin and macrophage-migration inhibitory factor (MIF). Comparing protein and mRNA expression over time showed that all 14 up-regulated proteins involved in actin-cytoskeleton remodeling were accompanied by increases in corresponding mRNA expression. Interestingly, siRNA-mediated knockdown of cofilin1 potentiated TGF-β-induced EMT. Further analysis of cofilin1 and β-actin revealed an increase in their mRNA stability in response to TGF-β, contributing to the observed increase in mRNA and protein expression. These results are the first demonstration of post-transcriptional regulation of cytoskeletal remodeling and of a key role for cofilin1 during TGF-β-induced EMT.

Keywords: iTRAQ • quantitative proteomics • TGF-β • lung cancer • epithelial-mesenchymal transition • actin-cytoskeleton remodeling • beta-actin • cofilin1 • calpain • ezrin • moesin

* To whom correspondence should be addressed. Venkateshwar Keshamouni, Ph.D., Division of Pulmonary and Critical Care Medicine, Department of Internal Medicine, University of Michigan Medical Center, 4062 BSRB, 109 Zina Pitcher Place, Ann Arbor, MI 48109. Phone, 734-936-7576; fax, 734-615-2331; e-mail, [email protected].
† Divisions of Pulmonary and Critical Care Medicine, University of Michigan.
§ Michigan Proteome Consortium, National Resource for Proteomics and Pathways, and Department of Biological Chemistry, University of Michigan.
| Department of Statistics, University of Michigan.
⊥ Biostatistics Core, Comprehensive Cancer Center, University of Michigan.
¶ Department of Medicine, Indiana University.
‡ Division of Molecular Medical Genetics, University of Michigan.
# Department of Human Genetics, University of Michigan.
∇ Center for Computational Medicine and Biology, University of Michigan.
10.1021/pr8006478 CCC: $40.75 © 2009 American Chemical Society
Journal of Proteome Research 2009, 8, 35–47. Published on Web 01/02/2009

Introduction

Cancer cells attain the migratory and invasive capacities necessary for metastasis by undergoing a phenotypic conversion referred to as the epithelial to mesenchymal transition (EMT).1,2 During EMT, epithelial cells acquire fibroblastoid morphology through down-regulation of epithelial-specific proteins and de novo expression of mesenchymal cell proteins.3 Cancer cells undergoing EMT attain self-sufficient autocrine growth signals, gain the ability to evade apoptosis, and become autonomous cellular entities with the capacity to spread throughout the whole organism from their origin in the primary tumor.3,4 Elucidation of the molecular mechanisms underlying EMT may provide a new basis for understanding and preventing tumor metastasis.3

The multifunctional cytokine TGF-β is a potent inducer of EMT in several biological systems.5 In cancer biology, TGF-β plays a complex role: it may function both as a tumor suppressor in early stages of tumor development and as a tumor promoter in late stages of tumor progression.6,7 TGF-β promotes tumor progression and metastasis by exerting pleiotropic effects on the neoplastic cells, including induction of EMT, and on the surrounding stroma.8,9 Increased expression of TGF-β occurs in many human cancers and is correlated with enhanced invasion and metastasis.10-12 Even though studies to date have identified several signaling pathways that are involved in TGF-β-induced EMT,4,13 the precise mechanisms by which TGF-β orchestrates the complex process of converting an epithelial cell into a mesenchymal-like cell are far from clear.

In the first phase of the present project, we demonstrated with quantitative differential proteomic analysis that TGF-β-induced EMT promotes migratory and invasive abilities of A549 lung adenocarcinoma cells and identified several proteins that are potentially important regulators of EMT.14 When an epithelial cell is transitioning into a mesenchymal-like cell, it undergoes a robust actin-cytoskeletal reorganization that favors the migratory phenotype of the mesenchymal cell.4,15 Many of the proteins we identified as upregulated during EMT are critical for actin-cytoskeleton remodeling.16,17 To gain insights into the potential mechanisms by which TGF-β regulates these proteins, we performed a temporal quantitative protein expression analysis and correlated the protein expression with corresponding temporal mRNA expression.

For quantitative global protein expression profiling, we employed iTRAQ labeling followed by 2D-LC-MS/MS analysis. In the iTRAQ methodology, peptides are labeled with isobaric chemical tags, which avoid mass heterogeneity for the parent ions and instead rely on quantitation of reporter groups released from the chemical tag during collision-induced dissociation.18 Among the advantages of this method is the ability to use up to four isotope tags in a single four-plex experiment. Exploiting this capability, we have analyzed quantitative differential protein expression at seven different time points in two 4-plex iTRAQ-labeling experiments, by including a control sample (0 h) in each 4-plex for effective normalization. We have demonstrated the ability of the isobaric tags to detect and quantify time-dependent differences in expression levels of proteins between TGF-β-treated and untreated A549 cells that reflect the functional phenotype observed during EMT. Temporal mRNA expression profiling was carried out using Affymetrix human genome arrays.

Table 1. iTRAQ Labeling Scheme for Each 4-Plex Reaction (a)

                       iTRAQ labels                      iTRAQ labels
                       114   115    116    117           114   115    116    117
1st replicate   (1A)   C     2 h    4 h    8 h    (1B)   C     16 h   24 h   48 h
2nd replicate   (2A)   C     72 h   72 h   8 h    (2B)   C     16 h   24 h   48 h

(a) To assess protein expression at different time points, two parallel 4-plex labeling reactions (A and B) were performed, with the control sample (C) labeled with the 114 tag in both 4-plex reactions. The analysis was performed with two independent biological replicates. In the second replicate, the 2 and 4 h time points in reaction "A" were replaced with two biological replicates of the 72 h time point, because of the lack of significant differential expression at 2 and 4 h in the first replicate, to assess an additional time point.
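As an illustration of how the reporter channels map onto samples, the labeling scheme of Table 1 can be written as a lookup table and used to turn reporter areas into ratios against the 114 control. This is a minimal sketch, not the authors' software: the names `LABELING_SCHEME` and `ratios_to_control` and the example reporter areas are hypothetical.

```python
# Labeling scheme transcribed from Table 1: reporter channel -> sample.
# "C" is the untreated control (0 h); numbers are hours of TGF-beta treatment.
LABELING_SCHEME = {
    ("replicate 1", "A"): {114: "C", 115: 2, 116: 4, 117: 8},
    ("replicate 1", "B"): {114: "C", 115: 16, 116: 24, 117: 48},
    ("replicate 2", "A"): {114: "C", 115: 72, 116: 72, 117: 8},
    ("replicate 2", "B"): {114: "C", 115: 16, 116: 24, 117: 48},
}


def ratios_to_control(reporter_areas, channels):
    """Return {time point in h: [ratios]} of each treated channel to the 114 control.

    A time point that occupies two channels (72 h in reaction 2A) yields two ratios.
    """
    control = reporter_areas[114]
    ratios = {}
    for channel, sample in channels.items():
        if sample == "C":
            continue  # the control channel is the denominator, not a treatment
        ratios.setdefault(sample, []).append(reporter_areas[channel] / control)
    return ratios


# Hypothetical reporter areas for one MS/MS spectrum (illustrative values only).
example_areas = {114: 1000.0, 115: 1100.0, 116: 900.0, 117: 1500.0}
example = ratios_to_control(example_areas, LABELING_SCHEME[("replicate 2", "A")])
```

Keeping the scheme as data rather than hard-coding channel meanings makes the replicate-specific swap in reaction A explicit when ratios from different 4-plex reactions are pooled.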

Experimental Procedures

Cell Culture. The A549 human lung adenocarcinoma cell line was obtained from the American Type Culture Collection (Manassas, VA), maintained in RPMI-1640 medium with glutamine, supplemented with 10% FBS, penicillin, and streptomycin, and tested for mycoplasma contamination. All tissue culture media and media supplements were purchased from Life Technologies (Gaithersburg, MD). The porcine transforming growth factor beta 1 (TGF-β) was purchased from R&D Systems (Minneapolis, MN). Cells were serum-starved for 24 h prior to TGF-β treatment and then treated with TGF-β (5 ng/mL) for the indicated times in the absence of serum.

siRNA Transfection. A549 cells were plated at 40-50% confluency; after 24 h, cells were transfected with the indicated concentrations of a pool of 4 gene-specific siRNAs (SMARTpool from Dharmacon, Inc.) or scrambled control siRNA using Lipofectamine following the manufacturer's protocol. After 4-6 h of transfection, cells were grown in 10% FBS for 24 h and serum-starved for the next 24 h, before being cultured in the presence or absence of TGF-β for the indicated time and concentration. Cells were either lysed with RIPA lysis buffer for Western blotting or photographed under a light microscope.

Western Blotting. Crude protein extracts were obtained by lysing 5 × 10^6 cells in a buffer containing 50 mM Tris-HCl (pH 7.6), 1% Nonidet P-40, 2 mM EDTA, 0.5% sodium deoxycholate, 150 mM NaCl, 1 mM sodium orthovanadate, 2 mM EGTA, 4 mM sodium p-nitrophenyl phosphate, and 100 mM sodium fluoride supplemented with protease inhibitors (leupeptin 0.5%, aprotinin 0.5% and phenylmethylsulfonyl fluoride 0.02%). Samples containing 20 µg of total protein were electrophoresed on SDS-polyacrylamide gels and transferred onto a PVDF membrane by electroblotting. Membranes were then probed with different primary antibodies as indicated, followed by horseradish peroxidase-conjugated mouse or rabbit secondary antibodies and West Pico chemiluminescence detection reagents, both from Pierce (Rockford, IL). Antibodies against the following proteins (by source) were used in the analysis: β-actin (Sigma, St. Louis, MO); cofilin1 (Chemicon, Inc., Temecula, CA); moesin (Cell Signaling, CA). Western blot analysis was performed three times for each protein, and representative data from a single experiment are provided.

Protein Isobaric Labeling with iTRAQ Reagents. Total cell lysates were prepared as described for Western blotting, and cellular debris was removed by centrifugation. Protein concentration was measured by the Bio-Rad DC protein assay. For the four-plex isobaric labeling, separate aliquots of A549 cell lysate were treated in parallel essentially as described by Ross et al.18 Stock reagents and buffers (TEAB, SDS, TCEP, MMTS and the four isobaric tagging reagents) were obtained from Applied Biosystems. Protein (100 µg) was reduced with 2.5 mM TCEP (60 °C for 1 h), and cysteine residues were blocked with 10 mM MMTS (room temperature for 15 min). Protein was precipitated with ice-cold acetone (90%, v/v) overnight at -20 °C. Protein was collected by centrifugation, and the protein pellet was washed once with ice-cold 90% acetone and air-dried. Protein was suspended in buffer (5 mg/mL final, in 20 µL of 0.5 M TEAB-0.1% SDS) and digested with trypsin (porcine modified, Promega; 1:20, w/w, 40 °C for 20 h).
Isobaric tagging iTRAQ reagent (1 unit in ethanol) was added directly to the protein digest (70% ethanol final), and the mixture was incubated at room temperature for 1 h. To assess protein expression at different time points, two parallel 4-plex labeling reactions (A and B) were performed, with the control sample labeled with the 114 reporter tag in both 4-plex reactions (Table 1). In the second replicate, the 2 and 4 h time point samples in reaction A were replaced, because of the lack of significant differential expression, by two biological replicates of the 72 h time point to assess an additional time point (Table 1). The labeling reactions were quenched by addition of 9 vol of 0.1% TFA in water. The four iTRAQ-labeled (114, 115, 116 and 117) peptide pools were then mixed together in a 1:1:1:1 ratio and stored at -20 °C.

SCX Peptide Fractionation. For the first dimension of the two-dimension chromatographic separation, an aliquot of the four-plex peptide mixture (120 µg) was applied to an SCX (sulfethyl aspartamide, PolyLC) spin column equilibrated with 10 mM KH2PO4, pH 4.5, 20% CH3CN. For peptide adsorption to the column and the subsequent washing and elution steps, a centrifugal force of 300g was used. Excess reagent was washed from the column with 800 µL (approximately 10 column volumes) of equilibration buffer. Peptides were eluted using 50 µL volumes of KCl in a stepwise gradient from 25 to 350 mM in equilibration buffer. Each set of iTRAQ-labeled peptides was subjected to salt fractionation (12 salt fractions; 25-350 mM). Fractions were dried in a vacuum centrifuge.

Reverse Phase LC. For the second-dimension separation, peptides in SCX fractions were separated by C18 nano LC using an 1100 Series nano HPLC equipped with a µWPS autosampler, 2/10 microvalve, MWD UV detector (214 nm) and Micro-FC fraction collector/spotter (Agilent). Each SCX salt step fraction was reconstituted with 43 µL of 0.1% TFA, and 40 µL of this was injected. With the valve in the LOAD position, sample was injected onto a C18 cartridge (Zorbax300SB, 5 µm, 0.3 × 5 mm; Agilent) and desalted with solvent C (CH3CN/H2O/TFA, 5:95:0.1) at 20 µL/min for 9 min, with the effluent directed to waste. In the ELUTE position, the enrichment cartridge was placed ahead of a C18 column (Zorbax300SB, 3.5 µm, 0.075 × 15 mm; Agilent) equilibrated with solvent A (CH3CN/H2O/TFA, 6.5:93.5:0.1). Peptides were eluted with a gradient of solvent B (CH3CN/H2O/TFA, 90:10:0.1) from 6.5% B to 50% B over 90 min at a flow rate of 0.4 µL/min. Column effluent was mixed (micro Tee, Agilent) with matrix (2.5 mg/mL α-CHCA in CH3OH/isopropanol/CH3CN/H2O/acetic acid (15:30:21:33:0.6)) delivered with a PHD200 infusion pump (Harvard Apparatus) at 0.9 µL/min. Fractions were spotted at 25 s intervals onto a stainless steel MALDI target plate (192 wells/plate, Applied Biosystems).

Data Analysis. Mass spectra were acquired on an Applied Biosystems model 4700 Proteomics Analyzer (TOF/TOF) using 4000 Series Explorer software (v. 3.0). MS spectra from m/z 800-3500 were acquired for each fraction using 1500 laser shots.
The eight most intense peaks in each MS spectrum above an S/N threshold of 100 were selected for MS/MS analysis. MS/MS spectra were derived from 2200-8000 laser shots. To look for less abundant proteins, the target plates were reinterrogated and MS/MS spectra of the next eight most intense peaks (if any) were acquired. Fragmentation of the labeled peptides was induced by the use of atmosphere as a collision gas with a pressure of ∼6 × 10^-7 Torr and a collision energy of 1 kV. For peak detection, 7-point Gaussian smoothing and an S/N of 20 were applied using cluster area S/N optimization. Concatenated peaklists were constructed using the T2D Extractor (T2DE) 3.0 developed in house. The T2DE tool extracts the peaks for each spectrum in an LC-MS run from the Applied Biosystems 4700/4800 Oracle database and assembles them into a single concatenated mgf formatted file. For an iTRAQ experiment, this tool also extracts the reporter group areas and corrects for overlapping isotope distributions according to the certificate of analysis provided by Applied Biosystems (see Supplementary Materials and Methods S1 in Supporting Information for mathematical analysis). The corrected reporter group areas were stored in the title section of the mgf peaklists. In addition, spectral information such as spotset name, well number, and peptide mass was also saved in the title section of each peaklist.

Peptide identifications were obtained using the merged mgf file for each data set, searched against a concatenated "target-decoy" Human IPI version 3.14 database (date of release: Jan 2006) with 57 366 target sequences, for a total of 114 732 sequences (target and reverse/decoy). The search algorithms Sequest (version 27, rev 12), X!Tandem (version 2006.09.15.3), Mascot (version 2.1.0) and Phenyx (January 2007) were used to maximize the number of peptides identified. Similar parameters were used for all search algorithms. Trypsin specificity with two missed cleavages was selected. S-mercaptomethylcysteine and the N-terminal and lysine iTRAQ labels were selected as fixed modifications; oxidized methionines, deamidation on N/Q and tyrosine iTRAQ were considered as variable modifications. The precursor tolerance and MS/MS fragment tolerance were both set to 0.5 Da. Sequest, X!Tandem and Mascot results were independently subjected to Scaffold (version 01_06_03) analysis using individual thresholds. Phenyx has yet to be integrated into Scaffold; therefore, Phenyx searches were not subjected to Scaffold analysis.

Subsequently, each search algorithm was used for reverse database analysis to calculate a 1% false positive rate (FPR) at the peptide level. For FPR calculation, search results for each search algorithm were sorted in decreasing order of score (ion score for Mascot, -(log E value) for X!Tandem, delta Cn score for Sequest and Z-score for Phenyx). Decoy (reverse sequence) hits were identified by matches to IPI numbers with reverse sequences in the "target-decoy" database. The following formula was used to calculate the aggregate FPR:

$$\%\,\mathrm{FPR} = \frac{2 \times (\text{number of reverse hits})}{\text{total number of identifications}} \times 100$$

Table 2 lists decoy hits (reverse hits) for each set of spectra identified at 0.1%, 0.5% and 1% FPR. Threshold scores for each algorithm are also listed. A peptide identification was accepted only if it made the list that had ≤1% FPR. Peptide identifications (≤1% FPR) from all search algorithms (X!Tandem, Sequest, Phenyx and Mascot) from both replicates were merged and analyzed for protein identification by these search algorithms. We observed rare conflicting assignments of the same spectra to different peptides by different search algorithms. These conflicting identifications were resolved to generate a combined raw data sheet (CRDS).
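The FPR formula and the score cutoff it implies can be sketched in a few lines. This is an illustrative reading of the procedure described above, not the authors' code: the function names are hypothetical, each hit is assumed to be a `(score, is_decoy)` pair, and ties between equal scores are not treated specially.

```python
def percent_fpr(n_reverse_hits, n_identifications):
    """% FPR = 2 x (reverse hits) / (total identifications) x 100."""
    return 2.0 * n_reverse_hits / n_identifications * 100.0


def score_threshold(hits, max_fpr_percent=1.0):
    """Find the longest score-ranked list whose FPR stays at or below the limit.

    hits: iterable of (score, is_decoy) pairs, higher score = better match.
    Returns (threshold score, number of accepted identifications), or None
    if no prefix of the ranked list satisfies the limit.
    """
    ranked = sorted(hits, key=lambda h: h[0], reverse=True)
    best = None
    n_decoy = 0
    for n, (score, is_decoy) in enumerate(ranked, start=1):
        n_decoy += bool(is_decoy)
        if percent_fpr(n_decoy, n) <= max_fpr_percent:
            best = (score, n)
    return best


# Combined data set, first replicate (Table 2B): 121 reverse hits among
# 10 110 identifications gives an FPR of about 2.4%.
combined_fpr = percent_fpr(121, 10110)
```

The factor of 2 reflects the assumption that, in a concatenated target-decoy search, each observed reverse hit is matched by roughly one undetected false hit among the target sequences.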
For a spectrum, identical peptides identified by a majority of algorithms (4 vs 0, 3 vs 1, 2 vs 1, 3 vs 0, 2 vs 0, 1 vs 0) were retained, while the rest were eliminated from the CRDS. Protein grouping of peptides was carried out using the ProteinProphet algorithm embedded within the Scaffold software. Peptide identifications from the Mascot, X!Tandem and Sequest searches were merged with quantitative information by the iTRAQtive 1.0 software tool developed in-house. While merging Mascot and X!Tandem searches was straightforward, iTRAQtive was especially useful in associating the dta formatted peaklists used for the Sequest search with their parent mgf file. iTRAQtive associates the reporter group areas stored in the mgf file with the dta formatted peaklists used in Sequest searches. An MD5 hash is calculated for each peaklist in the mgf file and for each dta peaklist used by Sequest. The tool then compares the MD5 hashes of the individual peaklists in the mgf formatted file against the MD5 hashes of the peaklists in the dta formatted files and integrates the result of the match into the Scaffold report. The CRDS also contained corrected peak areas for the iTRAQ labels 114.1, 115.1, 116.1 and 117.1. The corrected peak areas represent quantitative measurements observed at m/z 114.1, 115.1, 116.1 and 117.1, which correspond to the various time points indicated in Table 1.

Table 2. Comparison of the Results from 4 Individual Search Algorithms and Their Combined Search Results after Resolving Conflicts (a)

A
                                         X!Tandem                                             Mascot
replicate (total spectra)   FPR    -log(e) score  ID spectra  reverse hits  % IDs     ion score  ID spectra  reverse hits  % IDs
1st replicate (15380)       0.1%   2.96           7205        3             46.8      47.7       6487        3             42.2
                            0.5%   1.22           9129        19            59.4      35.5       8356        20            54.3
                            1.0%   1.22           9129        19            59.4      27.7       9305        46            60.5
2nd replicate (12197)       0.1%   1.2            8397        2             68.6      40.9       6799        3             55.7
                            0.5%   1.2            8397        2             68.6      32.9       8064        20            66
                            1.0%   1.2            8397        2             68.6      25.3       8743        30            71.7

                                         Sequest                                              Phenyx
replicate (total spectra)   FPR    ΔCn score      ID spectra  reverse hits  % IDs     Z-score    ID spectra  reverse hits  % IDs
1st replicate (15380)       0.1%   0.247          6303        3             41        6.54       6573        3             42.7
                            0.5%   0.199          7319        18            47.6      5.74       7719        19            50.2
                            1.0%   0.16           8062        40            52.4      5          8478        36            55.1
2nd replicate (12197)       0.1%   0.201          6876        3             56.4      7.88       4018        2             32.9
                            0.5%   0.157          7654        19            62.8      6.66       6209        16            50.9
                            1.0%   0.135          8002        39            65.6      5.67       7864        38            62.8

B
combined raw data set   total spectra   peptides identified   reverse hits   FPR (%)   % IDs
1st replicate           15380           10110                 121            2.4       65.7
2nd replicate           12197           9431                  99             2.1       77.3

(a) Table 2A shows database search results for peptide identification using 4 different algorithms. For each algorithm and each replicate, it lists the number of spectra assigned to a peptide (ID spectra), the number of spectra assigned to a peptide in the decoy database (reverse hits), and the percentage of spectra assigned (% IDs) at three different false positive rates (FPR). The corresponding threshold scores (-log(e) score for X!Tandem, ion score for Mascot, ΔCn score for Sequest, and Z-score for Phenyx) are provided. Table 2B shows the total spectra, peptides identified, number of reverse hits, % IDs and the calculated FPR of the data set obtained after combining search results from the 4 algorithms and resolving conflicts. This data set was used for protein identification and quantification, with the condition that each peptide must have been identified by at least one search algorithm at 1% FPR.

Normalization of Data. To control the normalization of the data across multiple 4-plex reactions from two independent biological replicates (Table 1), an indirect design was used by including the control in the 114 channel of every 4-plex labeling reaction.19 By first setting the average ratio of the 114 channel measurements between different experiments to 1 and subsequently using the normalization procedure described by Keshamouni et al.,14 we were able to make consistent comparisons across time points obtained from different iTRAQ 4-plex experiments. Normalization was accomplished by matching the quantiles of the distributions of each treatment iTRAQ reporter group (115.1, 116.1 and 117.1) to those of the control iTRAQ reporter group (114.1, which corresponded to the 0 time point). Analysis of the first biological replicate using our random effects analysis of variance model14,20 revealed very little change at the 2 and 4 h time points. Specifically, only 6 and 3 proteins of 542 were detected as differentially expressed at 2 and 4 h, respectively, at the 0.05 significance level. Note that, just by chance, about 27 proteins (5% of 542) are expected to exhibit differential expression at that level of significance. We measured protein concentrations, not rates of new protein synthesis. Therefore, in the second biological replicate, we extended the time course to 72 h and replaced the 2 and 4 h time points with two 72 h biological replicates (Table 1).

Analysis of Differential Protein Expression during Time Course. To determine relative ratios, we took into consideration that each protein can potentially be identified by a number of different peptides and that every unique peptide can be measured multiple times.14 Specifically, let $r(i,j,k,l)$ denote the ratio of the corrected and normalized peak area of MS/MS spectrum $k$ corresponding to identified peptide $j$ for protein $i$, for labeled sample $l$ = 115, 116 and 117 (the treatments), to the corresponding corrected and normalized peak area for the control sample 114. We drop the indices $i$ and $l$, since the proposed model was applied to each protein separately and to all the treatment conditions. These ratios are modeled on the log2 scale, in order to deal with a variable that has unbounded support and is symmetric around 0. We then posit the following random one-way analysis of variance model:

$$\log_2 r(j,k) = \alpha + \alpha(j) + u(j,k)$$

where $\alpha$ corresponds to the relative abundance of the protein in the labeled samples, $\alpha(j)$ is a peptide-specific effect, and $u(j,k)$ is an MS/MS spectrum effect. The last two effects are assumed to be normally distributed with mean zero and constant variances, $\sigma^2$ and $\tau^2$, respectively, both being treated as random components to be estimated from the data. In summary, the model accounts for the variability in the observed MS/MS measurements both at the MS/MS level and at the peptide level, by modeling every observed relative peak area ratio as a protein-level overall ratio, plus a peptide-specific value and an individual MS/MS spectrum perturbation. It also allows us to formally test the hypothesis that the overall protein ratio $\alpha$ is different from 0 on the log2 scale, which translates to the hypothesis of being different from 1 on the original ratio scale. Proteins with P-values for this hypothesis less than or equal to 0.05 were considered statistically significant. Only statistically significant differentially expressed proteins that also exhibited an increase or decrease greater than or equal to 20% (on the original ratio scale) were selected as overall significant. The 20% cutoff is based on our extensive experience with iTRAQ data and on our experimental validation by Western blotting for proteins exhibiting differential expression as low as 23%.14

Time Course mRNA Expression Analysis. A549 cells were stimulated with 5 ng/mL of TGF-β and harvested at 0, 0.5, 1, 2, 4, 8, 16, 24, and 72 h after TGF-β stimulation, and total RNA was isolated. From three biological replicates, RNA transcripts were assayed using Affymetrix HG-U133_plus_2 arrays containing 54 675 probe-sets, representing approximately 20 000 distinct genes.

Array Processing. Probe-set intensities were obtained using publicly available software developed at the University of Michigan.21,22 Briefly, probe-set intensities were obtained as the average of perfect match minus mismatch probe values after discarding the top and bottom 20% of the values for each probe-set on each array. A time '0' sample was selected as the standard for normalization and was scaled to give an average probe-set intensity of 1000 units. Other arrays were normalized to the standard by transforming with a piecewise linear function that made 99 evenly spaced quantiles (at 0.01, 0.02,..., 0.99) agree with the quantiles of the standard. Data were log-transformed using log(max(x + 50,0) + 50). Fold-changes were estimated as the antilog of the average differences for the log-transformed data.

Analysis of Differential Gene Expression. Two-way ANOVA models with effects for 3 experiments and 9 time points were fit to the data for each probe-set. We compared each time point with the 0 h time point and counted probe-sets for which the p-value for the comparison was smaller than 0.01. Entrez gene identifiers corresponding to the IPI accession numbers of identified proteins were obtained using human IPI Cross Reference data, and probe-sets from the arrays were associated with these using gene identifiers from the Affymetrix annotation (dated 11 July, 2007).
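The probe-set summarization, log transform and fold-change estimate described under Array Processing can be sketched as follows. This is a simplified illustration rather than the University of Michigan software: the function names are hypothetical, the number of values trimmed from each end is assumed to round down, and the natural logarithm is assumed because the base of the log is not stated (the fold-change does not depend on the base).

```python
import math


def trimmed_mean_pm_minus_mm(pm, mm, trim=0.20):
    """Average the PM - MM differences of one probe-set after discarding
    the top and bottom `trim` fraction of the values."""
    diffs = sorted(p - m for p, m in zip(pm, mm))
    k = int(len(diffs) * trim)  # values dropped from each end (rounded down)
    kept = diffs[k:len(diffs) - k]
    return sum(kept) / len(kept)


def log_transform(x):
    """log(max(x + 50, 0) + 50); the result is defined even for negative x."""
    return math.log(max(x + 50.0, 0.0) + 50.0)


def fold_change(treated, control):
    """Antilog of the difference between mean log-transformed intensities."""
    mean_treated = sum(log_transform(x) for x in treated) / len(treated)
    mean_control = sum(log_transform(x) for x in control) / len(control)
    return math.exp(mean_treated - mean_control)
```

The offsets in the transform keep negative PM minus MM averages finite and damp fold-changes between very low intensities, where measurements are dominated by noise.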
In cases where a gene was represented by more than one probe-set, we report data from the probe-set that had the smallest p-value for the overall F-test of the hypothesis that there were no differences between any time points.

Assessment of mRNA Stability. After 24 h of serum starvation, A549 cells were cultured in the presence or absence of 5 ng/mL of TGF-β. After 24 h of TGF-β treatment, cellular transcription was inhibited by adding 10 µg/mL of Actinomycin D (Sigma-A9415), and cells were harvested every 2 h for the next 10 h. Total RNA was extracted using TRIzol reagent (Invitrogen). mRNA levels were quantitated by QRT-PCR against a GAPDH control. Percent remaining mRNA was calculated after addition of Actinomycin D both in the presence and absence of TGF-β, and the data were subjected to linear regression analysis to generate mRNA decay curves using SigmaPlot. The following primers and probes were used in one-step Taqman PCR reactions: for β-actin, forward, 5′-TCCAGCAGATGTGGATCAGC; reverse, 5′-CTAGAAGCATTTGCGGTGGAC; probe, 5′-6FAMCAGGAGTATGACGAGTCCGGCCCCTAM-3′; for cofilin-1, forward, 5′-TCTCGGTGCCCTCTCCTTTT; reverse, 5′-TGACACCATCAGAGACAGCCA; probe, 5′-6-FAMTCCGGAAACATGGCCTCCGGTTAM-3′. The thermal cycling conditions employed were 48 °C for 30 min for the RT step and 95 °C for 10 min for the AmpliTaq Gold activation step, followed by 40 cycles of denaturation at 95 °C for 15 s and annealing/extension at 60 °C for 1 min, using the ABI sequence detection system.
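The decay-curve step can be illustrated with an ordinary least-squares fit of percent remaining mRNA against time after Actinomycin D addition. This is a sketch under stated assumptions, not the SigmaPlot analysis itself: the regression is assumed to be on the untransformed percent scale, the function names are hypothetical, the data points are invented for illustration, and reading off the time at which 50% remains is an added convenience that the text does not describe.

```python
def fit_line(times_h, percent_remaining):
    """Ordinary least-squares fit of percent remaining mRNA versus time.

    Returns (slope in percent per hour, intercept in percent).
    """
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(percent_remaining) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (y - mean_y)
              for t, y in zip(times_h, percent_remaining))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return slope, intercept


def time_to_half(slope, intercept):
    """Time (h) at which the fitted line crosses 50% remaining mRNA."""
    return (50.0 - intercept) / slope


# Hypothetical measurements at the sampling times used in the text
# (every 2 h for 10 h after Actinomycin D addition).
times = [0, 2, 4, 6, 8, 10]
untreated = [100.0, 90.0, 80.0, 70.0, 60.0, 50.0]
slope, intercept = fit_line(times, untreated)
```

A shallower slope in TGF-β-treated cells than in untreated cells would correspond to increased mRNA stability.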


Results

Identification and Quantitation of Temporal Protein Expression During EMT Using Multiplex iTRAQ Labeling and 2D-LC-MS/MS Analysis. Stimulation of A549 lung adenocarcinoma cells with TGF-β induces EMT.14 A549 cells, grown to 50-60% confluence and serum-starved for 24 h, were treated with 5 ng/mL of TGF-β. Total cell lysates were prepared at the indicated time points in RIPA lysis buffer and stored at -80 °C until further analysis. EMT was confirmed by assessing time-dependent E-cadherin downregulation and vimentin upregulation using Western blotting (as in ref 14; data not shown).

To assess temporal, quantitative protein expression during EMT, A549 cell lysates from two biological replicates were subjected to iTRAQ labeling (Table 1), followed by 2D-LC-MS/MS analysis as described under Experimental Procedures. Proteins were identified on the basis of having at least one distinct peptide with a threshold of 1% false positive rate using the reverse database search. In total (including both biological replicates), 590 distinct proteins were identified for the 8 and 72 h time points, and 586 proteins were identified for the 16, 24 and 48 h time points. These protein identifications were from 19 541 peptide sequences deduced from 27 577 MS/MS spectra. All protein identifications made on single distinct peptides were discarded, leaving 255 proteins from the 8 and 72 h time points and 251 proteins from the 16, 24 and 48 h time points. These protein numbers are based on two or more distinct peptide identifications, which we used for further analysis of differential expression. The details of protein identifications in both biological replicates are presented in Table 2. The data were normalized as described under Experimental Procedures to compensate for differences in sample preparation, so that all four peak area measurements have similar statistics.
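The quantile matching behind this normalization can be sketched as a rank-based mapping of a treatment channel onto the control channel. This is a generic illustration, not the authors' implementation (their procedure is described in ref 14): the function name and toy values are hypothetical, and the sketch assumes both channels hold the same number of measurements.

```python
def quantile_match(treatment, control):
    """Replace each treatment value with the control value of the same rank,
    so that the normalized channel has the same distribution as the control.
    Tied treatment values keep their original relative order."""
    if len(treatment) != len(control):
        raise ValueError("channels must contain the same number of measurements")
    sorted_control = sorted(control)
    order = sorted(range(len(treatment)), key=lambda i: treatment[i])
    normalized = [0.0] * len(treatment)
    for rank, i in enumerate(order):
        normalized[i] = sorted_control[rank]
    return normalized


# Toy reporter areas: the 115 channel is mapped onto the 114 control channel.
area_115 = [30.0, 10.0, 20.0]
area_114 = [1.0, 2.0, 3.0]
normalized_115 = quantile_match(area_115, area_114)
```

After this mapping, any systematic difference in overall signal between the channels is removed while the rank order of the measurements within the treatment channel is preserved.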
To infer up- and downregulation of proteins, the peptide data were aggregated on the basis of the combined search results from all four search algorithms to yield ratios for proteins, using the ANOVA model described in Experimental Procedures. For quantitative analysis, both biological replicates were analyzed separately as well as together, so that the reproducibility of the data set could be assessed. The use of multiple search algorithms increases the number of peptides and proteins identified (Table 2). In particular, for the first replicate, combining results from all search algorithms (10 110 peptides identified at 2.4% FPR) resulted in a 25% increase over peptides identified by Sequest (8062 peptides) and an 8% increase over peptides identified by Mascot (9305 peptides). Similarly, for the second replicate, combining results from all search algorithms (9431 peptides identified at 2.1% FPR) resulted in a 20% increase over peptides identified by Phenyx (7864 peptides) and an 8% increase over peptides identified by Mascot (8743 peptides). In addition, combining search results from multiple algorithms resulted in identification of several proteins with a greater number of total peptides, unique peptides, or both. This increased the sample size and allowed more reliable peptide-based quantification. Table 3 shows 66 proteins that were identified as differentially expressed at least at one time point assessed, with a p-value less than 0.05 for statistical significance and a threshold of 20% (ratios >1.20 or 1.2 or