Quantitative Proteome and Transcriptome Analysis of the Archaeon

Quantitative Proteome and Transcriptome Analysis of the Archaeon Thermoplasma acidophilum Cultured under Aerobic and Anaerobic Conditions
Na Sun,† Cuiping Pan,‡ Stephan Nickell,† Matthias Mann,‡ Wolfgang Baumeister,† and István Nagy*,†
Department of Molecular Structural Biology, Max Planck Institute of Biochemistry, Am Klopferspitz 18, D-82152 Martinsried, Germany, and Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Am Klopferspitz 18, D-82152 Martinsried, Germany
Received June 7, 2010

A comparative proteome and transcriptome analysis of Thermoplasma acidophilum cultured under aerobic and anaerobic conditions has been performed. One-thousand twenty-five proteins were identified covering 88% of the cytosolic proteome. Using a label-free quantitation method, we found that approximately one-quarter of the identified proteome (263 proteins) were significantly induced (>2 fold) under anaerobic conditions. Thirty-nine macromolecular complexes were identified, of which 28 were quantified and 15 were regulated under anaerobiosis. In parallel, a whole genome cDNA microarray analysis was performed showing that the expression levels of 445 genes were influenced by the absence of oxygen. Interestingly, more than 40% of the membrane protein-encoding genes (145 out of 335 ORFs) were up- or down-regulated at the mRNA level. Many of these proteins are functionally associated with extracellular protein or peptide degradation or ion and amino acid transport. Comparison of the transcriptome and proteome showed only a weak positive correlation between mRNA and protein expression changes, which is indicative of extensive post-transcriptional regulatory mechanisms in T. acidophilum. Integration of transcriptomics and proteomics data generated hypotheses for physiological adaptations of the cells to anaerobiosis, and the quantitative proteomics data together with quantitative analysis of protein complexes provide a platform for correlation of MS-based proteomics studies with cryo-electron tomography-based visual proteomics approaches. Keywords: transcriptome and label-free quantitative proteome analysis • aerobic and anaerobic conditions • Thermoplasma acidophilum

Introduction

Features like a small genome, a small cell size, and the lack of a rigid cell wall make Thermoplasma acidophilum an attractive model organism for “visual proteomics” studies, which aim to provide a comprehensive cellular atlas of macromolecular complexes using cryo-electron tomography (cryoET) and pattern recognition methods.1 Once a comprehensive atlas is established, it could be used to deduce a molecular interaction network depicting the molecular sociology of the cell. For undertaking such a project, a detailed analysis of the molecular inventory of the organism under scrutiny is a prerequisite. T. acidophilum is a thermoacidophilic archaeon that grows optimally at 55-60 °C and pH 0.5-4. When it grows (micro)aerobically, it uses O2 as a terminal electron acceptor, but when it is cultured anaerobically, elemental sulfur (S0) is the terminal electron acceptor.2 This physiological versatility together with the relative simplicity of the cell provides a good platform to study the oxygen-induced changes at both protein and mRNA expression levels. Furthermore, quantitative proteomics data can be used as a cross-reference to validate cryoET pattern recognition results obtained for T. acidophilum cells. Knowledge of protein abundances and of the changes induced by physiological conditions is an important prerequisite for establishing quantitative predictive models of cellular behavior.

Most quantitative proteomics studies use metabolic or chemical labeling methods, such as stable isotope labeling by amino acids in cell culture (SILAC),3 14N/15N labeling,4 18O labeling,5 isotope-coded affinity tag (ICAT),6 isotope-coded protein label (ICPL),7 and isobaric tags for relative and absolute quantitation (iTRAQ).8 Although these methods are usually very accurate and robust, they suffer from rather high costs, complex experimental procedures, and incomplete labeling efficiencies. Traditional mass spectrometric strategies derive precise copy numbers by introducing internal standards, typically by spiking isotopically labeled standard peptides into the sample pool.9 However, this approach is limited by the expense and difficulty of synthesizing the reference peptides.

* To whom correspondence should be addressed. Tel. +49 (89) 8578 2044. Fax: +49 (89) 8578 2641. E-mail: [email protected]. † Department of Molecular Structural Biology. ‡ Department of Proteomics and Signal Transduction.
10.1021/pr100567u © 2010 American Chemical Society. Journal of Proteome Research 2010, 9, 4839–4850. Published on Web 07/29/2010

Alternatively, label-free quantitation methods can be used that compare normalized signal intensities of proteolytic peptides to determine changes in protein abundances. Because it is less expensive and more straightforward, label-free quantitation is increasingly used.10,11 As samples are processed separately in this approach, even minor differences in upstream sample preparation can accumulate and potentially cause a significant deviation in the signal intensities recorded by sensitive mass spectrometric detectors. Therefore, data accuracy and reproducibility are major concerns in label-free approaches. Here, we used a recently developed label-free quantitation algorithm,12 in which delayed normalization procedures are performed and maximum ratio information from peptide signals across samples is retrieved, to analyze changes in the proteome of T. acidophilum induced by aerobic and anaerobic culture conditions. To estimate protein abundance, we performed a proteome-wide estimation taking into account the peptide ion intensities and protein lengths, thereby averaging out intensity variations caused by protein size and sequence variations.13 Furthermore, complementary to the proteomics studies, a cDNA microarray-assisted transcriptomics analysis was carried out to provide a genome-wide portrait of the transcriptome, including genes encoding both cytosolic and membrane-associated proteins.

Experimental Procedures

Protein Extract Preparation from T. acidophilum Cells Cultured under Aerobic and Anaerobic Conditions. T. acidophilum DSM 1728 was cultured as described earlier with minor modifications.14 For aerobic cultures, the pH was adjusted to 1.6 with H2SO4 (96%). Five hundred milliliters of medium in a 1000 mL flask was inoculated with 2% starter culture and shaken at 120 rpm at 59 °C. Cell growth was stopped at the late-exponential growth phase (2 days). For anaerobic cultures, cells were grown in medium modified as follows: the pH was adjusted to 2.0 with H2SO4 (96%), 0.4% (w/v) S0 was added, the medium was heated at 100 °C for 30 min to sterilize the sulfur, and the oxygen content was reduced by adding 1.2 g/L Na2S · xH2O (x = 7-9). Cell cultures were placed in a plastic beaker (to prevent corrosion) that was positioned inside a 1.2 L metal Parr bomb container. The gas phase was filled with 80% N2/20% CO2 (pressure 2.5 kp/cm2) and the culture was agitated continuously at 300 rpm with a magnetic stirrer. Cells were grown at 59 °C for 4 days until late-exponential phase.14 Before harvest, the culture was shaken briefly to loosen the cells attached to sulfur particles, after which S0 was removed using a glass filter with a pore size of 16-40 µm. From the filtrate, cells were harvested by centrifugation at 4000 × g for 10 min at 4 °C, washed with demineralized water (pH 4), and stored at -80 °C before lysis. The cell extract was prepared as described previously.15 Three independent replicates were prepared for both aerobic and anaerobic conditions.

1D-SDS-PAGE and In-Gel Digest. Protein concentrations were measured using the Bio-Rad Protein Assay kit (BIO-RAD). Protein samples (20 µg of total protein) of each aerobic and anaerobic extract were separated on a 4-12% NuPage Novex Bis-Tris gel (Invitrogen) and stained using the Colloidal Blue Staining kit (Invitrogen). Each of the gel lanes was cut into 15 slices. To facilitate protein identification, the intensely stained protein bands were separated from the weakly stained ones.


The in-gel protein digestion by trypsin was performed according to Shevchenko et al.16 The digested peptide extracts were concentrated in a SpeedVac to 10-20% of the original volume to remove acetonitrile, followed by peptide purification using C18 StageTips.17

LC-MS/MS. The LC-MS/MS analysis was performed as described by Pan et al. (2009), with minor differences.18 A nanoscale C18 reverse-phase liquid chromatography system (Proxeon Biosystems, Odense, Denmark) was coupled online to a 7-T LTQ-FTICR-MS (Thermo Electron, Bremen, Germany) via a nanoelectrospray ion source (Proxeon Biosystems, Odense, Denmark).19 Samples were loaded by autosampler onto a 15 cm fused silica emitter with 75 µm inner diameter (Proxeon Biosystems) packed in-house with a methanol slurry of Reprosil-Pur C18-AQ 3 µm reverse phase resin (Dr. Maisch GmbH, Ammerbuch-Entrigen, Germany). Over the next 90 min, peptides were eluted at a flow rate of 250 nL/min with an actual separating gradient of 2-40% acetonitrile in 0.5% acetic acid and injected into the mass spectrometer. The mass spectrometer was operated in positive ion mode and employed a data-dependent automatic switch between MS and MS/MS acquisition modes. After accumulating a target value of 5 000 000 ions in the linear quadrupole ion trap (LTQ), a full scan was acquired in the Fourier transform ion cyclotron resonance (FTICR) analyzer with a resolution of 100 000 at m/z 400. The five most intense ions from the range 300-1800 m/z were sequentially accumulated and fragmented in the LTQ, with a target value of 5000 for each ion species.19 The MS/MS fragmentation was performed by collision-induced fragmentation (CID). Total cycle time was approximately 3 s. Former target ions selected for MS/MS were dynamically excluded for 30 s. The general parameters were spray voltage, 2.2 kV; no sheath and auxiliary gas flow; ion transfer tube temperature, 175 °C; normalized collision energy using wide-band activation mode, 35% for MS/MS. The ion selection threshold was 500 counts for MS/MS. An activation q of 0.25 and an activation time of 30 ms were applied in MS/MS acquisitions. Data were acquired using the Xcalibur software.

Mass Spectrometry Data Analysis. Mass spectra were analyzed using the in-house developed software MaxQuant (version 1.0.12.12), which performs peak list generation for protein database search and statistically evaluates protein identification and quantitation results based on computational algorithms from Cox and Mann.20 Our data were searched against a protein sequence database comprising direct and reverse sequences of 1482 entries derived from the T. acidophilum DSM 1728 genome database (Protein Extraction, Description and ANalysis Tool [PEDANT] database, http://pedant.gsf.de) using Mascot Daemon (Version 2.1.0, Matrix Science21). Enzyme specificity was set to trypsin, allowing for cleavage N-terminal to proline and between aspartic acid and proline.19 Carbamidomethyl cysteine was set as a fixed modification, whereas oxidation of methionine and N-acetylation were set as variable modifications. Up to three missed cleavages were allowed. The initial mass deviations allowed for precursor and fragment ions were up to 10 ppm and 0.5 Da, respectively. To pass statistical evaluation, the posterior error probability for peptide identification (MS/MS spectra) had to be below or equal to 0.1. The false positive rate was set to 1% at peptide and protein levels. The posterior error probability for peptides was calculated by recording Mascot score and peptide sequence length-dependent histograms of forward and reverse hits separately and then, using Bayes theorem, deriving the probability of a


false identification for a given top scoring peptide. The false discovery rate was calculated by successively including best scoring peptide hits until the list contained 1% reverse hits. The minimum peptide length was set to 6. This acceptance criterion was subsequently strengthened to a minimum of two unique identified peptides per protein for compilation of the final list.

Relative Label-Free Quantitation. The MaxQuant software was used for label-free quantitation analysis, which was based on extracted ion currents (XICs) of peptides. It contains algorithms for retention time alignment, transferring identifications between runs, normalization of intensities, and protein quantitation.12 The specific label-free processing was described briefly by Waanders et al.22 A 2-fold change was used to define biological regulation. The coefficient of variation (CV) and the t test were used to perform statistical evaluations. The CV is defined as the ratio of the standard deviation to the mean and represents the variation between experimental replicates. The t test was used to calculate the significance of expression level changes of the regulated proteins. The statistical criteria defined for the label-free quantitation in this study are the following: For the proteins within 2-fold change (0.5 < Ratio_anae/ae < 2), CV tests were used to judge the experimental variations. CVs of normalized intensities for proteins detected in 3 aerobic and 3 anaerobic replicates were calculated separately. Proteins with 0.5 < Ratio_anae/ae < 2 and CVs smaller than 50% under both aerobic and anaerobic conditions are clustered as unchanged proteins. For the proteins with more than 2-fold change (Ratio_anae/ae ≥ 2 or Ratio_anae/ae ≤ 0.5), the p-value of the t test was set with a cutoff of 0.05 to give significant quantitation data with 95% confidence.
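The criteria above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the MaxQuant implementation: the function names are hypothetical, the ratio is assumed to be the ratio of replicate means, the CV uses the sample standard deviation, and the t test p-value is assumed to be computed elsewhere and passed in.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """CV = sample standard deviation / mean, returned as a percentage."""
    return 100.0 * stdev(values) / mean(values)


def classify_protein(aerobic, anaerobic, p_value,
                     fold=2.0, cv_cutoff=50.0, alpha=0.05):
    """Classify one protein from normalized replicate intensities.

    Assumptions (not stated in the text): the ratio is the ratio of
    replicate means, and p_value comes from a t test run elsewhere.
    Returns 'regulated', 'unchanged', or 'unclassified'.
    """
    ratio = mean(anaerobic) / mean(aerobic)  # Ratio_anae/ae
    if ratio >= fold or ratio <= 1.0 / fold:
        # At least 2-fold change: require a significant t test.
        return "regulated" if p_value <= alpha else "unclassified"
    # Within 2-fold: require a CV below the cutoff in both conditions.
    cv_ok = (coefficient_of_variation(aerobic) < cv_cutoff
             and coefficient_of_variation(anaerobic) < cv_cutoff)
    return "unchanged" if cv_ok else "unclassified"
```

Proteins that meet neither set of criteria fall into a third group, labeled "unclassified" here, which corresponds to identified proteins that do not enter the quantified data set.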
Proteins with Ratio_anae/ae ≥ 2 or Ratio_anae/ae ≤ 0.5 and p ≤ 0.05 are defined as regulated proteins.

Protein Abundance Estimation. The abundance of each peptide was estimated from the area under the extracted ion current peak after 3D reconstruction, and the abundance of each expressed protein was then calculated as the sum of the abundances of its identified peptides divided by the protein sequence length.13

Microarray Analysis. Total RNA samples from T. acidophilum cells cultured under aerobic and anaerobic conditions were isolated with the RNeasy Protect Bacteria Kit (Qiagen) at the same time points at which the protein samples were extracted (2 and 4 days, respectively). The transcriptomics analysis was performed on TI273075 60mer chips of Roche NimbleGen microarrays (NimbleGen Systems of Iceland, LLC). Probes were selected for all protein sequences (1482) and labeled with Cy3. The median number of probes per sequence is 20, and each probe is replicated 5 times on the chip. The probes are randomly distributed over the surface of the array. Unused features are filled with randomly generated probes of comparable GC content. ArrayStar v2.0 software (DNASTAR, Inc.) was used for the data analysis. Three independent biological replicates were processed for each of the aerobic and anaerobic conditions. The statistical analysis was based on Student’s t test. As for the proteomics data set, a 2-fold cutoff was set as an indicator of significant biological regulation, and a Student’s t test p-value of 0.05 was set as the cutoff for significant quantitations.

Bioinformatics Tools. Database searches were carried out using the BLAST algorithm at the National Center for Biotechnology Information (NCBI),23,24 and the Conserved Domain Database (CDD, http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi) was used to obtain general information concerning the biological function of the identified proteins.25 To position the proteins in biochemical pathways, the Kyoto Encyclopedia of Genes and Genomes (KEGG, http://www.genome.jp/dbget-bin/www_bfind?genes) database was searched.26 The Munich Information Center for Protein Sequences (MIPS) database was used to study the functional categories (http://mips.gsf.de/). SOSUI (http://bp.nuap.nagoya-u.ac.jp/sosui/) analysis was performed for classification and secondary structure prediction of membrane proteins.27 Cytoscape, along with its plug-in BiNGO 2.0, was used to analyze the distribution of experimental data sets among various protein groups and to identify significantly overrepresented biological functions of the proteins.28 The Gene Ontology (GO) annotations of proteins were compared with those from a reference proteome (e.g., identified proteins vs the entire protein database, or a subset of the identified proteins vs the overall identified proteins). To assign corresponding GO identifiers to each protein entry, the PEDANT database (http://pedant.gsf.de) was used. The hypergeometric test and the Benjamini and Hochberg False Discovery Rate correction were performed to derive overrepresented functions,28 and a probability value of 0.05 was considered significant. The proteomics and transcriptomics data are deposited at the PRIDE database (http://www.ebi.ac.uk/pride/) under accession number 13083 and at the GEO database (http://www.ncbi.nlm.nih.gov/geo/info/linking.html) under accession number GSE21956, respectively.
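For readers unfamiliar with the overrepresentation statistics named under Bioinformatics Tools, the following Python sketch shows a one-sided hypergeometric test followed by the Benjamini and Hochberg correction. It is a textbook formulation written for illustration, not the code of the plug-in used in this study; the function names and argument order are assumptions.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k): probability of seeing at least k annotated proteins
    in a subset of n, drawn from N proteins of which K carry the term."""
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

In such a scheme, a GO term would be called overrepresented when its adjusted p-value is at or below the 0.05 threshold given in the text.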

Results and Discussion

Identification of Proteins Expressed under Aerobic and Anaerobic Conditions. The cytosolic proteomes of aerobically and anaerobically grown T. acidophilum were investigated with a combination of one-dimensional (1D) gel electrophoresis and LTQ-FTICR-MS. In previous studies, protein expression profiles of the cytosolic proteome of aerobically grown T. acidophilum were generated by two-dimensional gel electrophoresis (2DE) coupled to MALDI-TOF-MS (2DE-MALDI-TOF-MS)29 and by molecular size chromatography combined with an LTQ mass spectrometer (MSC-LC-MS/MS).15 However, due to the limited dynamic range of 2DE-MS and the relatively low mass accuracy of the LTQ mass spectrometer, only 20 and 40% of the cytosolic proteome were identified, respectively.15,29 In this study we substantially increased the number of identified proteins (especially the low abundant ones). A total of 1025 proteins were identified with an average absolute mass error of 0.54 ppm. Of these, 1004 are confirmed by the PEDANT database to be cytosolic proteins, covering 88% of the entire cytosolic proteome (1146 ORFs) (Table S1, Supporting Information). The functional distribution of these proteins is very similar to that of the genome-predicted cytosolic proteome (Figure 1), suggesting that the experimental proteome analysis is comprehensive, with no evident bias for or against specific protein classes. Comparing the aerobic and anaerobic proteomes, we found that 99% (1014) of the detected proteins were present under both culture conditions, with only 9 proteins unique to aerobic cells (Ta0171, Ta0297, Ta0371, Ta0448, Ta0606, Ta0938, Ta1093a, Ta1356, Ta1403) and 2 to anaerobic cells (Ta0375a, Ta0627). Most of them are annotated as hypothetical proteins, that is, they have unknown functions. Those aerobically expressed proteins


Figure 1. Functional distribution of identified cytosolic proteins vs theoretical cytosolic genome. Percentages of undetected proteins for each functional category are indicated above the columns. The MIPS database server was used for automated categorization.

which had similarity to known proteins could be assigned to diverse functional groups: a probable DEAD box protein (Ta0297) responsible for RNA unwinding, a heme biosynthesis protein (Ta1356) of the radical SAM superfamily, and thermopsin (Ta1403), an archaeal protease. Ta0627, detected only under anaerobic conditions, is a subunit of the pyruvate synthase complex (Ta0626-Ta0629). These 11 proteins had low expression levels, with the majority of their abundances in the range of 10^2 to 10^3.6 arbitrary units (AU) on a 10^1 to 10^6 scale (see “Protein abundance estimation” in the Experimental Procedures section and “Protein abundance profiles” in the following). Their transcript levels were highly diverse, including values among the lowest and highest intensities, but the direction of change (up- or down-regulation) of the transcripts and proteins coincided (Tables S1, S2, Supporting Information). Twelve percent of the annotated cytosolic proteins, mostly hypothetical proteins with unknown function(s), escaped detection. This could be due to their low molecular mass and/or low abundance level, if they were expressed at all. MS-based proteomics has limitations in sensitivity and dynamic range.30 Proteins of low molecular weight and/or low abundance yield far fewer tryptic peptides, and highly abundant peptides can mask their presence and thereby suppress their identification. Another possible reason is that proteins might adhere to the cell membrane or to membrane-anchored proteins and are therefore not contained in the cytosolic sample.

Reproducibility and Accuracy of Label-Free Quantitation Data. The coefficient of variation (CV), which expresses the standard deviation as a percentage of the sample mean, was used to evaluate the reproducibility of the label-free quantitation data.
Figure S1 (Supporting Information) displays the histograms of CVs of normalized intensities for proteins identified in three experimental replicates under aerobic and anaerobic conditions, respectively. The majority of the data (CV < 70%) fit a Gaussian distribution. The median and mean CV values are 29.0 and 35.8% for aerobic and 35.1 and 41.9% for anaerobic samples, respectively, indicating a satisfactory degree of precision of the label-free quantitation experiments. A CV value of 50% is considered a reasonable cutoff for experimental variation.

Quantitation of Proteins Expressed under the Investigated Conditions. To derive high quality data sets for further analyses, we examined the quantitation data at the levels of both biological variation and statistical deviation. To be defined as regulated, a protein must show a minimum 2-fold expression change (Ratio_anae/ae ≤ 0.5 or Ratio_anae/ae ≥ 2), and the change must be statistically significant (p-value from Student's t test ≤ 0.05). To be defined as unchanged, the expression change should be within 2-fold (0.5 < Ratio_anae/ae < 2) and the statistical deviation should be small (CV ≤ 50%). Using these criteria, we established a robust quantitation data set consisting of 604 quantified proteins, of which 341 remained unchanged and 263 were regulated, i.e., 146 proteins were down-regulated and 117 were up-regulated under anaerobic conditions (Table S1, Supporting Information). Interestingly, 89 of the regulated proteins are hypothetical proteins. The number of expressed proteins under both conditions was high and almost identical, but the quantitation results show that anaerobiosis caused significant alterations in protein expression levels. These results indicate that for the maintenance of normal cell growth and homeostasis, almost all proteins are expressed and required at a certain level. This strategy prepares the cells for all environmental scenarios they can cope with. In case of a sudden change in the environment, cells can make use of their protein repository to respond to the challenges promptly and minimize potential damage. It also buffers the cells during the adaptation period and allows enough time for adjusting the expression of the required proteins to establish a new homeostasis. Whether the high number of expressed proteins in T. 
acidophilum cells is specific to organisms with a small genome and a relatively low-complexity proteome, or is also common in cells (prokaryotic and eukaryotic) with larger genomes and more complex proteomes, remains unclear. It may well be that subset(s) of proteins are not required under particular conditions in these cells, but it is also possible that even more sensitive methods and instrumentation will increase the detection of low copy number proteins.

Protein Abundance Profiles. Protein abundance analysis (log10 scale) was carried out to compare the protein identification rate of the LTQ-FT-MS/MS experiment with those of the 2DE-MALDI-TOF-MS and MSC-LC-MS/MS methods (Figure 2). Arbitrarily, we set log10(sum of the abundances of identified peptides/protein sequence length) ≥ 5.5 as the cutoff for high abundant, 4.6-5.5 for intermediate, 3.7-4.6 for low abundant, and ≤ 3.7 for very low abundant proteins. With the 2DE-MALDI-TOF-MS and MSC-LC-MS/MS methods, we identified mostly high and intermediate abundant proteins, whereas with the highly sensitive LTQ-FT-MS/MS method, the identification of low and very low abundant proteins was extended substantially.

Identification and Quantitation Results of the Transcriptome Analysis. Transcriptomics data for the whole genome mRNA expression (1482 genes) were obtained with high reliability, with the mean, median, and standard deviation of CVs from three independent replicates being 6.7, 5.7, and 4.7%, respectively. Quantitation results revealed that 445 genes exhibited at least 2-fold regulation, among which 205 are annotated as hypothetical proteins. 300 of the 445 regulated



Figure 2. Distributions of protein abundances (log10). Black bar and black curve indicate frequency and cumulative percentage of the 1025 proteins identified with the LTQ-FT-MS/MS method. Gray bar and gray curve show frequency and cumulative percentage of the 466 proteins identified with the MSC-LC-MS/MS method.15 Light gray bar and light gray curve represent frequency and cumulative percentage of the 271 proteins identified with the 2DE-MALDI-TOF-MS method.29
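The abundance measure plotted in Figure 2 and the four abundance classes can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, and because the stated ranges share their boundary values, the handling of exactly 4.6 (assigned to "low" here) is an assumption.

```python
from math import log10


def protein_abundance(peptide_intensities, sequence_length):
    """Summed peptide intensity divided by protein sequence length."""
    return sum(peptide_intensities) / sequence_length


def abundance_class(abundance):
    """Bin a protein by log10 abundance using the cutoffs in the text."""
    value = log10(abundance)
    if value >= 5.5:
        return "high"
    if value > 4.6:   # boundary handling at 4.6 is an assumption
        return "intermediate"
    if value > 3.7:   # the text assigns <= 3.7 to 'very low'
        return "low"
    return "very low"
```

For example, a hypothetical 300-residue protein with two peptides of intensity 2e7 and 1e7 has an abundance of 1e5 AU, that is, a log10 value of 5.0, and would fall in the intermediate class.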

genes encode cytosolic and 145 encode membrane proteins. Ninety of the cytosolic genes were up-regulated and 210 were down-regulated under anaerobic conditions. The 145 regulated membrane protein-encoding genes cover 43% of the membrane genome (335 ORFs), and the majority of these genes (130) were more than 2-fold down-regulated under anaerobic conditions (Table S2, Supporting Information). Many of these membrane genes are functionally associated with ion transport, amino acid transport, or ATPase and endopeptidase activities.

Reproducibility of the Transcriptomics Data. The hierarchical clustering of the microarray data representing the expression level of 1482 genes is displayed in Figure S2 (Supporting Information). The high similarity of the heat map views among the three aerobic and among the three anaerobic measurements indicates excellent reproducibility of the microarray experiment.

Correlation of Transcriptomics and Proteomics Data Sets. Our comparative quantitation analysis revealed that the anaerobic growth condition caused extensive expression changes (∼30%) at both protein and mRNA levels. Figure 3A shows a scatter plot comparing mRNA and protein expression changes for the 604 common entries that are present in both the quantitative proteomics and the transcriptomics data. The Pearson correlation coefficient of these two data sets is 0.57, indicating a weak positive correlation between mRNA and protein expression ratios. The slope of the regression line for transcript ratios versus protein ratios is 0.32, showing that the general changes at the transcript level are suppressed 3-fold compared to the changes at the proteome level. A Venn diagram of regulated cytosolic mRNAs versus regulated proteins shows an overlap of ∼30% (97 genes) (Figure 3B), indicating that a large number of genes are regulated solely at either the mRNA or the protein level. It is commonly assumed that mRNA abundance is a surrogate for protein amounts. However, our results indicate that this assumption should be treated with caution, as only a low correlation (0.57) between mRNA and protein level changes was found in our study and the overlap between regulated proteins and mRNAs is limited to 30%. Similar findings were reported recently in mouse ESCs,31 Drosophila32 and yeast,33,34 indicating that the poor agreements between transcriptome

Figure 3. Correlation between mRNA and protein expression ratios. (A) Correlation analysis of the 604 common genes for which both protein ratios and mRNA ratios were determined. The log2 values of mRNA and protein expression ratios were calculated and plotted. The linear regression line is shown as a solid line. The plot indicates low correlation with a Pearson correlation coefficient of 0.57. (B) Regulated cytosolic mRNAs vs proteins. Overall analysis of regulated cytosolic genes at protein and mRNA level showed an overlap of ∼30%.
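The two summary statistics reported for Figure 3A, the Pearson correlation coefficient and the regression slope of log2 mRNA ratios on log2 protein ratios, can be computed as in the sketch below. The five ratio pairs are hypothetical values chosen for illustration and are not data from this study; the function name is likewise an assumption.

```python
from math import log2, sqrt


def pearson_and_slope(x, y):
    """Pearson r and least-squares slope of y regressed on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sqrt(sxx * syy), sxy / sxx


# Hypothetical anaerobic/aerobic ratios for five genes (not study data).
protein_ratios = [4.0, 2.0, 1.0, 0.5, 0.25]
mrna_ratios = [2.0, 1.0, 1.0, 1.0, 0.5]

x = [log2(v) for v in protein_ratios]  # protein changes on the x axis
y = [log2(v) for v in mrna_ratios]     # transcript changes on the y axis
r, slope = pearson_and_slope(x, y)
print(f"Pearson r = {r:.2f}, slope = {slope:.2f}")
```

Working in log2 space treats up- and down-regulation symmetrically, and a slope below 1 means that the transcript changes are on average smaller than the corresponding protein changes.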

and proteome data are general. In the case of T. acidophilum, the data might indicate that this organism utilizes post-transcriptional mechanisms to alter protein abundances, which might involve protein turnover as well as translational regulation. These processes, together with a highly probable complexity in the transcriptome (which has been described for the genome-reduced bacterium Mycoplasma pneumoniae35 and probably occurs in other prokaryotes as well), might explain the low correlation between proteome and transcriptome data.

Analysis of the Expression Level of Macromolecular Complexes with Molecular Weight over 300 kDa. In our previous study using the MSC-LC-MS/MS approach, we identified 35 protein complexes with indicator masses higher than 300 kDa. Proteins of that size are primary targets for visual proteomics studies.15 Bioinformatic analysis of the regulated proteins identified 4 additional putative macromolecular complexes, that is, alpha-glucosidase (Ta0298, 480 kDa36), galactonate dehydratase (Ta0085m, 350 kDa37), a homologue of glutamate synthase beta subunit (Ta0414, 800 kDa38), and a homologue of protoporphyrin IX magnesium chelatase subunits D and I (Ta0576, 300 kDa39). Quantitation ratios showed that 12 of the complexes were up-regulated, 3 were down-regulated, and 11 showed no significant changes under anaerobic conditions. Eleven complexes were not quantified because of large experimental

research articles variances. Generally, subunits belonging to the same complex showed consistent regulation tendencies (Table S1, S3, Supporting Information). The regulation patterns of those complexes whose genes are organized in operons were consistent both at protein and mRNA level, for example, Ta0326-Ta0328 (FixABCX), Ta0259m-Ta0260, Ta1412-Ta1413, Ta0423-Ta0425, and Ta1292-Ta1294m (exosome), providing evidence for the accuracy of the proteomics and trancriptomics quantitation. Subunit proteins of the chromosome segregation protein complex (Ta0157 and Ta0158) and the ribosome showed different quantitation ratios. Therefore, the structure of these complexes should be investigated experimentally, to prove whether they possess different structures adapted to aerobic or anaerobic conditions. To explore this further, ribosomes were purified from aerobically and anaerobically cultured T. acidophilum cells and quantified with the same label-free quantitation method used in this study (data not shown). The results did not reveal different expression levels of ribosomal proteins. In fact, all ribosomal proteins quantified showed ratios around 1, suggesting that the ribosome remains invariable in composition in both aerobic and anaerobic cells. It well might be that differences in subunit protein ratios detected by the proteome analysis are originating from differences in protein turnover or their free subunits exist in the cytosol perhaps serving additional function(s). Gene Ontology (GO) Analysis Regarding General Molecular Functions and Biological Processes. GO analysis of the regulated cytosolic mRNAs (300) and proteins (263) uncovered a similar overrepresentation of several biochemical pathways including carbohydrate metabolism, TCA intermediate metabolism, generation of precursor metabolites and energy under anaerobic, and cobalamin (vitamin B12) biosynthesis under aerobic condition. Analysis of the Regulated Genes on the Basis of Selected Biochemical Pathways. 
Interestingly, many of the protein complexes (14) are involved in the main proteolytic, protein biosynthesis, and protein fate processes, proton transport pathways, and glucose metabolism, including the nonphosphorylated Entner-Doudoroff (ED) pathway, the Embden-Meyerhof-Parnas (EMP) pathway, and the tricarboxylic acid (TCA) cycle; therefore, these biochemical pathways were analyzed in more detail (Figures 4, 5, and 6).

Polypeptide Uptake and Degradation. A large set of proteins predicted to be involved in extracellular polypeptide uptake and transport systems (12 proteins) and 5 membrane-anchored extracellular proteases showed up-regulation at the transcript level under aerobic conditions. Additionally, BLAST sequence analysis of 72 hypothetical membrane protein-encoding genes, which were strongly up-regulated (up to 100-fold) under aerobic conditions at the mRNA level, showed that 20 of them have predicted functions related to protein, oligopeptide, and amino acid transport and protein-degrading activities. Hence, 15 hypothetical membrane transporters and 3 membrane proteases newly identified by the BLAST analysis are also shown in Figure 4. Proteomics data for these membrane proteins are missing because only cytosolic proteins were analyzed in our study. The finding that many of the protein, oligopeptide, and amino acid transporters and membrane-associated endopeptidases were down-regulated under anaerobiosis, together with the up-regulated glucose metabolism in anaerobically grown cells, may indicate a difference in substrate preference between cells growing aerobically and anaerobically. Aerobic cells might prefer to use more peptides, oligopeptides, and proteins than anaerobically growing cells, while the latter use more glucose to compensate for the low energy yield.

Glucose Metabolism. The nonphosphorylated ED pathway, the EMP pathway, and the TCA cycle are assigned to the central metabolic pathways of T. acidophilum.40,41 Generally, the nonphosphorylated ED pathway, which carries out glucose degradation in T. acidophilum, showed more than 3-fold up-regulation under anaerobic conditions based on the proteomics data, except for Ta0619 (2-keto-3-deoxy gluconate aldolase) and Ta0453m (glycerate kinase), whose ratios remained unchanged. The complete set of TCA cycle enzymes was identified in our proteomics study, and they showed 2- to 9-fold up-regulation under anaerobic conditions, except for succinate dehydrogenase, for which we have no quantitation data (it is a membrane-bound protein). Remarkably, the enzymes of the lower branch of the EMP pathway, which convert glyceraldehyde 3-phosphate to pyruvate, showed up-regulation in the absence of oxygen, whereas the enzymes catalyzing the six-carbon compound transformations in the upper branch of the EMP pathway remained unchanged. Compared to the strong up-regulation of the enzymes involved in the nonphosphorylated ED pathway, the lower branch of the EMP pathway, and the TCA cycle at the protein level, the transcription level of these genes changed only slightly, if at all (Figure 5). This indicates that modulations at the post-transcriptional level play a major role in glucose metabolism of T. acidophilum. Similar to our finding, post-transcriptional regulation of glycolysis and accelerated glycolytic flux to compensate for the low ATP yield during anaerobic fermentation in Saccharomyces cerevisiae have been reported.34 Under anaerobic conditions, T.
acidophilum uses S0 as a terminal electron acceptor, which has a lower redox potential than O2, meaning that anaerobic S0 respiration (∆G0′ = -333 kJ) is less energy-efficient than aerobic respiration (∆G0′ = -2844 kJ).42 Therefore, the increased glucose degradation in anaerobically cultured T. acidophilum could be a response compensating for the low energy yield of anaerobic S0 respiration. Only the lower branch of the EMP pathway enzymes showed up-regulation under anaerobiosis, whereas the enzymes catalyzing the six-carbon compound transformations in the upper branch of the EMP pathway remained unchanged. The lower branch of the EMP pathway is well conserved and prevalent in the three domains of life, which is not surprising, as it is the key connecting point for pathways involved in amino acid, pentose-phosphate, and purine precursor generation for the further synthesis of essential biomolecules. In contrast, the enzymes of the upper part vary greatly, especially in archaea and bacteria.43 The different expression levels of the upper and lower parts of the EMP pathway suggest that T. acidophilum might utilize alternative enzymes for the six-carbon compound transformations in the EMP pathway. The TCA cycle is important for energy production; therefore, it is not surprising that the complete protein set of the TCA cycle has been quantified in our proteomics study. It is up-regulated under anaerobic conditions, which could be due to the increased consumption of nutrients to compensate for the low energy yield during anaerobic S0 respiration. In addition, the TCA cycle is essential for intermediary metabolism: many intermediates are drawn out of the cycle to be used as precursors in a variety of biosynthetic pathways. Therefore, the increased expression of the TCA cycle enzymes together with other enzymes taking part in intermediary metabolism might suggest a higher intermediary metabolic activity under anaerobic conditions.

Figure 4. Schematic representations of the main proteolytic pathways of T. acidophilum. Quantitative transcriptome and proteome data of enzymes involved in these pathways are presented. The left half of the boxes represents the protein expression ratio; the right half represents the mRNA expression ratio. Protein/mRNA levels increased under anaerobiosis are indicated in red; unchanged are indicated in yellow, and decreased under anaerobiosis are shown in blue; a noncolored box means that the protein was either not detected or did not satisfy the quantitation restrictions. Red arrows indicate the metabolic pathway, green arrows indicate the quality control pathway of damaged proteins, and blue arrows indicate the regulatory pathway.

Figure 5. Nonphosphorylated ED pathway, EMP pathway, and TCA cycle of T. acidophilum. The symbol legends of the boxes are the same as in Figure 4.

Exosomal Superoperon. A superoperon of exosomal genes in archaea has been described that, in addition to the predicted exosome components, encodes the catalytic subunits of the proteasome, two ribosomal proteins, and a DNA-directed RNA polymerase subunit.44 All 17 predicted exosomal superoperon-encoded proteins were identified in our study. The proteomics and transcriptomics quantitation data showed that the majority of these genes were not modulated under anaerobiosis. This result indicates the existence of a constant/constitutive network of coregulation and functional and physical interactions in a striking range of central cellular functions in T. acidophilum, including translation (most of the ribosomal proteins and translation initiation/elongation factors remained unchanged under anaerobic conditions at both the protein and mRNA levels) and cotranslational protein folding, RNA processing, degradation, modification, and transcription. These observations suggest that the core translational machinery and the RNA processing and degradation processes are stably maintained and that the overall translation rate is close to equal under aerobic and anaerobic conditions. Surprisingly, key enzymes of protein folding and modification and of protein degradation processes (protein quality control enzymes), like tricorn, thermosome, proteasome, VAT, and molecular chaperone proteins, were found to be significantly (2- to 4-fold) up-regulated under anaerobic conditions at the protein level, whereas their transcript expression levels remained unchanged (Figure 4). Currently we do not have a satisfactory explanation for this phenomenon.

Figure 6. Putative anaerobic proton transport, adapted from ref 45. The symbol legends of the boxes are the same as in Figure 4.

Proton Transport Pathways. The hypothetical anaerobic proton transport pathway in T. acidophilum was adapted from the proteomics analysis of Thermoplasma volcanium,45 while the putative aerobic proton transport was adapted from the KEGG database (http://www.genome.jp/kegg-bin/show_pathway?tac00190+Ta0001). According to this database annotation, aerobic proton transport is a membrane-associated process catalyzed by membrane proteins, whereas the putative anaerobic proton transport pathway is predicted to consist of cytosolic proteins.45 Proteins involved in aerobic proton transport could not be quantified because they are membrane-associated proteins. We found that the NADH dehydrogenase subunits remained unchanged, except for the α and β subunits (Ta0969m, Ta0970m), which were 2- to 3-fold up-regulated under anaerobic conditions. The other putative aerobic electron transport chain members, like succinate dehydrogenase (Ta1001-Ta1004) and the putative cytochrome c oxidase homologue (Ta0435), remained unchanged, while the cytochrome bd complex-encoding genes (Ta0992, Ta0993) were highly up-regulated under aerobic conditions (up to 20-fold), placing them in the group of genes that changed the most at the mRNA level. The mRNA levels of the A0A1 ATPase complex (Ta0001-Ta0008, Ta0001z) were only slightly up-regulated under aerobic conditions. This regulation pattern indicates that part of the aerobic proton/electron transport chain (NADH dehydrogenase, succinate dehydrogenase) remains active during anaerobic respiration (together with the A0A1 ATPase), while the final step of O2 reduction is inactivated, with the cytochrome bd complex down-regulated. Several cytosolic enzymes can be assigned to the anaerobic proton transport (Figure 6). We found that their expression levels were massively up-regulated (up to 300-fold) under anaerobic conditions. However, Ta0046 and Ta0047, homologues of the sulfhydrogenase subunits of T. volcanium, which reduce elemental sulfur to H2S, remained unchanged at both the protein and mRNA levels. Instead, a predicted cytosolic sulfur respiration enzyme, the sulfide-quinone reductase homologue (Ta1129),40 was identified with a 6.7-fold increase at the protein level under anaerobic conditions. Additionally, Schut et al.
reported a novel S0-reducing system involving NAD(P)H sulfur oxidoreductase (NSR, PF1186) and a membrane-bound oxidoreductase (MBX, PF1441-PF1453) in the hyperthermophilic archaeon Pyrococcus furiosus. NSR and MBX are proposed to be the key enzymes responsible for the reoxidation of ferredoxin and NAD(P)H.46 In T. acidophilum, Ta0837, a homologue of PF1186, was 4-fold up-regulated under anaerobic conditions. Whether T. acidophilum uses Ta0837 and Ta1129 as replacements for Ta0046-47 in sulfur metabolism remains unclear. On the basis of bioinformatic analyses, there is no MBX complex in T. acidophilum; however, 9 of the NADH dehydrogenase subunits (Ta0959, Ta0960, Ta0961, Ta0962, Ta0965, Ta0966, Ta0967m, Ta0968, and Ta0969m) show sequence similarities with MBX subunits (PF1441-PF1448). Together with the detected up-regulation of two NADH dehydrogenase subunits (Ta0969m and Ta0970m) under anaerobic conditions, this leads us to speculate that the anaerobic NADH dehydrogenase in T. acidophilum has functions similar to those of MBX and is involved in the S0 reduction process. How T. acidophilum performs the reduction of insoluble S0 remains elusive, and we urge thorough analyses in the future. It might well be that T. acidophilum uses a new, still undefined mechanism for sulfur respiration. One possibility is that soluble quinones excreted from the cell take part in extracellular electron transfer between cells and insoluble electron acceptors.47 Whether quinones possess similar functions for S0 respiration in T. acidophilum has not been investigated so far, but the recent finding that T. acidophilum produces soluble quinones that might be excreted from the cell into the environment raises the question (Nagy et al., in preparation). The FixABCX complex (Ta0326-Ta0329), which putatively shuttles electrons to the terminal sulfhydrogenase in T. volcanium, was highly up-regulated. Interestingly, the FixABCX complex from Rhodospirillum rubrum is membrane bound and shuttles electrons to the anaerobically active and also membrane-bound nitrogenase complex that converts N2 to NH3.48 In T. acidophilum, both the FixABCX complex and the putative S0 reductase (Ta0837) are cytosolic enzymes.

Tryptophan Biosynthesis. Transcriptomics data showed that tryptophan biosynthesis is one of the most highly up-regulated processes under aerobic conditions.
The enzymes (Ta0803m-Ta0808) that catalyze the transformation of chorismate to tryptophan were more than 50-fold up-regulated under aerobic conditions at the transcript level. The gene organization of Ta0803m-Ta0808 shows an operon structure.49 The “secondary structure prediction for membrane proteins” analysis indicates that all these proteins are soluble; however, their identification in the proteomics study failed (only Ta0808 was identified, with 4 detected peptides). One explanation for their escape from identification in the cytosolic extract is that the proteins encoded by Ta0803m-Ta0808 form a complex that is attached or anchored to the membrane via an unknown linker, or that there is a hitherto unknown membrane-associating subunit in the complex.

Cobalamin (Vitamin B12) Biosynthesis. Despite the weak correlation between the transcriptomics and proteomics data, a slight increase of cobalamin (vitamin B12) biosynthesis under aerobic conditions is evident in both data sets. Cobalamin is normally involved in DNA synthesis and regulation, fatty acid synthesis, and energy production, processes in which a group of radical enzymes use it as a coenzyme. As radicals react irreversibly with dioxygen, most of these enzymes occur in anaerobic bacteria and archaea. Exceptions are the families of coenzyme B12- and S-adenosylmethionine (SAM)-dependent radical enzymes, of which some members also occur in aerobic organisms.50 Initiating research focusing on radical enzymes in T. acidophilum (like methylmalonyl-CoA, Ta0462-Ta0463, whose expression level was slightly increased under aerobic conditions) might reveal why the expression level of cobalamin biosynthesis genes is elevated in response to aerobiosis. This increase is very similar to the increase of Ta0059, the S-adenosylmethionine (SAM) synthase that produces SAM (a progenitor of a 5′-deoxyadenosyl radical), which is needed by the group of SAM-dependent radical enzymes. The most abundant protein in aerobic T.
acidophilum cells, ribonucleotide reductase (Ta1475), which catalyzes the conversion of purine and pyrimidine nucleotides to deoxynucleotides and provides monomeric precursors essential for DNA replication and repair, needs vitamin B12 to carry out its catalytic functions.51 Up-regulation of vitamin B12 biosynthesis under aerobic conditions might indicate a higher turnover of DNA replication and repair when compared to anaerobiosis; however, most of the DNA repair proteins remained unchanged at both the protein and mRNA levels, which is not in agreement with this hypothesis.

Coping with Oxygen Stress. The antioxidative enzymes in T. acidophilum are superoxide dismutase (Ta0013), peroxiredoxins (Ta0152, Ta0473, and Ta0954m), and the alkyl hydroperoxide reductase (Ta0125). The transcript levels of Ta0013, Ta0125, and Ta0152 were 2- to 10-fold up-regulated under aerobic conditions, which is in accordance with the high oxygen stress. However, their expression at the protein level remained unchanged. Currently we do not have a satisfactory explanation for this phenomenon; however, mRNA instability might contribute.

Proteins of the One Carbon Folate Pool. Five out of the 8 proteins that take part in the one-carbon folate pool are 2- to 5-fold up-regulated under anaerobic conditions, while the transcription level of these genes shows only slight up-regulation (1.5- to 2-fold). This suggests an enhanced metabolic activity under anaerobiosis because the one-carbon folate pool can be used to gain reducing equivalents by various catabolic reactions and to provide C1 compounds for nucleotide, methionine, and pantothenate biosynthesis.52

Oxidoreductases. The proteomics data showed that 43 of the anaerobically up-regulated proteins possess oxidoreductase activities. The dominant molecular functions connected to these up-regulated oxidoreductases are related to electron-transfer processes through sulfur-related groups.
These sulfur-related electron transfers might play important roles in anaerobic redox reactions and sulfur metabolism; however, these proteins cannot be assigned to a common biochemical pathway; they are distributed over 33 different metabolic processes, involving amino acid, carbohydrate, energy, lipid, nucleotide, cofactor, and vitamin metabolism. Further detailed investigations of the functions of this group of proteins and their stabilities under oxidative stress are needed.
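The protein-versus-transcript comparisons made throughout this section (for example, quality control enzymes up-regulated at the protein level with unchanged transcripts, or antioxidative enzymes regulated only at the transcript level) can be illustrated with a minimal sketch. It assumes anaerobic/aerobic ratios and a symmetric 2-fold cutoff, in line with the ">2 fold" criterion quoted in the abstract; the function names, the gene labels, and the ratio values are hypothetical and are not measured data from this study.

```python
import math

# Assumption: a symmetric 2-fold cutoff on anaerobic/aerobic ratios.
FOLD_THRESHOLD = 2.0


def classify(ratio, threshold=FOLD_THRESHOLD):
    """Label an anaerobic/aerobic ratio as 'up', 'down', or 'unchanged'."""
    if ratio <= 0:
        raise ValueError("expression ratios must be positive")
    log2_ratio = math.log2(ratio)
    cutoff = math.log2(threshold)
    if log2_ratio > cutoff:
        return "up"
    if log2_ratio < -cutoff:
        return "down"
    return "unchanged"


def compare_levels(ratios):
    """Map each gene to (protein call, mRNA call, calls agree?)."""
    result = {}
    for gene, (protein_ratio, mrna_ratio) in ratios.items():
        protein_call = classify(protein_ratio)
        mrna_call = classify(mrna_ratio)
        result[gene] = (protein_call, mrna_call, protein_call == mrna_call)
    return result


# Hypothetical (protein ratio, mRNA ratio) pairs, for illustration only.
example = {
    "gene_A": (4.0, 1.2),  # protein up, transcript essentially flat
    "gene_B": (1.0, 0.2),  # transcript lower under anaerobiosis, protein flat
    "gene_C": (3.0, 5.0),  # both levels up
}

calls = compare_levels(example)
for gene, (protein_call, mrna_call, agree) in calls.items():
    print(gene, protein_call, mrna_call, "concordant" if agree else "discordant")
```

Working on the log2 scale keeps up- and down-regulation symmetric around zero, so a 2-fold increase and a 2-fold decrease are treated alike.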

Conclusions

As a further means of refining the protein inventory of T. acidophilum, we used state-of-the-art mass spectrometric technologies (a combination of 1D-SDS-PAGE and highly sensitive LTQ-FTICR-MS), which increased the number of identified proteins to more than 1000, covering 88% of the annotated cytosolic proteins of this archaeon. Besides protein identification, a comparative proteome and transcriptome analysis of aerobically and anaerobically cultured T. acidophilum cells was performed. We found that most of the cytosolic proteins were expressed at a certain level under both conditions, which might mean that T. acidophilum is always prepared to cope swiftly with environmental challenges. The anaerobic growth conditions induced significant gene regulation (∼30%) at both the protein and mRNA levels. Remarkably, the expression level of many of the hypothetical proteins changed dramatically under anaerobiosis, indicating their crucial role in adaptation to environmental challenges. Membrane proteins play vital roles in the communication between the cell and its environment, and mostly membrane proteins that are functionally associated with ATPase activity, endopeptidase activity, and ion and amino acid transport were down-regulated under anaerobic conditions at the mRNA level. Moreover, the nonphosphorylated ED pathway, the lower branch of the EMP pathway, and the TCA cycle were up-regulated significantly, indicating accelerated glucose degradation under anaerobic conditions, which might compensate for the low energy yield during anaerobic S0 respiration. Many of the putative anaerobic proton transport proteins, which take part in S0 reduction, were shown to be cytosolic, complex forming, and highly up-regulated under anaerobic conditions. The transcriptome and proteome data show only a weak positive correlation, which indicates extensive post-transcriptional regulation mechanisms.
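As an illustration of how such a correlation between the two data sets can be expressed, the following sketch computes a Pearson coefficient on log2-transformed expression ratios. It is a minimal example under stated assumptions, not the procedure used in this study: the function names are hypothetical, and the ratio lists are invented placeholder values rather than measured data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def log2_ratio_correlation(protein_ratios, mrna_ratios):
    """Correlate protein and mRNA changes on the log2 scale."""
    log_protein = [math.log2(r) for r in protein_ratios]
    log_mrna = [math.log2(r) for r in mrna_ratios]
    return pearson(log_protein, log_mrna)


# Hypothetical anaerobic/aerobic ratios for five genes (placeholders only).
protein = [4.0, 3.0, 1.0, 2.5, 0.5]
mrna = [1.2, 1.0, 0.2, 3.0, 0.8]

r = log2_ratio_correlation(protein, mrna)
print(f"Pearson r on log2 ratios: {r:.2f}")
```

A coefficient near 0 would correspond to the weak coupling described here, whereas values near 1 would mean that protein changes closely follow transcript changes.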
On the other hand, a low correlation between the two data sets can originate from the fact that proteins can behave differently under aerobiosis and anaerobiosis. Therefore, the measurement of in vivo enzyme activities would be an important factor in drawing final conclusions at a systems biology level. Integration of transcriptomics, proteomics, and metabolomics studies, together with the measurement of in vivo enzyme activities, would help us to generate coherent hypotheses and discover new emergent properties that arise from the systemic view. Taken together, quantitative transcriptome and proteome studies on the physiological state of T. acidophilum under aerobic and anaerobic conditions provided us with a detailed and comprehensive overview of the changes in expression levels of mRNAs and proteins. These results can serve as a platform for further studies that will attempt to correlate the proteomics data with the cryo-ET data originating from cellular tomograms of aerobically and anaerobically grown T. acidophilum cells. Further studies on the membrane proteome would provide valuable information on the adaptation capability of T. acidophilum to changing environments, as more than 40% of the membrane protein-encoding genes were regulated at the mRNA level in response to anaerobiosis. Another promising field to investigate is the possible post-transcriptional regulation mechanisms, which might help us to explain the low correlation between transcriptomics and proteomics data. In addition, the identification and quantitation of hypothetical and/or partially characterized proteins and protein complexes provide a basis for further experiments focusing on the isolation and biochemical and/or structural characterization of previously unknown or hitherto uncharacterized complexes, which will facilitate understanding the lifestyle of this extremophile.

Abbreviations: 1D, one-dimensional; 2DE, two-dimensional gel electrophoresis; AU, arbitrary units; BLAST, basic local alignment search tool; cryo-ET, cryo-electron tomography; CV, coefficient of variation; ED pathway, Entner-Doudoroff pathway; EMP pathway, Embden-Meyerhof-Parnas pathway; FTICR, Fourier transform ion cyclotron resonance; GO, Gene Ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes; LTQ, linear quadrupole ion trap; MIPS, Munich information center for protein sequences; MSC, molecular size chromatography; PEDANT, Protein Extraction, Description and ANalysis Tool; TCA cycle, tricarboxylic acid cycle.

Acknowledgment. We thank Dr. Jürgen Cox for helpful discussions. This work was supported by the “Interaction Proteome”, an Integrated Project funded within the Research Framework Programme 6 (FP6) of the European Commission to develop novel technologies for proteomics research.

Supporting Information Available: Supplementary figure and tables. This material is available free of charge via the Internet at http://pubs.acs.org. References (1) Nickell, S.; Kofler, C.; Leis, A. P.; Baumeister, W. A visual approach to proteomics. Nat. Rev. Mol. Cell. Biol. 2006, 7 (3), 225–230. (2) Segerer, A.; Langworthy, T. A.; Stetter, K. O. Thermoplasmaacidophilum and Thermoplasma-volcanium sp-nov from sulfatara fields. Syst. Appl. Microbiol. 1988, 10 (2), 161–171. (3) Ong, S. E.; Blagoev, B.; Kratchmarova, I.; Kristensen, D. B.; Steen, H.; Pandey, A.; Mann, M. Stable isotope labeling by amino acids in cell culture, SILAC, as a simple and accurate approach to expression proteomics. Mol. Cell. Proteomics 2002, 1 (5), 376–86. (4) Washburn, M. P.; Ulaszek, R.; Deciu, C.; Schieltz, D. M.; Yates, J. R. 3rd, Analysis of quantitative proteomic data generated via multidimensional protein identification technology. Anal. Chem. 2002, 74 (7), 1650–7. (5) Shevchenko, A.; Chernushevich, I.; Ens, W.; Standing, K. G.; Thomson, B.; Wilm, M.; Mann, M. Rapid ‘de novo’ peptide sequencing by a combination of nanoelectrospray, isotopic labeling and a quadrupole/time-of-flight mass spectrometer. Rapid Commun. Mass Spectrom. 1997, 11 (9), 1015–24. (6) Gygi, S. P.; Rist, B.; Gerber, S. A.; Turecek, F.; Gelb, M. H.; Aebersold, R. Quantitative analysis of complex protein mixtures using isotope-coded affinity tags. Nat. Biotechnol. 1999, 17 (10), 994–9. (7) Schmidt, A.; Kellermann, J.; Lottspeich, F. A novel strategy for quantitative proteomics using isotope-coded protein labels. Proteomics 2005, 5 (1), 4–15. (8) Ross, P. L.; Huang, Y. N.; Marchese, J. N.; Williamson, B.; Parker, K.; Hattan, S.; Khainovski, N.; Pillai, S.; Dey, S.; Daniels, S.; Purkayastha, S.; Juhasz, P.; Martin, S.; Bartlet-Jones, M.; He, F.; Jacobson, A.; Pappin, D. J. Multiplexed protein quantitation in Saccharomyces cerevisiae using amine-reactive isobaric tagging reagents. Mol. Cell. 
Proteomics 2004, 3 (12), 1154–69. (9) Hanke, S.; Besir, H.; Oesterhelt, D.; Mann, M. Absolute SILAC for accurate quantitation of proteins in complex mixtures down to the attomole level. J. Proteome Res. 2008, 7 (3), 1118–30. (10) Mueller, L. N.; Rinner, O.; Schmidt, A.; Letarte, S.; Bodenmiller, B.; Brusniak, M. Y.; Vitek, O.; Aebersold, R.; Muller, M. SuperHirn - a novel tool for high resolution LC-MS-based peptide/protein profiling. Proteomics 2007, 7 (19), 3470–80. (11) Mortensen, P.; Gouw, J. W.; Olsen, J. V.; Ong, S. E.; Rigbolt, K. T.; Bunkenborg, J.; Cox, J.; Foster, L. J.; Heck, A. J.; Blagoev, B.;

(19) (20)

(21) (22) (23)

(24)

(25)

(26)

(27) (28) (29)

(30)

(31)

Andersen, J. S.; Mann, M. MSQuant, an open source platform for mass spectrometry-based quantitative proteomics. J. Proteome Res. XXX, 9 (1), 393–403. Luber, C. A.; Cox, J.; Lauterbach, H.; Fancke, B.; Selbach, M.; Tschopp, J.; Akira, S.; Wiegand, M.; Hochrein, H.; O’Keeffe, M.; Mann, M. Quantitative proteomics reveals subset-specific viral recognition in dendritic cells. Immunity 2010 32 (2), 279-89 (Epub 2010 Feb 18. PubMed PMID: 20171123). Nickell, S.; Beck, F.; Scheres, S.; Korinek, A.; Foerster, F.; Lasker, K.; Mihalache, O.; Sun, N.; Nagy, I.; Sali, A.; Plitzko, J.; Carazo, J. M.; Mann, M.; Baumeister, W. Insights into the molecular architecture of the 26S proteasome. Proc. Natl. Acad. Sci. U.S.A. 2009, 106 (29), 11943–11947. Robb, F. T.; Place, A. R. Archaea: A Laboratory Manual, Thermophiles; Cold Spring Harbor Laboratory Press: Cold Spring Harbor, NY, 1995; pp 60-61. Sun, N.; Tamura, N.; Tamura, T.; Knispel, R.; Hrabe, T.; Kofler, C.; Nickell, S.; Nagy, I. Size distribution of native cytosolic proteins of Thermoplasma acidophilum. Proteomics 2009, 9 (14), 3783–6. Shevchenko, A.; Tomas, H.; Havlis, J.; Olsen, J. V.; Mann, M. Ingel digestion for mass spectrometric characterization of proteins and proteomes. Nat. Protoc. 2006, 1 (6), 2856–60. Rappsilber, J.; Ishihama, Y.; Mann, M. Stop and go extraction tips for matrix-assisted laser desorption/ionization, nanoelectrospray, and LC/MS sample pretreatment in proteomics. Anal. Chem. 2003, 75 (3), 663–70. Pan, C.; Kumar, C.; Bohl, S.; Klingmueller, U.; Mann, M. Comparative proteomic phenotyping of cell lines and primary cells to assess preservation of cell type-specific functions. Mol. Cell. Proteomics 2009, 8 (3), 443–50. Olsen, J. V.; Ong, S. E.; Mann, M. Trypsin cleaves exclusively C-terminal to arginine and lysine residues. Mol. Cell. Proteomics 2004, 3 (6), 608–14. Cox, J.; Mann, M. 
MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteomewide protein quantification. Nat. Biotechnol. 2008, 26 (12), 1367– 72. Perkins, D. N.; Pappin, D. J.; Creasy, D. M.; Cottrell, J. S. Probabilitybased protein identification by searching sequence databases using mass spectrometry data. Electrophoresis 1999, 20 (18), 3551–67. Waanders, L. F.; Chwalek, K.; Monetti, M.; Kumar, C.; Lammert, E.; Mann, M. Quantitative proteomic analysis of single pancreatic islets. Proc. Natl. Acad. Sci. U.S.A. 2009, 106 (45), 18902–7. Altschul, S. F.; Madden, T. L.; Schaffer, A. A.; Zhang, J.; Zhang, Z.; Miller, W.; Lipman, D. J. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997, 25 (17), 3389–402. Schaffer, A. A.; Aravind, L.; Madden, T. L.; Shavirin, S.; Spouge, J. L.; Wolf, Y. I.; Koonin, E. V.; Altschul, S. F. Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements. Nucleic Acids Res. 2001, 29 (14), 2994–3005. Marchler-Bauer, A.; Anderson, J. B.; Derbyshire, M. K.; DeWeeseScott, C.; Gonzales, N. R.; Gwadz, M.; Hao, L.; He, S.; Hurwitz, D. I.; Jackson, J. D.; Ke, Z.; Krylov, D.; Lanczycki, C. J.; Liebert, C. A.; Liu, C.; Lu, F.; Lu, S.; Marchler, G. H.; Mullokandov, M.; Song, J. S.; Thanki, N.; Yamashita, R. A.; Yin, J. J.; Zhang, D.; Bryant, S. H. CDD: a conserved domain database for interactive domain family analysis. Nucleic Acids Res. 2007, 35, D237–40. Kanehisa, M.; Goto, S.; Hattori, M.; Aoki-Kinoshita, K. F.; Itoh, M.; Kawashima, S.; Katayama, T.; Araki, M.; Hirakawa, M. From genomics to chemical genomics: new developments in KEGG. Nucleic Acids Res. 2006, 34, D354–7. Hirokawa, T.; Boon-Chieng, S.; Mitaku, S. SOSUI: classification and secondary structure prediction system for membrane proteins. Bioinformatics 1998, 14 (4), 378–9. Maere, S.; Heymans, K.; Kuiper, M. 
BiNGO: a Cytoscape plugin to assess overrepresentation of gene ontology categories in biological networks. Bioinformatics 2005, 21 (16), 3448–9. Sun, N.; Beck, F.; Knispel, R. W.; Siedler, F.; Scheffer, B.; Nickell, S.; Baumeister, W.; Nagy, I. Proteomics Analysis of Thermoplasma acidophilum with a Focus on Protein Complexes. Mol. Cell. Proteomics 2007, 6 (3), 492–502. de Godoy, L. M.; Olsen, J. V.; de Souza, G. A.; Li, G.; Mortensen, P.; Mann, M. Status of complete proteome analysis by mass spectrometry: SILAC labeled yeast as a model system. Genome Biol. 2006, 7 (6), R50. Graumann, J.; Hubner, N. C.; Kim, J. B.; Ko, K.; Moser, M.; Kumar, C.; Cox, J.; Scholer, H.; Mann, M. Stable isotope labeling by amino acids in cell culture (SILAC) and proteome quantitation of mouse

Journal of Proteome Research • Vol. 9, No. 9, 2010 4849

research articles (32) (33)

(34)

(35)

(36)

(37)

(38)

(39)

(40)

4850

embryonic stem cells to a depth of 5,111 proteins. Mol. Cell. Proteomics 2008, 7 (4), 672–83. Bonaldi, T.; Straub, T.; Cox, J.; Kumar, C.; Becker, P. B.; Mann, M. Combined use of RNAi and quantitative proteomics to study gene function in Drosophila. Mol. Cell 2008, 31 (5), 762–72. de Godoy, L. M.; Olsen, J. V.; Cox, J.; Nielsen, M. L.; Hubner, N. C.; Frohlich, F.; Walther, T. C.; Mann, M. Comprehensive massspectrometry-based proteome quantification of haploid versus diploid yeast. Nature 2008, 455 (7217), 1251–4. de Groot, M. J.; Daran-Lapujade, P.; van Breukelen, B.; Knijnenburg, T. A.; de Hulster, E. A.; Reinders, M. J.; Pronk, J. T.; Heck, A. J.; Slijper, M. Quantitative proteomics and transcriptomics of anaerobic and aerobic yeast cultures reveals post-transcriptional regulation of key cellular processes. Microbiology 2007, 153 (Pt 11), 3864–78. Guell, M.; van Noort, V.; Yus, E.; Chen, W. H.; Leigh-Bell, J.; Michalodimitrakis, K.; Yamada, T.; Arumugam, M.; Doerks, T.; Kuhner, S.; Rode, M.; Suyama, M.; Schmidt, S.; Gavin, A. C.; Bork, P.; Serrano, L. Transcriptome complexity in a genome-reduced bacterium. Science 2009, 326 (5957), 1268–71. Ernst, H. A.; Lo Leggio, L.; Willemoes, M.; Leonard, G.; Blum, P.; Larsen, S. Structure of the Sulfolobus solfataricus alpha-glucosidase: implications for domain conservation and substrate recognition in GH31. J. Mol. Biol. 2006, 358 (4), 1106–24. Kim, S.; Lee, S. B. Identification and characterization of Sulfolobus solfataricus D-gluconate dehydratase: a key enzyme in the nonphosphorylated Entner-Doudoroff pathway. Biochem. J. 2005, 387 (Pt 1), 271–80. Petoukhov, M. V.; Svergun, D. I.; Konarev, P. V.; Ravasio, S.; van den Heuvel, R. H.; Curti, B.; Vanoni, M. A. Quaternary structure of Azospirillum brasilense NADPH-dependent glutamate synthase in solution as revealed by synchrotron radiation x-ray scattering. J. Biol. Chem. 2003, 278 (32), 29933–9. Jensen, P. E.; Gibson, L. C.; Hunter, C. 
N.; Determinants of catalytic activity with the use of purified, I. D and H subunits of the magnesium protoporphyrin IX chelatase from Synechocystis PCC6803. Biochem. J. 1998, 334 (Pt 2), 335–44. Ruepp, A.; Graml, W.; Santos-Martinez, M. L.; Koretke, K. K.; Volker, C.; Mewes, H. W.; Frishman, D.; Stocker, S.; Lupas, A. N.; Baumeister, W. The genome sequence of the thermoacidophilic scavenger Thermoplasma acidophilum. Nature 2000, 407 (6803), 508–13.

Journal of Proteome Research • Vol. 9, No. 9, 2010

(41) Budgen, N.; Danson, M. J. Metabolism of glucose via a modified Entner-Doudoroff pathway in the thermoacidophilic archaebacterium Thermoplasma acidophilum. FEBS Lett. 1986, 196 (2), 207–210.
(42) Madigan, M.; Martinko, J.; Parker, J. Biology of Microorganisms; Southern Illinois University Carbondale: Carbondale, IL, 2000; p 592.
(43) Ronimus, R. S.; Morgan, H. W. Distribution and phylogenies of enzymes of the Embden-Meyerhof-Parnas pathway from archaea and hyperthermophilic bacteria support a gluconeogenic origin of metabolism. Archaea 2003, 1 (3), 199–221.
(44) Koonin, E. V.; Wolf, Y. I.; Aravind, L. Prediction of the archaeal exosome and its connections with the proteasome and the translation and transcription machineries by a comparative-genomic approach. Genome Res. 2001, 11 (2), 240–52.
(45) Kawashima, T.; Yokoyama, K.; Higuchi, S.; Suzuki, M. Identification of proteins present in the archaeon Thermoplasma volcanium cultured in aerobic or anaerobic conditions. Proc. Jpn. Acad. 2005, 81, 204–19.
(46) Schut, G. J.; Bridger, S. L.; Adams, M. W. Insights into the metabolism of elemental sulfur by the hyperthermophilic archaeon Pyrococcus furiosus: characterization of a coenzyme A-dependent NAD(P)H sulfur oxidoreductase. J. Bacteriol. 2007, 189 (12), 4431–41.
(47) Newman, D. K.; Kolter, R. A role for excreted quinones in extracellular electron transfer. Nature 2000, 405 (6782), 94–7.
(48) Edgren, T.; Nordlund, S. The fixABCX genes in Rhodospirillum rubrum encode a putative membrane complex participating in electron transfer to nitrogenase. J. Bacteriol. 2004, 186 (7), 2052–60.
(49) Xie, G.; Keyhani, N. O.; Bonner, C. A.; Jensen, R. A. Ancient origin of the tryptophan operon and the dynamics of evolutionary change. Microbiol. Mol. Biol. Rev. 2003, 67 (3), 303–42; table of contents.
(50) Buckel, W.; Golding, B. T. Radical enzymes in anaerobes. Annu. Rev. Microbiol. 2006, 60, 27–49.
(51) Stubbe, J.; Ge, J.; Yee, C. S. The evolution of ribonucleotide reduction revisited. Trends Biochem. Sci. 2001, 26 (2), 93–9.
(52) Maden, B. E. Tetrahydrofolate and tetrahydromethanopterin compared: functionally distinct carriers in C1 metabolism. Biochem. J. 2000, 350 (Pt 3), 609–29.

PR100567U