Biology of the Cold Adapted Archaeon - American Chemical Society

Biology of the Cold Adapted Archaeon, Methanococcoides burtonii Determined by Proteomics Using Liquid Chromatography-Tandem Mass Spectrometry

Amber Goodchild,† Mark Raftery,‡ Neil F. W. Saunders,† Michael Guilhaus,‡ and Ricardo Cavicchioli*,†

School of Biotechnology and Biomolecular Sciences, The University of New South Wales, Sydney, 2052, NSW, Australia and Bioanalytical Mass Spectrometry Facility, The University of New South Wales, Sydney, 2052, NSW, Australia

Received June 23, 2004

Genome sequence data of the cold-adapted archaeon, Methanococcoides burtonii, were linked to liquid chromatography-mass spectrometry analysis of the expressed-proteome to define the key biological processes functioning at 4 °C. 528 proteins ranging in pI from 3.5 to 13.2, and in size from 3.5 to 230 kDa, were identified. 133 identities were for hypothetical proteins, and the analysis of these is described separately (Goodchild et al. manuscript in preparation). DNA replication and cell division involve eucaryotic-like histone and MC1-family DNA binding proteins, and 2 bacterial-like FtsZ proteins. Eucaryotic-like core RNA polymerase machinery, a bacterial-like antiterminator, and numerous bacterial-like regulators enable transcription. Motility involves flagella synthesis regulated by a bacterial-like chemotaxis system. Lsmα and Lsmγ were coexpressed, raising the possibility of homo- and hetero-oligomeric complexes functioning in RNA processing. Expression of FKBP-type and cyclophilin-type peptidyl-prolyl cis-trans isomerases highlights the importance of protein folding, and novel characteristics of folding in the cold. Thirteen proteins from a superoperon system encoding proteasome and exosome subunits were expressed, supporting the functional interaction of transcription and translation pathways in archaea. Proteins involved in every step of methylotrophic methanogenesis were identified. CO2 appears to be fixed by a modified Calvin cycle, and by carbon monoxide dehydrogenase. Biosynthesis involves acetyl-CoA conversion to pyruvate by a non-oxidative pentose phosphate pathway, and gluconeogenesis for the conversion of pyruvate to carbohydrates. An incomplete TCA cycle may supply biosynthetic intermediates for amino acid biosynthesis. A novel finding was the expression of Tn11- and Tn12-family transposases, which has implications for genetic diversity and fitness of natural populations. Characteristics of the fundamental cellular processes inferred from the expressed-proteome highlight the evolutionary and functional complexity existing in this domain of life.

Keywords: proteome • LC/LC-MS/MS • archaea • methanogen • psychrophile • relative synonymous codon usage • proteasome/exosome superoperon • archaeal biology • transposon expression

* To whom correspondence should be addressed. Tel: +61-2-93853516. Fax: +61-2-93852742. E-mail: [email protected].
† School of Biotechnology and Biomolecular Sciences, The University of New South Wales.
‡ Bioanalytical Mass Spectrometry Facility, The University of New South Wales.

Journal of Proteome Research 2004, 3, 1164-1176 (Vol. 3, No. 6)
Published on Web 10/20/2004
10.1021/pr0498988 CCC: $27.50
© 2004 American Chemical Society

Introduction

Insight into the biology of Archaea has been rapidly advanced through genome sequencing. Presently, 19 complete genome sequences and at least 27 draft genomes at various stages of completion are available. Genomic studies on cold-adapted methanogens, Methanococcoides burtonii and Methanogenium frigidum, have enabled a comparison of archaeal genomes with species capable of growth at temperatures ranging from 0 °C to 110 °C.1 The study highlighted structural and compositional features of proteins and tRNA that were characteristic of growth temperature, and identified novel, putative nucleic-acid-binding proteins present only in the cold-adapted Archaea. The availability of the draft genome sequence for M. burtonii has greatly facilitated global2 and targeted3 functional studies. The comparative genomics analysis highlighted the limitations that GC content would place on the flexibility of tRNA from M. burtonii, and proposed that a putative dihydrouridine synthase may facilitate the incorporation of dihydrouridine to increase tRNA flexibility.1 This was experimentally examined, and dihydrouridine was found to be present in M. burtonii at levels higher than in any other Archaea previously examined.3

Two-dimensional electrophoresis (2DE) was developed to examine differential expression for cells growing at 4 °C and 23 °C.2 Forty-three of a total of 54 spots with differential spot intensities were identified by mass spectrometry (MS). Cold adaptation was shown to involve proteins important in transcription, protein folding and metabolism, and regulation of gene expression was shown to involve posttranslational modification (PTM), expression of genes in operons and the incorporation of pyrrolysine in a trimethylamine methyltransferase. To fully exploit the draft genome sequence and obtain a comprehensive view of the biology of the cell, in this study we have determined the expressed proteome for M. burtonii growing at 4 °C.

In Archaea, the majority of proteomics studies have been performed on the hyperthermophile, Methanocaldococcus jannaschii, which grows at 85 °C.4 This is due to the availability, since 1996, of a complete genome sequence,5 and the inherent stability of its proteins. Studies canvassing large numbers of M. jannaschii proteins include the identification of 170 of the most abundant 2DE spots using a combination of in-gel digestion and MALDI-TOF MS or LC-MS/MS,6 and 72 proteins with 100% sequence coverage using a combination of pre-fractionation and LC-MS/MS.7 Improvements in multidimensional liquid chromatography coupled to tandem mass spectrometry7-10 and sample preparation11,12 have enabled proteins to be analyzed from a range of complex systems. We have used a combined LC-MS/MS and LC/LC-MS/MS approach to identify 528 M. burtonii proteins. This has not only enabled a greater understanding of the biology of the organism to be derived, but has illustrated the capacity to perform high-throughput analyses of proteins from cold-adapted organisms.

Experimental Procedures

Organisms and Culture Conditions. M. burtonii was grown in liquid modified methanogen growth media (MFM) under anaerobic conditions in a gas phase of 80:20 N2:CO213 at 4 °C. Culture inocula were taken from actively growing batches of cells at their respective temperatures, and cells were passaged at least once prior to harvesting for biomass. The cultures were harvested at late logarithmic phase (absorbance 0.25 at 600 nm) by centrifugation at 2800 g for 25 min at 4 °C.

Sample Preparation. Cell pellets from 50 mL cultures were resuspended in 1.0 mL of Tris-HCl pH 8.0 and disrupted on ice by sonication with a Branson Sonifier for 4 cycles of 30 s on a 30% duty cycle and a power setting of 3. Cell debris was removed by centrifugation at 10 000 g for 25 min at 4 °C and the supernatant containing 1 mg protein was transferred to a 1.0 mL spin dialysis unit (Millipore). The protein samples were dialyzed against Tris-HCl pH 8.0 to remove excess salt and digested using trypsin or chymotrypsin (1:100 enzyme:protein) in 10 mM NH4HCO3 at 37 °C for 14 h.

For prefractionation, proteins were solubilized using formic acid (100 µL, 25%), loaded onto a C18 RP-column (Agilent, C18, Zorbax 300SB, 3.5 µm, 3 × 150 mm) and eluted with a linear gradient of 25% to 75% acetonitrile (0.1% TFA) at 0.4 mL/min over 30 min. Fractions were collected every min from 12 to 26 min. Each fraction (150 µL) was lyophilised then digested using trypsin (2.5 µg) in ammonium bicarbonate (50 µL, 20 mM) for 14 h at 37 °C. Peptides were analyzed by nano LC-MS or automated LC/LC-MS as described. Following fractionation, 100 µL of each fraction was dried in vacuo, resuspended in 10 µL of MilliQ, denatured at 80 °C for 5 min and loaded onto a 12% acrylamide gel containing SDS. Following electrophoresis in Tris-Glycine-SDS (0.3% w/v Tris, 1.8% w/v glycine, 0.1% w/v SDS) running buffer, the gels were silver stained and visualized using a UMAX Powerlook 1000 scanner.
Liquid Chromatography and Mass Spectrometry. Digested peptides were separated by online cation exchange (SCX) and nano C18 LC using an Ultimate HPLC, Switchos and Famos autosampler system (LC-Packings, Amsterdam, Netherlands). Peptides (∼500 ng) were dissolved in formic acid (0.1%, 25 µL) and loaded onto a SCX micro trap (1 × 8 mm, Michrom Bioresources, Auburn, CA). Peptides were eluted sequentially using 5, 10, 15, 20, 25, 30, 40, 50, 75, 150, 300, and 1000 mM ammonium acetate (20 µL). The unbound load fraction and each salt step were concentrated and desalted onto a micro C18 precolumn (500 µm × 2 mm, Michrom Bioresources) with H2O:CH3CN (98:2, 0.1% formic acid, buffer A) at 20 µL/min. After a 10 min wash the precolumn was switched (Switchos) into line with an analytical column containing C18 RP silica (PEPMAP, 75 µm × 15 cm, LC-Packings) or a fritless C18 column (75 µm × ∼12 cm14). Peptides were eluted using a linear gradient of buffer A to H2O:CH3CN (40:60, 0.1% formic acid, buffer B) at 200 nL/min over 60 min. The column was connected via a fused silica capillary to a low volume tee (Upchurch Scientific) where high voltage (2300 V) was applied and a nano electrospray needle (New Objective) or fritless column outlet was positioned ∼1 cm from the orifice of an API QStar Pulsar i hybrid tandem mass spectrometer (Applied Biosystems, Foster City, CA). Positive ions were generated by electrospray and the QStar was operated in information dependent acquisition mode (IDA). A TOF-MS survey scan was acquired (m/z 350-1700, 0.75 s) and the 2 largest multiply charged ions (counts > 20, charge state ≥2 and ≤4) were sequentially selected by Q1 for MS/MS analysis. Nitrogen was used as collision gas and an optimum collision energy was chosen (based on charge state and mass). Tandem mass spectra were accumulated for 2 s (m/z 65-2000).

MS/MS Data Analysis. Processing scripts generated data suitable for submission to the database search programs (Mascot, Matrix Science or SEQUEST). Extracted spectra were also analyzed using DTASelect to simplify interpretation.15 All MS/MS spectra were searched against a local database of M. burtonii translated sequences obtained from http://www.jgi.doe.gov/JGI_microbial/html/ using SEQUEST and Mascot. A total of 39 243 MS/MS spectra were searched against the database. Spectra which satisfied the following parameters were subjected to manual inspection in order to make an identification: strict trypsin enzyme digestion or chymotrypsin digestion, peptide mass tolerance of 3 for [M+3H]3+ or P-score indicating identity (Mascot), deltaCN < 0.8 (SEQUEST) and rank #1 (Mascot).

Genome Analysis. Gene models were generated from the draft genome assembly by the ORNL Computational Biology division using the programs Generation, Glimmer and Critica. Contig_Gene numbers are from the 11Dec03 release of the M. burtonii genome sequence annotation. The molecular weight (MW) and isoelectric point (pI) for each protein were predicted using EMBOSS. Transmembrane helices were predicted using TMHMM 2.0. The codon usage analysis was performed using the GCUA package and genes were plotted at their coordinates on the first two axes produced by the analysis.16 Sequence coverage was calculated using SEQUEST. Gene clusters were visualized using Generic genome browser version 1.59. PRIAM17 was used to detect enzymes in the M. burtonii draft genome sequence that were present in the ENZYME database.
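The strict trypsin rule and the survey-scan window quoted above (m/z 350-1700, charge states 2 to 4) can be illustrated with a short in-silico digest. The following Python sketch is not the processing pipeline used in this study. The function names, the example sequence and the "no cleavage before proline" convention are assumptions made for illustration, and modifications and missed cleavages are ignored.

```python
import re

# Monoisotopic residue masses (Da) for the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # mass of H2O added to the residue sum
PROTON = 1.00728   # mass of one proton per positive charge


def tryptic_peptides(protein):
    """Strict trypsin rule (assumed): cleave C-terminal to K or R, not before P."""
    # Splitting on a zero-width match requires Python 3.7 or later.
    return [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]


def mz(peptide, charge):
    """Monoisotopic m/z of an unmodified peptide carrying `charge` protons."""
    mass = sum(MONO[aa] for aa in peptide) + WATER
    return (mass + charge * PROTON) / charge


def survey_candidates(protein, lo=350.0, hi=1700.0, charges=(2, 3, 4)):
    """Peptide ions that fall inside the survey-scan window from the text."""
    hits = []
    for pep in tryptic_peptides(protein):
        for z in charges:
            value = mz(pep, z)
            if lo <= value <= hi:
                hits.append((pep, z, round(value, 4)))
    return hits


if __name__ == "__main__":
    # Hypothetical sequence, not an M. burtonii protein.
    example = "MKAGRPLKTEELVNQAADFYSGLR"
    print(tryptic_peptides(example))
    print(survey_candidates(example))
```

A chymotrypsin digest could be sketched the same way by changing the cleavage pattern, although the exact residue specificity applied by the search programs is not stated in the text.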


Figure 1. Identification of proteins in total and fractionated protein samples using LC-MS/MS and LC/LC-MS/MS. Total M. burtonii proteins were separated by reverse phase chromatography with a C18 column, and each fraction was analyzed by SDS-PAGE, prior to LC-MS/MS analysis (Table 1). Numbers above the gel are fraction numbers. UF, unfractionated sample; MW, Molecular weight protein standard (kDa): 224, 116, 96, 66, 51.5, 35.3, 28.7, and 21.

Results and Discussion

Identifying Proteins. In excess of 39 000 MS/MS spectra were generated from 3 separate extracts of M. burtonii grown at 4 °C. Two of the extracts were analyzed by 4 separate LC/LC-MS/MS runs. The third extract was separated by C18 chromatography and analyzed using a combination of LC-MS/MS and LC/LC-MS/MS (Figure 1, Table 1). A limitation of analyzing complex mixtures of peptides is peptide coelution, and generally only the 2 most intense peptides are selected for MS/MS. To reduce coelution, the total protein extract was prefractionated using a C18 column. Four hundred forty-seven proteins were identified from the total protein extract using LC/LC-MS/MS (Table 1). An additional 41 proteins were identified by LC-MS/MS from a total of 21 pre-fractionation fractions, and a further 40 proteins were identified by analyzing pooled fractions 12-14, using LC/LC-MS/MS (Figure 1, Table 1). The total protein sample was digested with trypsin, chymotrypsin, and a mixture of the two enzymes. The combination of digestion regimes increased the coverage of identifications of low abundance proteins. The sequence coverage of individual proteins ranged from 0.7 to 94% (Figure 2a). Two hundred one proteins (38%) were identified from a match to a single peptide. All identifications met strict criteria, including a high SEQUEST or Mascot score (see Experimental Procedures), a monoisotopic mass error for precursor and product ions of less than 0.15 Da, and verification of all tandem mass spectra by visual inspection. After data-filtering to eliminate low-scoring spectra and manual verification of spectra, 528 proteins were identified. This represents approximately 19% of the 2782 predicted proteins from the M. burtonii draft genome sequence. Three hundred seventy-nine proteins were identified by both SEQUEST and Mascot. An additional 66 proteins were only identified by Mascot, and 83 proteins were identified only by SEQUEST.
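The accumulated and sample-unique counts reported in Table 1 amount to set arithmetic over the identifications from successive runs. The minimal Python sketch below assumes that "unique to sample" means proteins not seen in any earlier run, which the table footnote does not state explicitly. The function name and the protein identifiers are made up for illustration.

```python
def accumulate_ids(runs):
    """Tabulate identifications across runs.

    `runs` is a list of (label, set_of_protein_ids) in analysis order.
    Returns rows of (label, total IDs, accumulated IDs, IDs new to this run).
    """
    seen = set()
    rows = []
    for label, ids in runs:
        new = ids - seen          # proteins not identified in any earlier run
        seen |= new               # running union of all identifications
        rows.append((label, len(ids), len(seen), len(new)))
    return rows


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    toy_runs = [
        ("run 1", {"p1", "p2", "p3"}),
        ("run 2", {"p2", "p3", "p4"}),
        ("run 3", {"p1", "p5"}),
    ]
    for row in accumulate_ids(toy_runs):
        print(row)
```

Under this reading, the final accumulated count is the sum of the per-run new identifications, which is how the 447, 41 and 40 proteins quoted in the text add up to 528.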
With the exception of RNA polymerase (RNAP) subunit E, all proteins previously identified on 2DE gels2 were identified in this study. Similar methods of protein solubilization were used in both studies. The silver-stained protein-spot for RNAP E was low intensity and multiple gels were pooled to obtain sufficient protein for identification.2 Low protein abundance may explain why it was not identified in the present work, despite the identification of all other components of the RNAP complex (see below).

Table 1. Number of Proteins Identified by LC-MS/MS and LC/LC-MS/MS a

sample and method                    run/fraction no.   total no. IDs   accumulated IDs   IDs unique to sample
whole sample, LC/LC-MS/MS            1                  278             278               278
whole sample, LC/LC-MS/MS            2                  245             362               84
whole sample, LC/LC-MS/MS            3                  131             400               47
whole sample, LC/LC-MS/MS            4                  84              447               38
fractionated sample, LC-MS/MS        1                  1               447               0
fractionated sample, LC-MS/MS        2                  0               447               0
fractionated sample, LC-MS/MS        3                  1               447               0
fractionated sample, LC-MS/MS        4                  0               447               0
fractionated sample, LC-MS/MS        5                  1               447               0
fractionated sample, LC-MS/MS        6                  4               448               1
fractionated sample, LC-MS/MS        7                  3               448               0
fractionated sample, LC-MS/MS        8                  18              455               7
fractionated sample, LC-MS/MS        9                  30              465               10
fractionated sample, LC-MS/MS        10                 39              469               4
fractionated sample, LC-MS/MS        11                 52              483               14
fractionated sample, LC-MS/MS        12                 55              485               2
fractionated sample, LC-MS/MS        13                 38              487               2
fractionated sample, LC-MS/MS        14                 20              487               0
fractionated sample, LC-MS/MS        15                 14              488               1
fractionated sample, LC-MS/MS        16                 14              488               0
fractionated sample, LC-MS/MS        17                 0               488               0
fractionated sample, LC-MS/MS        18                 0               488               0
fractionated sample, LC-MS/MS        19                 0               488               0
fractionated sample, LC-MS/MS        20                 0               488               0
fractionated sample, LC-MS/MS        21                 0               488               0
fractionated sample, LC/LC-MS/MS     12-14              173             528               40

a Protein identifications (IDs) from 4 separate LC/LC-MS/MS runs of whole cell extracts, LC-MS/MS runs of 21 prefractionated samples (Figure 1), and a LC/LC-MS/MS run of pooled fractions 12-14 (Figure 1).

General Characteristics of Expressed Proteins. The pI and MW of the 2782 proteins from the M. burtonii predicted proteome were compared with the 528 proteins identified from the expressed-proteome (Figure 2b). The proteins identified covered a broad range of predicted pIs and MW, and the distribution of proteins was similar in the predicted- and expressed-proteomes. The most basic and most acidic proteins identified had pIs of 13.2 and 3.5, respectively. The largest protein was approximately 230 kDa and the smallest 3.5 kDa. Seven proteins from the predicted proteome were more acidic than the most acidic protein identified in the expressed-proteome. Twenty-three proteins in the predicted proteome were smaller than the smallest protein identified in the expressed-proteome. Approximately 9% of the identified proteins were predicted to contain between 1 and 16 transmembrane domains (TMDs) (Figure 2c). Of these, 14 matched hypothetical proteins and 33 matched proteins with known functions, including flagellin proteins, transport proteins, membrane-bound enzymes and a histidine kinase member of a two-component regulatory system (2CRS). In addition to TMDs, 26 proteins were predicted to contain signal sequences, indicating they are likely to be targeted for export from the cytoplasm. Signal peptide prediction for archaeal proteins is less advanced than for bacterial or eukaryotic proteins, so this number may be an underestimate. The signal peptide-containing proteins included transport proteins, flagellins, a sensory histidine kinase, and a number of hypothetical proteins.

Figure 2. (A) Percentage of protein sequence obtained for proteins identified by LC-MS. (B) Virtual 2D gel of the M. burtonii proteome. Predicted MW and pI for 2782 predicted proteins (grey, open circles) and 528 identified proteins from the expressed-proteome (closed, black circles). Boxed region shows coverage by 2DE from previous studies with pI-strips 4-7 (ref 2) and 6-11 (Goodchild et al. unpublished results). (C) Number of predicted transmembrane domains in identified proteins.

Figure 3. Correspondence analysis of codon usage frequency variation for genes from M. burtonii. (A) predicted proteome (grey, open circles) and expressed-proteome (black, solid circles); (B) predicted proteome (grey, open circles), ribosomal proteins (pink, closed triangles) and transcriptional regulators (blue, closed triangles); (C) expressed-proteome (black, solid circles), proteins identified in all 5 LC/LC-MS/MS runs (pink, closed triangles) and 2 transposases (blue, closed triangles).

Codon usage can be a useful indicator of gene expression levels.16,18-19 To examine biases in codon usage for the genes responsible for the expressed-proteome, the relative synonymous codon usage (RSCU) was calculated for all genes from the M. burtonii genome data, and the pattern of codon usage was analyzed by correspondence analysis16 (Figure 3). The 528 expressed proteins were present throughout the plot with a bias toward the right-hand side (Figure 3a). Subsets of proteins assumed to be present at high (ribosomal proteins) or low (transcriptional regulators) cellular levels were compared (Figure 3b). The two sets were separated, with the high abundance proteins distributed on the right-hand side of the plot. This indicates that even though proteins with low abundance were identified (left-hand side, Figure 3a), the expressed-proteome consists of a disproportionately high number of the abundant proteins (right-hand side, Figure 3a). This is likely to reflect the relative ease of separating and analyzing the more abundant proteins. It may also be an indication that the genes with a preferred codon usage are more commonly expressed in the cell under the growth conditions tested. To discriminate between the two possibilities, the distribution of proteins which appeared in all of the LC/LC-MS/MS runs (and were therefore abundant proteins) was plotted (Figure 3c). The distribution bias of abundant proteins to the right-hand side was a strong indication that the likelihood of identifying a protein in the expressed-proteome was higher if the protein was abundant.
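For readers unfamiliar with RSCU, the value for a codon is its observed count divided by the count expected if all synonymous codons for that amino acid were used equally. The Python sketch below computes RSCU for a single coding sequence. It is not the GCUA implementation used in this study and omits the correspondence analysis step. It assumes the standard genetic code, so UAG is treated as a stop codon even though it can encode pyrrolysine in some methanogen methyltransferase genes, and the function name and test sequence are illustrative only.

```python
from collections import Counter, defaultdict

# Standard genetic code built from the conventional TCAG ordering (assumption:
# no organism-specific recoding such as UAG -> pyrrolysine).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def rscu(cds):
    """Relative synonymous codon usage for one in-frame coding sequence.

    RSCU(codon) = count(codon) * n / sum(counts of its n synonymous codons).
    Single-codon families (Met, Trp) and amino acids absent from the
    sequence are omitted from the result.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    values = {}
    for aa, family in families.items():
        total = sum(counts[c] for c in family)
        if total == 0 or len(family) == 1:
            continue
        for codon in family:
            values[codon] = counts[codon] * len(family) / total
    return values


if __name__ == "__main__":
    # Toy sequence: four alanine codons (GCT twice, GCC once, GCA once).
    print(rscu("GCTGCTGCCGCA"))
```

An RSCU of 1.0 means the codon is used as often as expected under equal usage, and values above 1.0 indicate a preferred codon. Per-gene vectors of such values are the kind of input a correspondence analysis would take.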

Table 2. DNA Replication, DNA Processing, Cell Division, and Transposase Proteins

contig_gene no. a   functional category

DNA replication and processing
64_1151             DNA primase
65_1320             DNA-directed DNA polymerase, subunit 2
70_2300             DNA-polymerase sliding clamp (DnaN)
70_2448             DNA topoisomerase subunit A
70_2510             DNA topoisomerase, type I
61_830              archaeal histone
62_1006             putative DNA-binding protein
70_2392             purine NTPase
64_1167             replication factor-A protein
70_2292             replication factor-C subunit
65_1250             DNA repair protein RadA
70_2346             DNA repair helicase Rad3
58_669              DNA helicase RecQ

transposases
63_1044             transposase
70_2561             transposase

cell division and chromosome partitioning
61_900              cell division protein pelota
70_2394             chromosome partition protein
58_664              chromosomal protein MC1
69_2089             cell division protein FtsZ
70_2719             cell division protein FtsZ
65_1306             cell division control protein 48 AAA family protein
70_2683             nucleotide binding protein

defense mechanisms
69_1999             type I site-specific deoxyribonuclease
69_2023             type RM system methylation subunit
66_1357             type I site-specific RM system
65_1216             metallo-β-lactamase
69_1993             catalase
36_12               universal stress protein
70_2309             universal stress protein UspA

a Contig_Gene numbers are from the 11Dec03 release of the M. burtonii genome sequence annotation.

Inferring Cell Biology from Gene Function and Genome Organization. 391 proteins were identified that could be linked to distinct biological processes (Tables 2-6), and the physical organization of genes provided insight into associated gene partners, and gene regulation (e.g., putative operons). The physical arrangement of every expressed-gene relative to neighboring genes is in Supporting Information (Figure S1). The biological processes have been color coded and illustrated in a cartoon of the cell (Figure 4), with the colors of the genes matched to their gene organization (Figure S1). 133 identities were for hypothetical proteins, and the analysis of these is described separately (Goodchild et al. manuscript in preparation).

DNA Replication and Cell Division. The DNA replication machinery in archaea more closely resembles that of eucarya than bacteria, and the precise mechanisms of the replication processes, particularly initiation, are being actively investigated.20,21 Genome segregation and cell division processes are similarly beginning to be defined, and distinctions between euryarchaeota and crenarchaeota are emerging.22 Euryarchaeota encode multiple homologues of the bacterial ftsZ gene, whereas these are absent from the genome sequences of crenarchaeota.22 M. burtonii encodes two ftsZ genes, both of which were identified in the expressed-proteome (Table 2). The fact that cells were harvested during steady-state growth indicates that both copies are required during the normal cell cycle, rather than being required for a specific physiological state, such as entry into stationary phase or response to stress.

Table 3. Proteins Involved in Transcription, Signal Transduction, and Motility a

contig_gene no. a   functional category

RNA synthesis and processing
59_730*             predicted transcription factor MBF1
70_2299             archaeal transcription factor S
53_393              NAC α-BTF3 transcription factor
70_2397             TATA-box binding protein
69_2091             transcription antiterminator protein NusG
40_28               transcriptional regulator, CopG family
51_299              transcriptional regulator, ArsR family
51_300              transcriptional regulator, Crp family
49_247              transcriptional regulator, MarR family
60_792              transcriptional regulator, AsnC family
56_533              predicted transcriptional regulator
67_1558             predicted transcriptional regulator
69_2063             DEAD-box RNA helicase
57_579              DNA-directed RNA polymerase I, II and III 7.3 kDa polypeptide
66_1386             DNA-directed RNA polymerase subunit A
66_1387             DNA-directed RNA polymerase subunit A′′
66_1384             DNA-directed RNA polymerase subunit B
66_1385             DNA-directed RNA polymerase subunit B′
58_659              DNA-directed RNA polymerase subunit D
66_1382             DNA-directed RNA polymerase subunit H
48_213              DNA-directed RNA polymerase subunit K
61_866*             DNA-directed RNA polymerase subunit L
48_212              DNA-directed RNA polymerase subunit N
56_562*             DNA-directed RNA polymerase subunit P
69_1955             DNA-directed RNA polymerase, subunit F
62_982*             cleavage and polyadenylation specificity factor
64_1163             archaeosine tRNA-ribosyltransferase
56_522              small nuclear ribonucleoprotein Lsmγ
70_2310             small nuclear ribonucleoprotein Lsmα
61_899              predicted RNA-binding protein
70_2593             RNase P subunit RPR2
56_559*             RNase PH
67_1596             RNase L inhibitor
56_557*             predicted exosome subunit

signal transduction
69_2126             sensor histidine kinase
47_167              sensor histidine kinase
68_1785             sensor histidine kinase
69_2125             two-component sensor
59_736              response regulator
70_2500             response regulator
70_2354             response regulator
54_441              two-component hybrid sensor and regulator
64_1162             two-component hybrid sensor and regulator
69_2184             two-component hybrid sensor and regulator

motility
57_629              methylaccepting chemotaxis protein
57_630              chemotaxis signal transduction protein CheW
57_632              chemotaxis response regulator CheY
57_620              flagellin
57_619              flagellin B1 precursor
62_1015             flagellin B2 precursor
70_2468             putative flagella related protein J

a Contig_Gene numbers are from the 11Dec03 release of the M. burtonii genome sequence annotation. *Members of the superoperon and associated exosome/proteasome genes (Figure 5) are marked with an asterisk.

Genomic DNA is organized into nucleosomes in euryarchaeota by eucaryotic-like histone proteins and other DNA binding proteins including members of the MC1 family.23 In eucaryotes, histones are subject to PTM, primarily through N- and C-terminal extensions of the histone core; regions which appear to be absent in archaeal proteins.23 M. burtonii encodes a histone acetyltransferase (69_2122), which could function to acetylate histones. The histone acetyltransferase was not detected in the expressed-proteome, and no evidence of acetylation of the histone could be detected in the MS-spectra.

Table 4. Proteins Involved in Translation and Protein Folding a

contig_gene no. a   functional category

protein synthesis and processing
69_2093             LSU ribosomal protein L1P
46_134              LSU ribosomal protein L3P
46_135              LSU ribosomal protein L4P
70_2210             LSU ribosomal protein L7AE
69_2094             LSU ribosomal protein LPO
61_852              LSU ribosomal protein L10AE
69_2092             LSU ribosomal protein L11P
69_2095             LSU ribosomal protein L12AE
48_210              LSU ribosomal protein L13P
34_10               LSU ribosomal protein L15P
34_7                LSU ribosomal protein L18P
48_209              LSU ribosomal protein L18E
69_1954             LSU ribosomal protein L21E
46_136              LSU ribosomal protein L23P
34_9                LSU ribosomal protein L30P
66_1388             LSU ribosomal protein L30E
62_1009             LSU ribosomal protein L31E
34_5                LSU ribosomal protein L32E
56_561*             LSU ribosomal protein L37AE
62_1008             LSU ribosomal protein L39E
63_1053             LSU ribosomal protein L40E
61_835              LSU ribosomal protein L44E
62_1011             LSU ribosomal protein LX
48_214              SSU ribosomal protein S2P
53_409              SSU ribosomal protein S3AE
58_657              SSU ribosomal protein S4P
34_8                SSU ribosomal protein S5P
70_2215             SSU ribosomal protein S6E
66_1391             SSU ribosomal protein S7P
60_791              SSU ribosomal protein S8E
48_211              SSU ribosomal protein S9P
56_532              SSU ribosomal protein S10P
66_1394             SSU ribosomal protein S10P
58_658              SSU ribosomal protein S11P
66_1390             SSU ribosomal protein S12P
58_656              SSU ribosomal protein S13P
53_400              SSU ribosomal protein S15P
62_961              SSU ribosomal protein S17E
62_1005             SSU ribosomal protein S19E
69_1919             SSU ribosomal protein S24E
61_836              SSU ribosomal protein S27E
70_2211             SSU ribosomal protein S28E
45_97               ribosomal S6 modification protein
50_257              ribosomal S6 modification protein
70_2305             alanyl-tRNA synthetase
70_2420             tyrosyl-tRNA synthetase
64_1129             arginyl-tRNA synthetase
56_515              isoleucyl-tRNA synthetase
67_1526             lysyl-tRNA synthetase
69_2066             methionyl-tRNA synthetase
55_489              threonyl-tRNA synthetase
63_1050             glutamyl-tRNA (Gln) amidotransferase subunit C
63_1051             glutamyl-tRNA (Gln) amidotransferase
56_560*             tRNA nucleotidyltransferase Rrp42
69_1914             translation initiation factor 2 γ subunit
67_1504             translation initiation factor 5A
70_2214             translation initiation factor 2
60_826              protein translation initiation factor 1
66_1393             translation elongation factor 1A
55_473              translation elongation factor 1, subunit β
66_1392             translation elongation factor 2
61_901              2′-5′ RNA ligase

post-translational modification, protein degradation and chaperones
62_1012             prefoldin, subunit α
56_565*             prefoldin, subunit β
56_556*             20S proteasome, α subunit
62_981*             proteasome β subunit precursor
55_495              molecular chaperone GrpE
55_496              chaperone protein DnaK
55_497              chaperone protein DnaJ
70_2297             ClpB protein
70_2681             Hsp60
68_1682             Hsp60
69_2100             Hsp60
70_2255             FKBP-type peptidyl-prolyl cis-trans isomerase
70_2402             cyclophilin-type peptidyl-prolyl cis-trans isomerase
67_1503             metallopeptidase
69_2181             ATP-dependent protease
69_2182             PmbA/TldD family protein
47_168              alkyl hydroperoxide reductase
67_1523             deoxyhypusine synthase
59_747              glutaredoxin-like protein
69_2007             pyruvate-formate lyase-activating enzyme
67_1593             thioredoxin
67_1542             stomatin-like protein
67_1555             hydrogenase maturation factor

a Contig_Gene numbers are from the 11Dec03 release of the M. burtonii genome sequence annotation. *Members of the superoperon and associated exosome/proteasome genes (Figure 5) are marked with an asterisk.

This is consistent with a lack of PTMs in histones from both M. jannaschii and Methanosarcina acetivorans, despite the presence of histone modifying genes in M. acetivorans, and supports the view that histone modifications are not required for their function in methanogens.7 The expression of MC1 is also an indication that both classes of nucleoid proteins are active during the cell cycle.

DNA Repair, Transposons and Fitness. DNA repair processes in archaea appear to have a mixed ancestry with homologues present in bacteria and eucarya, and with major differences between crenarchaeota and euryarchaeota.24 M. burtonii expressed eukaryotic-like DNA recombination proteins RadA and Rad3 and the bacterial-like protein, RecQ. A hypothetical protein with sequence characteristics of a bacterial-like RecJ was also identified (Goodchild et al. manuscript in preparation). These may be good candidates for investigating DNA processing at low temperature; an area of cold adaptation biology that has not been addressed.

An interesting finding was the expression of 2 transposase (Tn) genes. 63_1044 is a member of the Tn11 family which includes IS4, IS421, IS5377, IS427, IS1355, and IS5, and similar genes were found only in archaea (methanogens and Sulfolobus solfataricus). 70_2561 is in the Tn12 family (IS204, IS1001, IS1096, IS1165), and similar genes were found in M. acetivorans and a range of bacteria (Figure 5a). 63_1044 is clustered near 4 other Tn genes, and at least 9 small ORFs (less than 300 bp) which have sequences unique to M. burtonii. 70_2561 was located near 3 Tn genes and 6 hypothetical genes (Figure 5a). One of the hypothetical proteins (70_2564) had a Pfam match (PF02517) which included genes encoded on a conjugative plasmid, and bacteriocin-like peptides. The GC content for the 2 transposase ORFs was 31-32%, compared with 41% for the draft genome. These characteristics are indicative of gene capture events mediated by transposons following gene transfer, presumably from a bacterial host. 82 Tn genes are predicted in the M. burtonii draft genome, compared with approximately


Table 5. Proteins Involved in Transport, Methanogenesis, and Energy Production contig_gene no.a

65_1304 70_2267 70_2340 67_1512 70_2289 65_1312 58_686 70_2654 70_2764 69_2016 69_2017 60_820 66_1439 67_1616 62_990 70_2422 63_1031 55_499 69_1942 69_1943 69_1944 69_1945 61_917 61_918 61_919 51_303 51_306 51_307 51_308 69_1921 69_1922 69_1923 69_1924 61_916 68_1755 68_1779 68_1780 68_1781 68_1787 68_1789 60_801 65_1325 65_1322 65_1323 66_1477 66_1478 66_1475 66_1474 66_1476 66_1471 66_1472 66_1473 53_416 52_374 61_904 68_1828 a

functional category

contig_gene no.a

functional category

cell envelope UDP-glucose 4-epimerase 62_943 putative cell wall biosynthesis regulatory protein glucose-1-phosphate thymidylyltransferase 68_1660 surface antigen gene UTP glucose-1-phosphate uridylyltransferase 48_170 myo-inositol-1-phosphate synthase sugar phosphate isomerase 62_944 myo-inositol-2 dehydrogenase L-fuculose-phosphate aldolase 69_2077 acyl carrier protein synthase N-acetyl-D-galactosaminuronic acid dehydrogenase 69_2078 acetoacetyl-CoA thiolase UDP-N-acetyl-D-mannosaminuronate dehydrogenase 65_1217 geranylgeranyl pyrophosphate synthase putative cell wall biosynthesis regulatory protein 66_1469 geranyltransferase transport type II secretion system protein (GspE-3) 69_2097 sodium:proline transporter ABC transporter, ATP binding protein 59_712 Fe2+ transport system protein A ABC transporter, ATP binding protein 59_711 ferrous iron transport protein A ABC transporter, ATP-binding protein 52_366 P-type copper transporting ATPase ABC transporter, ATP-binding protein 69_2104 cobalt transport protein ABC transporter, ATP-binding protein 68_1709 zinc ABC transporter solute binding lipoprotein probable TolB-related transport protein 70_2318 molybdenum transport protein ModA K+ transport system membrane component 52_365 mercury ion binding protein cation-transporting ATPase 54_423 predicted permease sodium transport protein 62_996 signal recognition particle methanogenesis trimethylamine methyltransferase 65_1321 methyl coenzyme M reductase, subunit β trimethylamine methyltransferase MttB 65_1324 methyl coenzyme M reductase, subunit γ trimethylamine corrinoid protein 65_1236 heterodisulfide reductase, subunit D trimethylamine permease 70_2379 tetrahydromethanopterin S-methyltransferase, subunit A trimethylamine methyltransferase MttB2 70_2378 tetrahydromethanopterin S-methyltransferase, subunit B trimethylamine:corrinoid methyltransferase MttB 70_2375 tetrahydromethanopterin S-methyltransferase, subunit E trimethylamine corrinoid protein 70_2380 
tetrahydromethanopterin S-methyltransferase, subunit F trimethylamine permease MttP 70_2381 tetrahydromethanopterin S-methyltransferase, subunit G dimethylamine corrinoid protein 70_2382 tetrahydromethanopterin S-methyltransferase, subunit H dimethylamine methyltransferase MtbB1 48_173 N5,N10-methylene-tetrahydromethanopterin reductase dimethylamine methyltransferase MtbB2 (F420-dependent) dimethylamine corrinoid protein 54_434 methylenetetrahydromethanopterin dehydrogenase dimethylamine methyltransferase 44_75 N5,N10-methenyltetrahydromethanopterin cyclohydrolase dimethylamine methyltransferase MtbB1 64_1122 formylmethanofuran:tetrahydromethanopterin dimethylamine corrinoid protein formyltransferase methyltransferase MtbB3 70_2741 formylmethanofuran dehydrogenase subunit A methylamine methyltransferase corrinoid protein 70_2744 formylmethanofuran dehydrogenase subunit B methylamine methyltransferase corrinoid protein 70_2742 formylmethanofuran dehydrogenase subunit C monomethylamine corrinoid protein 70_2743 formylmethanofuran dehydrogenase, subunit D monomethylamine methyltransferase 70_2739 formylmethanofuran dehydrogenase, subunit E monomethylamine methyltransferase 70_2740 formylmethanofuran dehydrogenase, subunit F monomethylamine corrinoid protein 52_360 coenzyme F420-reducing hydrogenase, β subunit monomethylamine methyltransferase MtmB 55_505 F420H2 dehydrogenase, subunit D methylcobalamin:CoM methyltransferase isozyme A 55_507 F420H2 dehydrogenase, subunit I methyl coenzyme M reductase R chain 48_174 F420H2 dehydrogenase, subunit F methyl coenzyme M reductase, protein D 55_503 F420H2 dehydrogenase subunit B methyl coenzyme M reductase, protein D 68_1645 F420H2 dehydrogenase subunit O energy production and conversion H(+)-transporting ATP synthase, subunit A 48_200 rubrerythrin H(+)-transporting ATP synthase, subunit B 68_1846 inorganic pyrophosphatase H(+)-transporting ATP synthase, subunit C 70_2404 sulfite reductase assimilatory type I H(+)-transporting 
ATP synthase, subunit E 55_460 ferredoxin H(+)-transporting ATP synthase, subunit F 48_205 ferredoxin-thioredoxin reductase H(+)-transporting ATP synthase, subunit H 58_682 Fe-S oxidoreductase H(+)-transporting ATP synthase, subunit I 70_2708 Fe-S oxidoreductase H(+)-transporting ATP synthase, subunit K 48_201 flavoprotein dihydrolipoamide dehydrogenase 48_204 flavoprotein A 2-oxoisovalerate ferredoxin oxidoreductase, R subunit 66_1415 indolepyruvate ferredoxin oxidoreductase R subunit Na+-transporting NADH:ubiquinone 51_296 iron-sulfur flavoprotein oxidoreductase subunit 1 68_1900 polyferredoxin rubrerythrin 68_1705 superoxide reductase

Contig_Gene numbers are from the 11Dec03 release of the M. burtonii genome sequence annotation.

4 in M. jannaschii, 100 in M. mazei and M. acetivorans, and 140 in S. solfataricus. The abundance of Tn enzymes in the cell would not be expected to be high. This was borne out by finding peptides for each Tn in only 1 LC-MS run. Moreover, the 2 Tns distribute to the left-hand side of the correspondence analysis

Journal of Proteome Research • Vol. 3, No. 6, 2004

plot (Figure 3c). This is a further indication that the LC-MS methods described are sensitive and capable of detecting low abundance proteins. To the best of our knowledge, this is the first functional genomics study reporting transposons expressed in archaea. Expression of the Tns indicates that transposition occurs during

Table 6. Proteins Involved in Metabolism contig_gene no.a

42_43 42_46 42_47 70_2327 70_2325 65_1229 49_221 69_2108 43_53 46_123 68_1772

58_647 62_1004 59_748 69_2158 68_1648 70_2213 68_1742 69_2069 40_31

70_2667 58_653 70_2486 67_1620 58_665 60_781 69_2111 62_931 69_2113 55_484 70_2364 67_1611 70_2668 54_436 70_2669 68_1867 68_1677 69_2149 70_2573 70_2624 62_978 68_1838 68_1883 62_940 66_1344 70_2692 48_187 68_1724 48_179 48_184 48_183 70_2551 70_2399 70_2449 61_855 66_1470 63_1037 61_884 67_1613 70_2655 48_217 a

functional category

contig_gene no.a

functional category

carbon fixation and carbohydrate metabolism carbon monoxide dehydrogenase β subunit 68_1817 phosphoenolpyruvate synthase carbon monoxide dehydrogenase ∆ subunit 63_1111 enolase carbon monoxide dehydrogenase γ subunit 46_116 D-arabino 3-hexulose 6-phosphate pyruvate synthase R subunit formaldehyde lyase pyruvate synthase γ subunit 59_729 phosphomannomutase pyruvate carboxylase subunit B 68_1681 pyruvate phosphate dikinase pyruvate decarboxylase/acetolactate synthase 70_2238 carboxymuconolactone decarboxylase fructose/tagatose bisphosphate aldolase 48_196 carboxymuconolactone dehydrogenase fructose-bisphosphate aldolase 65_1245 glycosyl transferase fructose bisphosphate aldolase 70_2266 glycosyltransferase glyceraldehyde 3-phosphate dehydrogenase (phosphorylating) nucleotide metabolism adenylate kinase 55_483 dihydroorotate dehydrogenase electronadenylosuccinate synthase transfer subunit anaerobic ribonucleoside-triphosphate reductase 43_55 dihydroorotase anaerobic ribonucleoside-triphosphate reductase 70_2234 pyrimidine-nucleoside phosphorylase CTP synthase 58_681 GMP synthase (glutamine-hydrolyzing) nucleoside diphosphate kinase 67_1514 inosine-5′-monophosphate dehydrogenase phosphoribosylamine-glycine ligase 51_323 inosine-5′-monophosphate dehydrogenase phosphoribosylaminoimidazolecarboxamide 67_1490 ribulose-bisphosphate carboxylase (form formyltransferase III RUBISCO) large chain 2 phosphoribosylaminoimidazole69_2037 phosphoribosylformylglycinamidine synthase succinacarboxyamide synthase amino acid metabolism leucylaminopeptidase 68_1743 ornithine carbamoyltransferase membrane alanine aminopeptidase 67_1524 ornithine decarboxylase aminopeptidase 70_2295 ornithine acetyltransferase activator of 2-hydroxyglutaryl-CoA 48_206 D-3-phosphoglycerate dehydrogenase dehydratase 65_1326 γ-glutamyl phosphate reductase acetylglutamate kinase 67_1560 glycine hydroxymethyltransferase histidinol-phosphate aminotransferase 68_1873 homoserine dehydrogenase glutamate dehydrogenase 
70_2490 ketol-acid reductoisomerase glutamate synthase (NADPH), 59_731 pyrroline-5-carboxylate reductase R subunit 69_2174 arginosuccinate synthase glutamine synthetase 70_2488 acetolactate synthase large subunit glutamate synthase 56_568 acetolactate synthase small subunit branched chain amino acid 57_591 aconitase aminotransferase 69_2172 carbamoly phosphate synthase 3-isopropylmalate dehydratase small subunit 3-isopropylmalate dehydratase 69_2173 carbamoyl-phosphate synthase 3-isopropylmalate dehydrogenase large subunit 3-isopropylmalate dehydrogenase 70_2313 5-oxoprolinase/hydantoinase isocitrate dehydrogenase (NADP) 68_1764 hydantoinase isopropylmalate synthase family protein 69_1959 hydantoinase serine-pyruvate aminotransferase 67_1597 histidine biosynthesis protein threonine synthase 55_482 S-adenosylmethionine synthetase asparagine synthase 62_962 dihydrodipicolinate synthase aspartate-semialdehyde dehydrogenase 62_963 dihydrodipicolinate reductase aspartate aminotransferase 54_452 phosphoserine phosphatase tryptophan synthase, β subunit 65_1319 phosphoserine phosphatase tryptophan synthase, subunit β 53_392 HesB protein coenzyme metabolism ∆-aminolevulinic acid dehydratase 66_1381 adenosylhomocysteinase cobyrinic acid a,c-diamide synthase 66_1345 glutamate-1-semialdehyde 2,1cobyric acid synthase aminomutase cobalamin biosynthesis protein 61_861 magnesium-chelatase subunit D precorrin-2 C20-methyltransferase 61_859 magnesium-chelatase subunit I precorrin-8X methylmutase 42_51 protoporphyrin IX magnesium chelatase precorrin-3B C17-methyltransferase 70_2237 nicotinate-nucleotide pyrophosphorylase riboflavin synthase subunit β (carboxylating) thiamine biosynthesis protein ThiC 68_1888 uroporphyrin-III C-methyltransferase pyridoxine biosynthesis protein 43_62 coenzyme F390 synthase metallo cofactor biosynthesis protein 48_189 phosphoribosylanthranilate isomerase metallo cofactor biosynthesis protein 69_2180 phosphoribosylaminoimidazole quinolinate synthase 
carboxylase molybdenum cofactor biosynthesis protein B unassigned GTP-binding protein 68_1702 integral membrane protein putative dehydrogenase 69_2057 GTP-binding protein archaeal kinase

Contig_Gene numbers are from the 11Dec03 release of the M. burtonii genome sequence annotation.
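The text describes several sets of expressed genes as clustered on the basis of neighbouring contig_gene numbers, for example the CODH genes 42_43, 42_46 and 42_47 (Table 6) and the ATP synthase genes 66_1471 to 66_1478 (Table 5). The following is a minimal Python sketch of that bookkeeping, not the procedure used in the study. It assumes that the number after the underscore reflects gene order along a contig; the function names and the `max_gap` threshold are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def parse_id(identifier):
    """Split a 'contig_gene' identifier such as '66_1471' into (66, 1471)."""
    contig, gene = identifier.split("_")
    return int(contig), int(gene)


def group_clusters(identifiers, max_gap=1):
    """Group identifiers into runs of neighbouring genes on the same contig.

    Assumption (not stated in the source): gene numbers increase along the
    contig, so a small numeric difference implies physical proximity.
    Returns a list of (contig, [gene numbers]) for runs of 2 or more genes.
    """
    by_contig = defaultdict(list)
    for identifier in identifiers:
        contig, gene = parse_id(identifier)
        by_contig[contig].append(gene)

    clusters = []
    for contig in sorted(by_contig):
        genes = sorted(set(by_contig[contig]))
        run = [genes[0]]
        for gene in genes[1:]:
            if gene - run[-1] <= max_gap:
                run.append(gene)          # still within the current run
            else:
                if len(run) > 1:
                    clusters.append((contig, run))
                run = [gene]              # start a new run
        if len(run) > 1:
            clusters.append((contig, run))
    return clusters


# ATP synthase subunits (Table 5) plus one unrelated identifier.
atpase = ["66_1477", "66_1478", "66_1475", "66_1474",
          "66_1476", "66_1471", "66_1472", "66_1473", "53_416"]
print(group_clusters(atpase))

# CODH subunits (Table 6); a wider gap tolerates intervening genes.
codh = ["42_43", "42_46", "42_47"]
print(group_clusters(codh, max_gap=3))
```

With `max_gap=1` only strictly consecutive gene numbers are grouped; raising the threshold allows for intervening genes whose products were not detected, at the cost of merging genes that may not be co-transcribed.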


Figure 4. Cartoon depicting cellular processes occurring in M. burtonii growing at 4 °C. Numbers of identified proteins involved in each process are shown in parentheses. Colors are matched to gene arrangement (Figure S1). DNA replication and processing (light brown), transcription (dark purple), translation (light purple), cell division (light orange), defense mechanisms (dark green), cell envelope (dark brown), PTM, protein turnover and chaperones (dark orange), motility and secretion (light green), signal transduction (red), transport (dark blue), methanogenesis (black), energy production and conversion (yellow), amino acid metabolism (dark pink), carbohydrate metabolism (dark gray), nucleotide metabolism (light blue), coenzyme metabolism (light pink), hypothetical proteins (light gray).

normal cell growth at 4 °C. A consequence of transposition will be gene disruption and the generation of isogenic mutants, which may be expressed in order to promote genetic diversity and provide a mechanism for competitive selection. The implications are that transposons in archaea are not only relics of previous evolutionary events, but they may be active and influencing the genetic makeup of their host cells. In Sulfolobus solfataricus, the presence of large numbers of IS elements has been linked to high rates of spontaneous mutation.25 In contrast, in Sulfolobus acidocaldarius, spontaneous mutation rates are low and genetic diversity is promoted through marker exchange and recombination that is mediated by cell-cell contact between isogenic strains.26 It will be valuable to experimentally examine rates of transposition in M. burtonii and other archaea and correlate this with genetic fitness. In this context, it is noteworthy that there were a number of independent indicators that mechanisms of competition and survival were active in the cell, including the identification of several restriction/modification enzymes, a β-lactamase and a catalase.

Active transposition also has several practical implications. For genome sequencing projects, high rates of transposition will produce genome heterogeneity and may affect the ability to assemble, close and interpret genome sequence data. The expressed Tns also provide real opportunities for developing tools for performing mutagenesis in M. burtonii. Presently, liposome-mediated transformation procedures developed for Methanosarcinaceae27 are being investigated in M. burtonii in an effort to develop a tractable gene transfer system (Sowers and Cavicchioli, unpublished results).

Transcription, Signal Transduction and Motility.
All RNAP subunits, with the exception of subunit E (above), 1 of the 2 transcription initiation factors, TATA-binding protein (TBP), transcription factor S (TFS), and multiprotein bridging factor 1 (MBF1) were expressed (Table 3). The antiterminator protein, NusG, was also expressed, indicating that bacterial-like antitermination processes occur in archaea. While NusA and NusG homologues have been reported in genome sequences, functional studies have not been reported.23 The DEAD-box RNA helicase was identified in the expressed-proteome. Consistent with this, mRNA levels for this gene were previously found to be abundant during growth at 4 °C but not at 23 °C.28 Expression of this gene was shown to involve a long 5′-UTR that contained a cold box sequence present in cold shock protein and RNA helicase genes induced by cold shock in E. coli and Anabaena.

In the expressed-proteome, 7 bacterial-like transcriptional regulators were identified. In addition, numerous members of two-component regulatory systems (2CRSs) were identified, including sensor-kinases, response-regulators and hybrid sensor-regulator proteins (Figure 5b). One of the response regulators (70_2500) was previously shown to be cold induced.2 The participation of eucaryotic-like, core RNAP machinery, a bacterial-like antiterminator, and strong participation of bacterial-like transcriptional regulatory systems not only illustrates a confounded evolutionary history of the transcription apparatus, but highlights the complexity of the interactions which must be coordinated for effective cell growth.

One of the 2CRSs appears to be linked to chemotaxis. A methyl-accepting chemotaxis protein, CheW and CheY were identified in the expressed-proteome, and genes for CheA, CheC, CheD, and CheR are arranged in the same gene cluster (Figure 5c). 4 flagellin proteins were also expressed. M. burtonii is motile and possesses a single flagellum.13 The proteome data indicate that flagella are being synthesized at 4 °C, and are likely to be regulated by a bacterial-like, Che, chemotaxis system. The Che system is absent from the genomes of many archaea, including M. jannaschii. Despite this, M. jannaschii responds to hydrogen partial pressure, growth phase and ammonium concentration, and novel systems of chemotaxis have been proposed.29,30 Other members of the Methanosarcinaceae, M. mazei and M. acetivorans, contain the same gene cluster as M.
burtonii, but have been reported not to be motile.31 To improve our understanding of chemotaxis in the Methanosarcinaceae, it will be useful to carefully examine the biotic and abiotic factors which control motility and expression of flagellar and Che genes in M. burtonii and M. acetivorans. This will clarify why similar genetic systems in phylogenetically related organisms are regulated differently.

Figure 5. Organization of genes encoding proteins in the M. burtonii expressed-proteome. Expressed proteins (/) and gene numbers are shown. (A) Transposases. (B) Two-component regulatory systems. Sensor (Pfam PF01339, PF01739, PF00785, PF02743), open boxes; histidine-kinase (Pfam PF00512, PF02518), speckled; response-regulator (Pfam PF00072), backward hatch; predicted TMD, narrow vertical line. (C) Chemotaxis and flagellin genes involved in motility. (D) Exosome and proteasome components. Gene organization and abbreviations used for the superoperon and related genes have been described.40 ACR, ancient conserved region; ArCR, archaeal conserved region; MTR, methyltransferase; PCS, proteasome catalytic subunit; PRS, proteasome regulatory subunit; exoPPH, exopolyphosphatase; mbl, metallo-β lactamase; Rpp, RNase P subunit; Rrp, ribosomal RNA-processing protein; RPC, RNA polymerase subunit; IMP4, component of eukaryotic U3 small ribonucleoprotein complex; Pfd, prefoldin; MBF, multiprotein bridging factor; L, large ribosomal subunit.

Translation and Protein Folding. A significant proportion of the proteins in the expressed-proteome are involved in protein synthesis (Table 4). Proteins included 5 initiation factors, 2 elongation factors, 7 tRNA synthetases, 23 of the 27 predicted proteins that form the large ribosomal subunit, 19 of the 20 proteins that constitute the small ribosomal subunit and 3 ribosomal modification proteins. M. burtonii expressed 3 group II chaperonins which comprise the thermosome, the α- and β-prefoldins, the 3 chaperones DnaK, DnaJ, and GrpE, a Clp protease, a CDC48 homolog, the α-NAC subunit, thioredoxin, glutaredoxin, and the FKBP-type and cyclophilin-type peptidyl-prolyl cis-trans isomerases (PPIases) (Table 4). The mesophilic methanogen M. mazei encodes and expresses group I (GroEL, GroES) and group II (3 copies of Cpn60) chaperonins.32,33 In contrast, all other archaea examined encode 1-3 copies of cpn60.32 The advanced stage of sequencing of the M. burtonii genome (∼12× coverage with 37 major contigs) indicates that group I chaperonins are not likely to be encoded (although it will be valuable to close the genome in order to be certain). This indicates that the trend toward dual systems in mesophilic methanogens is not correlated with growth temperature, but may be a reflection of lateral gene transfer between bacteria and archaea.34 E. coli GroEL and GroES may be critical determinants for growth of the organism at low temperature.35 M. burtonii is a cold adapted archaeon, and while it does not appear to encode group I chaperonins, it does express a broad range of proteins involved in protein folding and refolding (above). It was recently found that the mRNA and protein levels of the FKBP-type PPIase were more abundant during growth at 4 °C compared with 23 °C,2 and we have now found that the cyclophilin-type of PPIase is also expressed. Collectively, the findings support the view that protein folding is of general importance for cold adaptation, and highlight the point that the rate-limiting steps and precise mechanisms are likely to vary between organisms, and therefore need to be examined carefully.

RNA and Protein Processing. An archaeal-specific archaeosine tRNA ribosyltransferase was identified in the expressed-proteome (Table 3). This is consistent with previous findings that specific tRNA modification occurs in M. burtonii, including the incorporation of high levels of dihydrouridine and lower levels of 16 other modified nucleosides.3 In eucaryotes, small nuclear ribonucleoproteins (snRNPs) are involved in diverse RNA processing events.36 The protein component of snRNPs consists of heptameric ring structures composed of Sm and Sm-like (Lsm) proteins. The Sm/Lsm proteins have a common Sm motif, and structural homologues are found in all domains of life.37 Despite the availability of structural information for an archaeal Lsm, little is known about the biological function of Lsm proteins.36 In a number of hyperthermophilic euryarchaeota and crenarchaeota, the Lsmα gene is immediately upstream of the ribosomal protein gene, L37e, and it has been speculated that the archaeal Lsm complex may fulfill a role in ribosomal function or biogenesis.36 The Lsmα-L37e genes are arranged the same way in M. burtonii (Figure S1), and Lsmα was detected in the expressed-proteome (Table 3). As the Lsmα-L37e gene organization is conserved from psychrophiles to hyperthermophiles, and in members of euryarchaeota and crenarchaeota, it would indicate that Lsm proteins have a fundamental cellular role in archaea. Lsmγ was also expressed in M. burtonii, raising the possibility that homo- and hetero-oligomeric complexes are formed in M. burtonii during growth at 4 °C.

Protein turnover in eucaryotes involves a multi-enzyme complex, the proteasome, a form of which has also been shown to be present in archaea.38 The α and β proteasome subunits were expressed in M. burtonii, along with a number of peptidases (Table 4). Bacterial and eucaryotic mRNA degradation involves multi-enzyme complexes, the degradosome and the exosome, respectively.39 However, in archaea this has not been experimentally demonstrated. Koonin et al.40 have reported the presence of a conserved superoperon in archaea that encodes proteasome and exosome components, in addition to ribosomal and RNAP subunits, suggesting that protein and RNA synthesis and degradation may be coupled. A superoperon is present in M. burtonii (Figure 5d). 8 proteins from the superoperon and 5 proteins from related operons were expressed, including proteasome and exosome subunits (Tables 3 and 4). The synthesis of these proteins supports the proposed functional interaction of transcription and translation pathways in archaea, as well as their co-evolution.40 It will be valuable to purify native proteasome/exosome complexes from M. burtonii and examine their composition. This will be particularly valuable for determining if the cold induced RNA helicase,28 deaD (69_2063), which is not part of the superoperon, assembles in an exosome/degradosome during growth at 4 °C or functions independently. In E.
coli, the degradosome RNA helicase, RhlB, can be replaced by the cold-shock induced RNA helicase, CsdA.41 It is therefore important to clarify the role that RNA helicases and the multi-enzyme complexes play in the cold adaptation biology of both archaea and bacteria. Despite the abundance and biological importance of cold shock proteins (Csps) in bacteria, and their Y-box homologues in eucaryotes, they are not present in most archaea.1,42 Previous comparative genomics analyses revealed that while M. burtonii does not encode Csp homologues, 2 hypothetical proteins (Scaffold5.gene2863 and Scaffold3.gene1629) were predicted to contain cold shock domain (CSD) folds with best threading matches to the S1 RNA-binding domain of polynucleotide phosphorylase.1 These are genes 558 and 865, present and expressed in the superoperon system (Figure 5d). Their coexpression with exosome genes suggests they may bind RNA and play a role in RNA processing.

Transport, Methanogenesis, Transport and Energy Production. M. burtonii is a methylotrophic methanogen, capable of methanogenesis using methylamines and methanol.13 In this study, cultures were grown in a complex medium with trimethylamine (TMA) as the sole carbon source. Consistent with this, the expressed-proteome contained 2 TMA permeases (Table 5). At least 19 other transport proteins were also expressed, including ABC transporters, a TolB transporter, cation transporters, and transporters for metals (iron, zinc, copper, cobalt, mercury and molybdenum). In many cases, the transport proteins were arranged in gene clusters with other transport-related proteins (Figure S1). A signal recognition particle protein (SRP) involved in transport to, and across, the plasma membrane was also identified. However, despite the


presence of genes involved in Sec pathways in the M. burtonii genome, no Sec proteins were identified.

Methanogenesis proteins were highly abundant in LC-MS analysis, and proteins involved in every step of TMA-methanogenesis, including those involved in the reduction of methyl-S-CoM to methane and the oxidation of methyl-S-CoM to CO2, were identified. Methyltransferases and corrinoid proteins are substrate specific and mediate the transfer of the methyl group from the substrate to Coenzyme M. Multiple trimethylamine, dimethylamine and monomethylamine methyltransferases and at least 2 of their respective corrinoid proteins were identified. In addition, a methanol specific methyltransferase was expressed. A copy of the substrate-specific methyltransferase was often found in a gene cluster with a gene for its cognate corrinoid protein (Figure S1). However, a monomethylamine corrinoid protein (68_1755) was also found in a gene cluster containing genes specific to methanol utilization. The expression of multiple proteins that are not linked to the metabolism of the growth substrate would appear to be a wasteful process for the cell. However, it would enable M. burtonii to rapidly utilize an alternative substrate when it becomes available, or when the carbon flux changes sufficiently in the environment. This strategy appears to be a feature of methylamine/methanol utilizing methanogens,43 and coexpression of these genes may be facilitated by the regulation of operons.

The generation of protons by methanogenesis and the electron transport chain drives ATP synthesis in methanogens.44 Similar to other archaea,44 M. burtonii appears to encode only one A1A0 ATPase. The genes are arranged in a cluster (Figure S1), and 8 of the 9 proteins were identified in the expressed-proteome, suggesting that the A1A0 ATPase provides the primary means of generating ATP in M. burtonii during growth at 4 °C.

Metabolism.
The CO2 produced from the oxidation of methyl-S-CoM during methanogenesis is converted to the central biosynthetic molecule, acetyl-CoA, by carbon monoxide dehydrogenase (CODH). The β, δ, and γ subunits of CODH were identified in the expressed-proteome, and the genes are arranged in a cluster on the genome (Figure S1, Table 6). CO2 fixation could also theoretically occur via the Calvin cycle. A number of archaea, however, do not appear to encode a gene for the important enzyme ribulose bisphosphate carboxylase,45 and a complete Calvin cycle has not been demonstrated in archaea. In M. burtonii, a form III ribulose bisphosphate carboxylase45 and 2 fructose bisphosphate aldolase proteins were identified in the expressed-proteome, providing a clear indication that this archaeon is able to fix carbon using enzymes in the Calvin cycle.

The major biosynthetic pathways in the cell are derived from pyruvate. Enzymes involved in the important conversion of acetyl-CoA to pyruvate that were identified in the expressed-proteome included pyruvate synthase, pyruvate carboxylase, and pyruvate decarboxylase. Several enzymes involved in a non-oxidative pentose phosphate pathway (67_1490, 69_2108 and 43_53) were identified. Moreover, a number of gluconeogenesis enzymes (phosphoenolpyruvate synthetase, enolase, glyceraldehyde 3-phosphate dehydrogenase, and aldolases) were identified, indicating that this pathway is used for the conversion of pyruvate to carbohydrates for use as cell wall and storage material. Glycogen is one of the storage polymers synthesized by methanogens, being utilized when substrates become limiting.46,47 In addition to enzymes involved in methanol utilization (above), 2 glycosyltransferases were expressed, indicating that alternative substrate utilization pathways appear to be functioning even under conditions of nutrient-excess in complex growth medium.

Methanogens do not appear to have the genetic capacity to perform a complete oxidative or reductive TCA cycle. As a result, key biosynthetic intermediates are thought to be synthesized through an incomplete TCA cycle. The M. burtonii expressed-proteome contained a number of enzymes that constitute an incomplete oxidative TCA cycle including pyruvate carboxylase, aconitase, and isocitrate dehydrogenase. These findings demonstrate that methylotrophic methanogens, and not just acetogenic methanogens as was previously thought,48 utilize an incomplete oxidative TCA cycle. This is likely to be important for feeding the large number of amino acid biosynthetic pathways which appear to be operating in M. burtonii during growth at 4 °C (Table 6).

Conclusion

High-throughput, high-sensitivity LC-MS methods supported by the draft genome of M. burtonii have facilitated the identification of a large number of proteins from the expressed-proteome. Knowledge of the expressed proteins has advanced the level of understanding of the biology of the cell from coding potential to actual process. As a consequence, logical programs of future research could be identified, many of which would not have been obvious from purely genomic analyses. Conversely, comparative genomics underpinned a focus on coupled transcription-translation processing,40 which may otherwise have not been considered. This study clearly illustrates the synergy which can be achieved in biology from an integrated genomic-proteomic approach.

The important contribution that cold-adapted organisms make to the Earth’s biosphere has become increasingly apparent. This has resulted from an emphasis placed on the ecology of psychrophiles, their mechanisms of adaptation, their potential for biotechnological exploitation and their implied relevance in astrobiology.49-54 In addition to M. burtonii, genome sequence data is, or will soon be, available for cold-adapted archaea (Cenarchaeum symbiosum and Methanogenium frigidum) and cold-adapted bacteria (Bacillus cereus, Colwellia psychrerythraea, Desulfotalea psychrophila, Photobacterium profundum, Pseudoalteromonas haloplanktis, Psychrobacter sp., and Shewanella frigidimarina). The capacity to successfully process proteins from M. burtonii demonstrates the in-principle means of performing proteomic studies on these organisms, thereby providing the enticing prospect of gaining a truly broad view of microbial cold adaptation biology.

Acknowledgment. Thanks to Frank Larimer, Miriam Land and Paul Richardson for ensuring availability and up-to-date annotation of M. burtonii sequence data, and to Paul Curmi, Jason Kahn, and Haluk Ertan for helpful discussions. The work was supported by the Australian Research Council. Mass spectrometric analysis for the work was carried out at the Bioanalytical Mass Spectrometry Facility, UNSW, and was supported in part by grants from the Australian Government Systemic Infrastructure Initiative and Major National Research Facilities Program (UNSW node of the Australian Proteome Analysis Facility) and by the UNSW Capital Grants Scheme.

Supporting Information Available: Supporting Figure S1: Organization of genes encoding all proteins in the M. burtonii expressed-proteome. This material is available free of charge via the Internet at http://pubs.acs.org.

References

Journal of Proteome Research • Vol. 3, No. 6, 2004 1175

PR0498988