Proteomic and Computational Analysis of Secreted Proteins with Type I Signal Peptides from the Antarctic Archaeon Methanococcoides burtonii

Neil F. W. Saunders,† Charmaine Ng,† Mark Raftery,‡ Michael Guilhaus,‡ Amber Goodchild,§ and Ricardo Cavicchioli*,†

School of Biotechnology and Biomolecular Sciences, The University of New South Wales, Sydney, NSW 2052, Australia, Bioanalytical Mass Spectrometry Facility, The University of New South Wales, Sydney, NSW 2052, Australia, and Johnson and Johnson Research, Australian Technology Park, Eveleigh, NSW 1430, Australia

Received May 8, 2006

LC-MS/MS was used to identify secreted proteins in the Antarctic archaeon Methanococcoides burtonii. Seven proteins possessing a classical class 1 signal peptide were identified in the supernatant from cultures grown at 4 and 23 °C. The proteins included a putative S-layer cell surface protein, cell surface protein involved with cell adhesion, and trypsin-like serine protease. Protease activity was detected in the secreted fraction, and the signal peptide cleavage site of the protease was confirmed using Edman sequencing. The expression profile of putative cell surface proteins suggests a requirement for cell interactions during growth at low temperature. Sequences of the secreted proteins were used to compile a dataset containing a further 32 predicted secreted proteins from the Methanosarcinaceae. Many of these proteins were also S-layer cell surface proteins with a variety of predicted roles, particularly in cell-cell interaction. Computational analysis of signal peptides revealed a preference for lysine in the n-region, leucine in the h-region, and a eucaryal-type cleavage site, highlighting the mosaic nature of signal peptides in Archaea. This is the first study to experimentally characterize secreted proteins from a cold-adapted archaeon and provides new insight and a functional dataset for studying secretion in Archaea. Keywords: LC-MS/MS • secretion • supernatant fraction • type 1 signal peptides • archaea • cold adaptation • psychrophile • protease • S-layer protein • SignalP

* To whom correspondence should be addressed: Ricardo Cavicchioli, School of Biotechnology and Biomolecular Sciences, The University of New South Wales, Sydney, NSW 2052, Australia. Tel.: +61-2-93853516. Fax: +61-2-93852742. E-mail: [email protected].
† School of Biotechnology and Biomolecular Sciences, The University of New South Wales.
‡ Bioanalytical Mass Spectrometry Facility, The University of New South Wales.
§ Johnson and Johnson Research.

10.1021/pr060220x CCC: $33.50 © 2006 American Chemical Society
Journal of Proteome Research 2006, 5, 2457-2464
Published on Web 08/02/2006

Introduction

Secretion of proteins across cellular membranes is a fundamental biological process existing in all domains of life. The best studied and most conserved secretion pathway is the classical Sec pathway.1 Proteins destined for Sec export are synthesized with a signal peptide that contains a positively charged n-region, a hydrophobic h-region, and a cleavage site (c-region). The signal peptide directs the preprotein to the cytoplasmic membrane where it is processed by a signal peptidase. Proteins with a class 1 signal peptide are cleaved and either released on the external side of the membrane or anchored via a C-terminal membrane anchor. Proteins with a class 2 signal peptide are cleaved at a consensus site named the lipo box and attached to the membrane via a lipid-modified N-terminal cysteine residue. Distinct from these two classes are class 3 signal peptides, which are cleaved prior to the h-region. In class 3, the hydrophobic residues are retained in the mature protein and are essential for membrane anchoring and subunit interaction in Type IV pilus-like structures.

Compared with Eucarya and Bacteria, far less is known about the details of the Sec pathway in Archaea. Archaea contain the universally conserved Sec pore (named Sec61 in Archaea and Eucarya, and SecYEG in Bacteria), the crystal structure of which has been determined in the archaeon Methanocaldococcus jannaschii.2 Comparative genomics has revealed that many archaea possess homologues of the bacterial pore-associated subunits YidC3 and SecDF,4 components of the eucaryal oligosaccharyltransferase subunit,5 the signal recognition particle SRP that binds to the signal peptide of cotranslationally translocated proteins, and the membrane-bound SRP receptor.6 In many ways, the archaeal Sec complex appears as a hybrid of eucaryal and bacterial components. Archaeal class 1 and 3 (but not class 2) signal peptidases have been identified, and in some cases activity has been demonstrated using in vitro assays.7,8 However, little experimental data are available concerning the cleavage site in archaeal class 1 signal peptides. Computational methods have been developed to define signal peptides in eucarya and Gram-positive and Gram-negative bacteria,9 although it is unclear which models are most applicable to archaea. Most computational studies to date have attempted to define archaeal signal peptides using a consensus of the results obtained by applying the three existing models to protein sequences from archaeal genomes.10,11 These results have indicated that archaeal class 1 signal peptides may contain a bacterial-like n-region, an h-region of unique composition, and a eucaryal-like cleavage site.

Methanococcoides burtonii is an Antarctic methylotrophic methanogen in the family Methanosarcinaceae (Topt 23 °C, Tmin < 4 °C) that was isolated from a perennially cold methane-saturated Antarctic lake.12 It is the best-characterized psychrophilic archaeon,13 and extensive genomic and proteomic data have been generated.14-19 The Sec components are present in the M. burtonii draft genome (November, 2004 release), with the exception of the YidC and the SRP receptor components which have not been identified, indicating that Sec secretion involving cleavage of class 1 signal peptides should occur.

Methanosarcina spp. that are phylogenetically related to M. burtonii (e.g., M. acetivorans and M. thermophila) can grow as single cells and as aggregates.20 Methanosarcina spp. aggregates are formed by a heteropolysaccharide sheath that forms over the S-layer of individual cells, grouping cells together into packets. Clumps of M. burtonii cells have also been observed during growth studies,12 although they do not appear to form packets in the same way as Methanosarcina spp. It is unclear what proteins may be involved in clumping and aggregation in the Methanosarcinaceae, or how they would be secreted.

Proteomics has not previously been reported for examining secretion in archaea.21 An aim of this study was to gain a better understanding of low-temperature cellular physiology of M. burtonii by determining which proteins were secreted, and whether the composition of the secreted proteins was modulated by growth at 4 versus 23 °C, temperatures that have previously been used for comparative proteomics to examine cold adaptation.15,16 To identify secreted proteins, LC-MS/MS was used. After establishing a functional dataset of secreted proteins, an important aim was to analyze the structure of signal peptides in M. burtonii. The functional dataset was also used to predict secreted proteins with signal peptides in related archaea, and this proved successful for a set of proteins from the Methanosarcinaceae.

Journal of Proteome Research • Vol. 5, No. 9, 2006

Experimental Procedures

Organisms and Culture Conditions. Methanococcoides burtonii was grown at 4 °C (∼80 h doubling time) and 23 °C (∼20 h doubling time) in liquid modified methanogen medium (MFM) with trimethylamine under anaerobic conditions in a gas phase of 80:20 N2/CO2 as previously described.12,16

Sample Preparation and Analysis. Cells were harvested in late logarithmic phase (absorbance 0.2-0.25 at 600 nm) by centrifugation at 4470g for 25 min at 4 °C, and the supernatant fraction was reserved. The supernatant fraction was filtered once through a Millex 0.22 µm filter (Millipore). In addition to the supernatant fraction (secreted proteins), the whole cell fraction was prepared from the cell pellet as described previously.16 The supernatant and whole cell fractions were each concentrated by centrifugation through Centricon Plus-20 filtration devices (Millipore, 2 mL capacity). Protein concentration was determined using the Bradford assay.22 Proteins were visualized on 12% SDS-PAGE gels23 using Coomassie24 or silver stain.25 The integrity of the cell pellet was determined by measuring the activity ratio of the cytoplasmic enzyme glutamate dehydrogenase (GDH) in the soluble and supernatant fractions.16,26

Tryptic Digestion, LC-MS/MS, and Protein Identification. Distinct protein bands visualized on the SDS-PAGE gels were excised. In-gel digestion using trypsin was performed as previously described.16 Total protein in the supernatant fraction was digested by incubating 100 µL of sample, 25 µL of 10 mM NH4HCO3, and 1 µg of trypsin at 37 °C for 14 h. The digested peptides were separated by nano-LC using an Ultimate HPLC and Famos autosampler system (LC-Packings). A volume of 5 µL of digested peptide sample was loaded onto a C18 precolumn (500 µm × 2 mm, Micron) with H2O/CH3CN (98:2, 0.1% (v/v) formic acid) at 20 µL min⁻¹. After a 4 min wash, the precolumn was switched on-line with the analytical column containing C18 RP silica (75 µm × 15 cm, PEPMAP). The peptides were eluted using H2O/CH3CN (40:60, 0.1% formic acid) at 0.2 µL min⁻¹ over 30 min. The nanospray needle was positioned ∼1 cm from the orifice of the API QStar Pulsar I tandem MS (ABI), which was operated in an information-dependent acquisition mode. A time-of-flight (TOF) MS survey scan was acquired (m/z 350-1700, 0.75 s). The two largest multiply charged ions (counts >15) were sequentially selected by Q1 for MS/MS analysis. Tandem mass spectra were accumulated for 2.5 s (m/z 50-2000). Identification of peptides using the M. burtonii protein sequence database was performed using Mascot as previously described.14 Mascot results were subjected to manual verification, and identities were accepted if they met the criteria of a cutoff score of 18 or more, standard errors were generally […] >40% identity to a query sequence. SignalP analysis as described above and manual inspection were used to confirm that the hit sequences contained an N-terminal signal peptide.
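The >40% identity screen described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: it assumes the BLAST results were exported in the standard 12-column tabular format, and the function name `hits_above_identity` and the example rows are hypothetical.

```python
import csv
import io

# Field order of the standard 12-column BLAST tabular output.
BLAST_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]

def hits_above_identity(handle, min_identity=40.0):
    """Return {query_id: [subject_id, ...]} for hits with pident > min_identity."""
    kept = {}
    reader = csv.DictReader(handle, fieldnames=BLAST_COLUMNS, delimiter="\t")
    for row in reader:
        if float(row["pident"]) > min_identity:
            subjects = kept.setdefault(row["qseqid"], [])
            if row["sseqid"] not in subjects:
                subjects.append(row["sseqid"])
    return kept

# Hypothetical example rows (illustrative only, not data from the study).
example = io.StringIO(
    "query_A\thit_1\t55.2\t300\t120\t4\t1\t300\t5\t310\t1e-80\t250\n"
    "query_A\thit_2\t31.0\t280\t180\t6\t10\t290\t3\t285\t1e-20\t90\n"
    "query_B\thit_3\t40.0\t200\t115\t2\t1\t200\t1\t200\t1e-30\t120\n"
)
candidates = hits_above_identity(example)
print(candidates)
```

Hits that pass this filter would still need the SignalP check and manual inspection described in the text before being counted as predicted secreted proteins.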

Results

Identification of Abundant Proteins from the Supernatant. The ratio of GDH activity in the supernatant versus the cell pellet fraction was 1:11 and 1:12 at 4 and 23 °C, respectively, indicating minimal cell lysis and good sample preparation. Abundant proteins from the supernatant of cells grown at 4 and 23 °C were visualized using Coomassie-stained SDS-PAGE gels. Bands of molecular mass 34, 32, and 22 kDa were observed in samples from both growth temperatures (Figure 1). Each of these bands was excised, and the proteins were identified using LC-MS/MS. The 34, 32, and 22 kDa bands all corresponded to one protein, ZP_00563902.1, with a predicted molecular weight for the mature protein of 39 kDa (Table 1). ZP_00563902.1 was annotated as a protease (see below). From LC-MS/MS analysis, two peptides corresponding to different regions of the protein sequence of ZP_00563902.1 were identified per band (Figure 2). The 34 and 32 kDa bands had peptide sequences which were in the C-terminal half of the protein, while the 22 kDa band contained peptides from the N-terminal half of the protein.

Figure 1. Coomassie-stained SDS-PAGE gels of supernatant fraction from cultures of M. burtonii grown at 4 and 23 °C. Protein ZP_00563902.1 (bands 1, 2, and 3 correspond to molecular weights of 34, 32, and 22 kDa, respectively) was present at both temperatures. Lane 1, size standards; lane 2, 4 °C supernatant fraction; lane 3, 4 °C whole cell fraction; lane 4, 23 °C supernatant fraction; lane 5, 23 °C whole cell fraction; lane 6, BSA standard. The gel shown has been edited from the original to show only relevant lanes.

Figure 2. Predicted protein sequence of protease ZP_00563902.1. Residues in bold show the signal peptide sequence. The closed and dashed boxes show the N-terminal sequences determined by Edman sequencing for the 22 and 34 kDa protein bands, respectively. LC-MS/MS peptides for the 22 and 34 kDa protein bands are shown by full and dashed lines below peptide residues. The LC-MS/MS peptides for the 32 kDa protein band are shown by italicized residues.
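As a rough reading of the GDH ratios reported in this subsection, the share of total GDH activity found in the supernatant can be worked out directly. The sketch below assumes that "1:11" means one part supernatant activity to eleven parts cell pellet activity; the function name `supernatant_share` is hypothetical.

```python
def supernatant_share(supernatant_part, pellet_part):
    """Fraction of total measured activity found in the supernatant."""
    return supernatant_part / (supernatant_part + pellet_part)

# Ratios of 1:11 (4 °C) and 1:12 (23 °C), as reported in the text.
for temperature_c, pellet_part in ((4, 11), (23, 12)):
    share = supernatant_share(1, pellet_part)
    print(f"{temperature_c} °C: {share:.1%} of GDH activity in the supernatant")
```

Under that assumption, roughly 8% of the cytoplasmic marker activity was found outside the cells at either temperature.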

Identification of Total Proteins in the Supernatant Fraction. Total protein from the supernatant fraction of 4 and 23 °C cultures was identified using LC-MS/MS. To maximize proteome coverage, the supernatant fractions from both growth temperatures were each analyzed five times by LC-MS/MS. A total of 7 protein identifications were made for proteins with predicted signal peptides (see Functional Annotation of Secreted Proteins below) using samples from the two culture temperatures. Of these, 3 proteins were only present at 4 °C, 2 only at 23 °C, and 2 proteins were present at both temperatures (Table 1). Because of the use of strict Mascot score criteria (see Experimental Procedures), low scoring peptides were not included, and it is possible that very low-abundance proteins may have been present but not identified.

Protein ZP_00563902.1 was identified in all five LC-MS/MS runs of the supernatant fractions from both growth temperatures (Table 1) and was a prominent band on Coomassie-stained SDS-PAGE gels (Figure 1, described above), indicating it was one of the most abundant secreted proteins. Protein ZP_00561707.1 was identified in three of the five LC-MS/MS runs at 4 °C, and ZP_00563397.1 in four runs at 23 °C. Their consistent presence at 4 or 23 °C indicates they may play a specific physiological role at those temperatures.

Table 1. Identification and Annotation of Secreted Proteins from the Supernatant Fraction of Methanococcoides burtonii Grown at 4 and 23 °C

protein (a)             temperature (b)   LC-MS/MS runs (c)   motifs                                                      annotation
(1) ZP_00562202.1       4                 1                   DUF1608/S-layer related; Trimeric LpxA-like                 Cell surface, S-layer protein
(2) ZP_00561707.1       4                 3                   Bacterial Ig-like group 1; Invasin/intimin cell adhesion    Cell surface protein
(3) ZP_00563902.1 (d)   4, 23             5 (e)               Trypsin-like serine/cysteine protease                       Peptidase
(4) ZP_00562452.1       4                 1                   Unknown function DUF1628                                    Conserved archaeal protein
(5) ZP_00563397.1       23                4                   C-type lectin                                               Hypothetical
(6) ZP_00562343.1       23                1                   -                                                           Hypothetical
(7) ZP_00562351.1       4, 23             1 (e)               -                                                           Hypothetical

(a) GenPept accession number. (b) Growth temperature(s) (°C) at which the protein was detected. (c) Number of times the protein was identified from a total of five LC-MS/MS runs per growth temperature. (d) Also identified from bands on SDS-PAGE. (e) The protein was identified from the indicated number of LC-MS/MS runs for each growth temperature.

Functional Annotation of Secreted Proteins. Proteins identified using LC-MS/MS were annotated by searching BLAST and InterPro databases. SignalP and TMHMM were used to identify signal peptides and transmembrane helices. The N-terminus of 7 proteins contained features characteristic of a signal peptide (Figure 3). Positively charged n-regions ranged in length from 3 to 9 amino acid residues, followed by a hydrophobic h-region and a cleavage site at which the -1 residue was exclusively Ala. In all but one protein, complete agreement for the position of the cleavage site was predicted using all three models (eucaryal, Gram-negative, and Gram-positive) of both SignalP-NN and SignalP-HMM. For protein ZP_00562452.1, low cleavage site scores were predicted by all methods. However, position 29 was identified as the most likely cleavage site, as it was predicted using SignalP-NN and each of the eucaryal, Gram-negative, and Gram-positive models.

Reliable functional annotations were obtained for three of the secreted proteins (Table 1). Two proteins were annotated as cell surface proteins, based on the presence of DUF1608/S-layer related duplication domain (ZP_00562202.1) or bacterial Ig-like and invasin/intimin cell adhesion domains (ZP_00561707.1). Protein ZP_00562202.1 also contained a predicted C-terminal transmembrane helix. Protein ZP_00563397.1 contained a weak C-type lectin signature predicted from a profile scan of the Prosite database. Protein ZP_00563902.1 contained a characteristic signature of serine/cysteine proteases. ZP_00562452 was a conserved archaeal hypothetical protein containing the DUF1628 domain, and the remaining two proteins, ZP_00562343.1 and ZP_00562351.1, were hypothetical proteins with no recognizable motifs or domains.

Figure 3. Predicted signal peptides in proteins from Methanosarcinaceae. Sequences are aligned at the predicted cleavage site (-); the n-region is indicated in bold, and the h-region is underlined. Sequences are ordered for comparison with Table 1 (M. burtonii) and Table 2 (other Methanosarcinaceae).

Protease Activity and Signal Peptide Cleavage Site of Protein ZP_00563902.1. Protein ZP_00563902.1 was predicted to be a protease, based on the presence of InterPro signature IPR009003, found in serine and cysteine proteases. The supernatant fraction from cells cultured at 23 °C exhibited 0.041 and 0.093 U of activity with azoprotein at 4 and 23 °C, respectively, and the Savinase control had an activity of 0.181 U at 23 °C. As the Savinase used was a purified enzyme (compared to the supernatant, which represented a mixture of secreted proteins), this indicates that the specific activity of the protease from M. burtonii was relatively high.

The 34 and 22 kDa bands identified as protein ZP_00563902.1 on SDS-PAGE (Figure 1) were excised, and Edman sequencing was performed to determine their N-terminal sequence. The N-terminal sequence of the 22 kDa fragment was ESSD(S), consistent with the predicted cleavage site (Figures 2 and 3). The N-terminal sequence of the 34 kDa band was SRTTR, which mapped to the central region of the protein sequence (Figure 2). All three bands that corresponded to ZP_00563902.1 appear to be cleavage products of the mature protein and may have arisen from autoproteolysis.

Annotation of Predicted Secreted Proteins from Other Methanosarcinaceae. Sequences of the seven secreted proteins
from M. burtonii were used to search a BLAST database of protein sequences from all current complete archaeal genomes. Proteins with >40% identity were screened for predicted signal peptides using SignalP as detailed in Experimental Procedures. This procedure generated a dataset of 32 predicted secreted proteins from M. acetivorans, Methanosarcina barkeri, and Methanosarcina mazei. These proteins were annotated using iprscan (Table 2). The majority of the proteins were identified as cell surface proteins containing the S-layer related duplication/DUF1608 domain. Several of these proteins contained additional domains or motifs. Proteins NP_615910.1, NP_633388.1, and YP_306611.1 contained one or more PKD domains, an extracellular domain involved with adhesive protein-protein or protein-carbohydrate interaction. In addition, protein NP_615910.1 contained a disaggregatase-related repeat. Protein YP_305337.1 contained a pectin lyase-like fold, and protein YP_305530.1 contained two domains characteristic of a metallocarboxypeptidase. Proteins NP_633535.1 and YP_304930.1 from M. mazei and M. barkeri, respectively, were similar to the conserved archaeal hypothetical protein ZP_00562452.1 from M. burtonii, and each contained the DUF1628 domain. The remaining 10 predicted secreted proteins from the Methanosarcinaceae could not be assigned a putative function and contained no recognizable domains, with the exception of protein YP_306618.1 from M. barkeri, which contained 4 copies of a HEAT repeat domain.

Table 2. Annotation of Predicted Secreted Proteins from Methanosarcinaceae*

Entries are listed as accession [query] (footnotes), where [query] is the number (1-7) of the M. burtonii protein in Table 1 that was used as the BLAST query and matched the listed protein.

S-layer protein (a): NP_615788.1 [1], NP_615835.1 [1], NP_615843.1 [1], NP_615910.1 [1] (d,e), NP_618438.1 [1], NP_618474.1 [1], NP_618514.1 [1], NP_632491.1 [1], NP_633388.1 [1] (d), NP_633840.1 [1], NP_634000.1 [1], YP_304582.1 [1], YP_305104.1 [1], YP_305280.1 [1], YP_305296.1 [1], YP_305336.1 [1], YP_305337.1 [1] (f), YP_305525.1 [1], YP_305530.1 [1] (g,h), YP_306611.1 [1] (d)

Conserved archaeal protein (b): NP_633535.1 [4], YP_304930.1 [4]

Hypothetical protein (c): NP_615283.1 [3], NP_615288.1 [3], NP_615528.1 [7], NP_617874.1 [6], NP_633599.1 [3], NP_633747.1 [3], YP_304843.1 [3], YP_305908.1 [3], YP_305990.1 [3], YP_306618.1 [3] (i)

* The IPR prefix refers to InterPro database accession number. (a) Contains S-layer-related duplication domain, DUF1608 (IPR006457). (b) Contains archaeal domain of unknown function DUF1628 (IPR012859). (c) Contains no domains that provide useful information. (d) Contains PKD domain (IPR000601). (e) Contains disaggregatase-related repeat (IPR010671). (f) Contains pectin lyase-like fold (IPR011050). (g) Contains carboxypeptidase regulatory region (IPR008969). (h) Contains cupredoxin fold (IPR008972). (i) Contains PBS lyase HEAT-like repeat (IPR004155).

Analysis of Predicted Signal Peptides in Proteins from Methanosarcinaceae. Signal peptides of the 39 predicted secreted proteins from M. burtonii and the other Methanosarcinaceae were partitioned manually into n-, h-, and c-regions and aligned by cleavage site (Figure 3). Several features of the signal peptides were apparent. There was considerable variation in the length of the n-region, from 3 to 12 amino acid residues. Lys was more common in the n-region than Arg (67 versus 24 occurrences), and these amino acids often occurred as doublets. The hydrophobic h-regions ranged in length from 15 to 25 amino acid residues. The dominant amino acid in the h-region was Leu (154 occurrences), followed by Ala (98), Ile (97), and Val (70). In the c-region (defined as positions -3 to +3 each side of the cleavage site), the -3 position was dominated by Ala (23/39) and Val (9/39). Ser was the most common residue at the -2 position (17/39), and the -1 position was exclusively Ala (34/39) or Gly (5/39). The two most common amino acids at the +1, +2, and +3 positions were Ala and Gln (14/39 and 7/39), Asp and Pro (10/39 and 6/39), and Asn and Ser (7/39 and 10/39), respectively.
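The position-by-position tallies around the cleavage site reported above can be reproduced with a simple counting routine. The following is a minimal sketch rather than the authors' method: the function name `position_counts` and the three toy peptides are hypothetical and are not taken from Figure 3. Position -1 is taken as the last residue of the signal peptide and +1 as the first residue of the mature protein.

```python
from collections import Counter

def position_counts(peptides, offsets=(-3, -2, -1, 1, 2, 3)):
    """Tally residues at positions relative to the cleavage site.

    Each peptide is a (signal_peptide, mature_start) pair of strings:
    position -1 is the last residue of the signal peptide and +1 is the
    first residue of the mature protein.
    """
    counts = {offset: Counter() for offset in offsets}
    for signal, mature in peptides:
        for offset in offsets:
            if offset < 0 and -offset <= len(signal):
                counts[offset][signal[offset]] += 1
            elif offset > 0 and offset <= len(mature):
                counts[offset][mature[offset - 1]] += 1
    return counts

# Hypothetical toy sequences for illustration only (not from Figure 3).
toy_peptides = [
    ("MKKLLLALLAVSA", "AEPT"),
    ("MKRFLLILLVASA", "QDNS"),
    ("MKKTILLLAVVSG", "ADSQ"),
]

counts = position_counts(toy_peptides)
for offset in (-3, -2, -1, 1, 2, 3):
    residue, n = counts[offset].most_common(1)[0]
    print(f"{offset:+d}: {residue} ({n}/{len(toy_peptides)})")
```

Applied to the 39 aligned signal peptides, the same tally would give fractions of the form quoted in the text, such as 34/39 for Ala at -1.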

Discussion

In this study, seven proteins that contained type 1 signal peptides and were secreted beyond the cytoplasmic membrane were identified. Isolation of bands from SDS-PAGE led to the identification of a single abundant protein, a putative protease, that was secreted during growth at both 4 and 23 °C. The signal sequence cleavage site for this protein was confirmed by N-terminal Edman sequencing. LC-MS/MS analysis of the supernatant fraction identified the same putative protease, and a further six proteins with predicted class 1 signal peptides. The functional annotation of the 7 secreted proteins was used to gain a better understanding of the possible biological roles of secreted proteins in M. burtonii and to predict related secreted proteins in closely related members of the Methanosarcinaceae.

Roles of Secreted Proteins in M. burtonii. Using a comprehensive set of annotation methods, we were able to assign putative functions to at least three secreted proteins from M. burtonii. Protein ZP_00563902.1 was identified as a putative protease. The protease domain was identified with high confidence, and significant activity was detected in the supernatant fraction using azocasein as substrate. Secreted proteases have been identified in other archaea,8,32,33 one of which was activated by autoproteolysis.32 The presence of three bands corresponding to the protease on SDS-PAGE (Figure 1) indicates that autoproteolytic cleavage also occurs for this enzyme from M. burtonii. The identification of a secreted protease in this cold-adapted archaeon highlights the biotechnological potential of psychrophilic archaea, an area of research that has received little attention.34,35 It will be valuable to study the structure/function/stability and substrate specificity of this enzyme with a view to determining its role in the cell, molecular characteristics of cold adaptation, and possible applied uses.
Protein ZP_00562202.1 was identified as a putative cell surface S-layer protein. Similar proteins have been identified previously as substrates for class 1 Sec secretion, and in common with these proteins, ZP_00562202.1 contains a predicted C-terminal transmembrane helix which probably functions as a membrane anchor.36 Protein ZP_00562202.1 also contained a trimeric LpxA-like domain. This domain is composed of tandem repeats of a hexapeptide arranged in a left-handed β-helix fold and is found in a wide variety of biosynthetic enzymes, precluding an exact functional prediction for ZP_00562202.1.

Protein ZP_00561707.1 contained signatures characteristic of two domains that mediate cell adhesion in bacterial proteins. The detection of the protein only at 4 °C suggests a requirement for cell adhesion or aggregation at low temperature, consistent with the clumping of M. burtonii cells that was previously observed12 and has recently been observed for cells growing at -2.5 °C.37 No transmembrane helix domains were detected in this protein, suggesting that association with the exterior of the cell is mediated by another means, such as protein-protein interaction.

No function could be assigned to the remaining four secreted proteins identified from M. burtonii. Proteins ZP_00562343.1 and ZP_00562351.1 contained no recognizable motifs or domains. Protein ZP_00562452.1 contained a conserved domain of unknown function, DUF1628, also identified in proteins NP_633535.1 from M. mazei and YP_304930.1 from M. barkeri. Protein ZP_00563397.1 contained a weak C-type lectin signature, identified using a regular expression scan of the Prosite database. If confirmed, this might also imply a cell surface carbohydrate binding function. Notably, protein ZP_00563397.1 was the only protein in our study for which the N-terminal residue of the mature protein is predicted to be Cys. The region around the cleavage site is somewhat similar to a lipo box motif, suggesting that the protein may be a candidate for a novel archaeal class 2-like secreted lipoprotein. In addition, the protein was only detected at 23 °C, which might suggest a role in membrane adaptation at higher growth temperature, consistent with changes in lipid composition that occur during growth at 4 versus 23 °C.17

Roles of Secreted Proteins in Other Methanosarcinaceae. Using our dataset of experimentally verified secreted proteins from M. burtonii, we were able to identify 32 proteins from other members of the Methanosarcinaceae which, by sequence similarity and SignalP analysis, can be confidently predicted as secreted. The majority of these proteins (20/32) were S-layer-related, containing one or more copies of the S-layer-related duplication/DUF1608 domain. Additional domains were detected in five of the proteins.
A common feature was the PKD domain, found in three of these five proteins. The structure of this domain has been solved for a related S-layer protein from M. mazei.38 It contains an Ig-like fold and is found in the extracellular region of both archaeal surface proteins and metazoan cell surface or extracellular matrix proteins, implying an ancestral domain that has evolved to function in cell-cell interaction in archaea and eucarya. In addition to PKD, protein NP_615910.1 from M. acetivorans contained a disaggregatase-related repeat domain. In M. mazei, disaggregatase is localized to the cell membrane or cell wall.39 It is involved in disaggregation of cells at certain phases of the life cycle40 and probably has the same role in other Methanosarcinaceae.

The remaining three proteins in which domains were identified were all from M. barkeri. Protein YP_305337.1 contained a pectin lyase-like fold. This alone was insufficient to assign a functional role, as the fold is common to several classes of protein including pectolytic enzymes, galacturonidases, chondroitinases, and adhesive virulence factors.41 The fold is suggestive of a carbohydrate-binding function. Protein YP_305530.1 contained two clearly defined domains in the C-terminal region: a carboxypeptidase D regulatory domain followed by a cupredoxin-like fold. While a eucaryal protein with these features has been identified as a membrane-bound carboxypeptidase,42 bacterial and archaeal sequences with the signature are annotated variously as cell surface proteins with possible roles as receptors or in nutrient binding. The presence of a cupredoxin domain next to the carboxypeptidase domain in the M. barkeri protein is unique among the archaea, and a definitive function cannot be assigned to this protein. Protein YP_306618.1 contained four copies of a feature named the PBS lyase-like HEAT repeat. This is a short bi-helical repeat found in a family of lyase enzymes that are involved with biosynthesis of phycobiliproteins, part of a light-harvesting complex in cyanobacteria and red algae.43 Its significance in Methanosarcinaceae is unknown given that they are nonphotosynthetic, but the domain is found in several proteins, including an expressed hypothetical M. burtonii protein.18 The remainder of the secreted proteins from the Methanosarcinaceae contained no recognizable domains and could therefore be classified only as hypothetical proteins of unknown function.

In summary, our analysis indicates that many cell surface proteins in the Methanosarcinaceae are substrates of the class 1 Sec pathway with a range of functions, particularly in cell-cell contact. The number of secreted proteins with no predicted function is an indication of how much there is yet to learn regarding substrates for secretion in archaea.

Features of the Signal Peptide in Proteins of Methanosarcinaceae. Edman sequencing of protein bands corresponding to protein ZP_00563902.1 revealed that the N-terminus of the mature protein corresponded with that predicted by all three models of SignalP, using both SignalP-NN and SignalP-HMM. This provided justification for assigning the cleavage site from SignalP prediction based on a consensus between the results from each model, an approach that has previously been used.11 In our analysis, we looked for agreement in the position of the predicted cleavage site between at least two of the predictive models, using either SignalP-NN or SignalP-HMM. For our dataset, complete agreement between three models in cleavage site position was observed for 27/39 proteins and agreement between two models for the remaining 12/39 proteins.
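The consensus rule described above (accept a cleavage site when at least two models agree on its position) can be expressed as a small voting function. This is a simplified sketch under stated assumptions: it treats each model as contributing a single predicted position and does not distinguish SignalP-NN from SignalP-HMM; the function name `consensus_cleavage_site`, the model labels, and the positions are all hypothetical.

```python
from collections import Counter

def consensus_cleavage_site(predictions, min_agreement=2):
    """Return (position, n_models) for the best-supported cleavage site.

    `predictions` maps a model name to its predicted cleavage position
    (or None when the model predicts no signal peptide). Returns None
    when fewer than `min_agreement` models agree on any one position.
    """
    votes = Counter(pos for pos in predictions.values() if pos is not None)
    if not votes:
        return None
    position, support = votes.most_common(1)[0]
    if support < min_agreement:
        return None
    return position, support

# Hypothetical predictions; labels and positions are illustrative only.
all_agree = {"eucaryal": 29, "gram_negative": 29, "gram_positive": 29}
two_agree = {"eucaryal": 25, "gram_negative": 25, "gram_positive": 31}
no_consensus = {"eucaryal": 22, "gram_negative": 27, "gram_positive": 31}

print(consensus_cleavage_site(all_agree))
print(consensus_cleavage_site(two_agree))
print(consensus_cleavage_site(no_consensus))
```

In this framing, the 27/39 proteins with complete agreement correspond to a support of three, and the remaining 12/39 to a support of two.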
Alignment of predicted signal peptides at their cleavage site revealed a number of features. The length of the positively charged n-region varied from 3 to 12 amino acid residues. An interesting observation was that the S-layer proteins seem to contain a subset of sequences that have a very short n-region, with the consensus sequence Met-Lys-(Gly/Lys/Arg/Thr), followed most commonly by Phe (5 cases), Tyr (1 case), or Ala (1 case). Conversely, long n-regions were found in several proteins, including the conserved archaeal DUF1628 family. However, no relationship between n-region length and protein function could be discerned. In most of the n-regions, Lys was the dominant positively charged residue, as was the case in a previous study using predicted signal peptides from M. jannaschii.11 However, in contrast to that study, Leu as opposed to Ile was the predominant hydrophobic amino acid in h-regions from our dataset, accounting for 21.9 mol % of all h-regions followed by Ala (13.9%), Ile (13.8%), and Val (9.9%). In h-regions of signal peptides from M. burtonii proteins, equal proportions of Leu and Ile were present (16.2%), followed by Ala (15.4%) and Val (14.5%).

Previous studies have addressed the question of whether the archaeal signal peptide cleavage site is more bacterial-like or eucaryal-like in nature. In our dataset, Ala was the dominant residue at the -3 and -1 positions, but the tolerance for other residues (particularly Val at -3 and Gly at -1) is a more eucaryal feature. The preference for Ala at +1, Pro or negative charge at +2, and Ser at +3 is in agreement with a previous comparative genomic analysis of predicted archaeal signal peptides.10 In contrast to the analysis of M. jannaschii,11 our dataset showed no evidence for an excess of Tyr at +1 or -2, but Ser, which also has a hydroxyl side chain, was dominant at -2.

Signal Peptides and Secretion Mechanisms in Archaea. Overall, the signal peptides in our dataset support the assertion that archaeal signal peptides contain bacterial, eucaryal, and archaeal features. The n- and h-regions resemble those of bacterial signal peptides, though the relative abundance of Lys in the n-region may be characteristic of Methanosarcinaceae and other archaea, while the cleavage site is eucaryal in nature. A consensus derived from the application of all current predictive models to archaeal sequences, coupled with more experimentally determined cleavage sites, therefore seems currently to be the most effective way of predicting archaeal signal peptides. Our dataset is necessarily limited, as it is derived from sequence similarity to a small number of proteins with type 1 signal peptides and has been confined to Methanosarcinaceae. We are confident of the accuracy of most of the cleavage site predictions. Using a consensus of the three SignalP models will result in selection for signal peptides that are a hybrid of eucaryal and bacterial features, as noted previously.11 Combining our data and data from other sources may allow the development of predictive models that are more specific for archaeal class 1 signal peptides. However, this may not be easy, as the similarities between archaeal, bacterial, and eucaryal signal peptides are significant, and it is unclear whether an archaeal model will be sensitive enough to discriminate archaeal signal peptides from the latter two. Superimposed on this overall level of structural similarity is the fact that archaea are represented by a diverse group of extremophiles, and genus- or even species-specific adaptations may arise in the amino acid composition of secreted proteins and in secretory pathways.
For example, in Haloarchaea, the alternative Tat secretion pathway,44 in which the signal peptide contains a twin-arginine motif, is used extensively as an adaptation to high salinity.45 In contrast, M. burtonii contains only a TatA homologue (ZP_00562400.1), and no M. burtonii protein sequences contain the twin-arginine motif. This raises the question of how (or whether) folded proteins are translocated across the cell membrane in M. burtonii. We also note that the h-region of signal peptides from M. burtonii contains relatively less Leu than the dataset as a whole, which may result from the decreased abundance of Leu in psychrophilic methanogens.19 There is clearly a great need to expand the database of experimentally verified secreted proteins in archaea in order to define conserved and adaptation-specific characteristics of secretion.
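Comparisons of h-region composition, such as the Leu content noted above, rest on mol % values, that is, residue frequencies expressed as a percentage of all residues counted. The Python sketch below shows one way such values could be computed. It assumes that residues are pooled across all h-regions before the percentage is taken (the text does not state whether pooling or per-peptide averaging was used), and the function name `mol_percent` and the example sequences are invented for illustration.

```python
from collections import Counter
from typing import Dict, Iterable


def mol_percent(regions: Iterable[str]) -> Dict[str, float]:
    """Pool residues from all regions and return each residue's mol %.

    Assumption: residues are pooled over every region before computing
    percentages, rather than averaging per-peptide percentages.
    """
    counts: Counter = Counter()
    for region in regions:
        counts.update(region.upper())  # one-letter amino acid codes
    total = sum(counts.values())
    if total == 0:
        return {}
    return {aa: 100.0 * n / total for aa, n in counts.items()}


# Invented h-region fragments, for illustration only (not from the paper).
h_regions = ["LLAIVLL", "AILVAAL", "LIVGLA"]
composition = mol_percent(h_regions)
for aa, pct in sorted(composition.items(), key=lambda item: -item[1]):
    print(f"{aa}: {pct:.1f} mol %")
```

Running the same function separately on the h-regions of one organism and on those of the whole dataset gives values that can be compared directly, as in the Leu comparison above.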

Acknowledgment. The research was supported by the Australian Research Council. Mass spectrometric analysis for this work was carried out at the Bioanalytical Mass Spectrometry Facility, UNSW, and was supported in part by grants from the Australian Government Systemic Infrastructure Initiative and Major National Research Facilities Program (UNSW node of the Australian Proteome Analysis Facility) and by the UNSW Capital Grants Scheme. Edman sequencing was facilitated by access to the Australian Proteome Analysis Facility, established under the Australian Government’s Major National Research Facilities program.

References

(1) Pohlschröder, M.; Hartmann, E.; Hand, N. J.; Dilks, K.; Haddad, A. Annu. Rev. Microbiol. 2005, 59, 91-111.

(2) Van den Berg, B.; Clemons, W. M., Jr.; Collinson, I.; Modis, Y.; Hartmann, E.; Harrison, S. C.; Rapoport, T. A. Nature 2004, 427, 36-44.
(3) Yen, M. R.; Harley, K. T.; Tseng, Y. H.; Saier, M. H., Jr. FEMS Microbiol. Lett. 2001, 204, 223-231.
(4) Hand, N. J.; Klein, R.; Laskewitz, A.; Pohlschröder, M. J. Bacteriol. 2006, 188, 1251-1259.
(5) Eichler, J.; Adams, M. W. Microbiol. Mol. Biol. Rev. 2005, 69, 393-425.
(6) Lichi, T.; Ring, G.; Eichler, J. Eur. J. Biochem. 2004, 271, 1382-1390.
(7) Bardy, S. L.; Jarrell, K. F. Mol. Microbiol. 2003, 50, 1339-1347.
(8) Ng, S. Y.; Jarrell, K. F. J. Bacteriol. 2003, 185, 5936-5942.
(9) Bendtsen, J. D.; Nielsen, H.; von Heijne, G.; Brunak, S. J. Mol. Biol. 2004, 340, 783-795.
(10) Bardy, S. L.; Eichler, J.; Jarrell, K. F. Protein Sci. 2003, 12, 1833-1843.
(11) Nielsen, H.; Brunak, S.; von Heijne, G. Protein Eng. 1999, 12, 3-9.
(12) Franzmann, P. D.; Springer, N.; Ludwig, W.; Conway de Macario, E.; Rohde, M. Syst. Appl. Microbiol. 1992, 15, 573-581.
(13) Cavicchioli, R. Nat. Rev. Microbiol. 2006, 4, 331-343.
(14) Goodchild, A.; Raftery, M.; Saunders, N. F. W.; Guilhaus, M.; Cavicchioli, R. J. Proteome Res. 2004, 3, 1164-1176.
(15) Goodchild, A.; Raftery, M.; Saunders, N. F. W.; Guilhaus, M.; Cavicchioli, R. J. Proteome Res. 2005, 4, 473-480.
(16) Goodchild, A.; Saunders, N. F. W.; Ertan, H.; Raftery, M.; Guilhaus, M.; Curmi, P. M. G.; Cavicchioli, R. Mol. Microbiol. 2004, 53, 309-321.
(17) Nichols, D. S.; Miller, M. R.; Davis, N. W.; Goodchild, A.; Raftery, M.; Cavicchioli, R. J. Bacteriol. 2004, 186, 8508-8515.
(18) Saunders, N. F. W.; Goodchild, A.; Raftery, M.; Guilhaus, M.; Curmi, P. M. G.; Cavicchioli, R. J. Proteome Res. 2005, 4, 464-472.
(19) Saunders, N.; Thomas, T.; Curmi, P. M. G.; Mattick, J. S.; Kuczek, E.; Slade, R.; Davis, J.; Franzmann, P. D.; Boone, D.; Rusterholtz, K.; Feldman, R.; Gates, C.; Bench, S.; Sowers, K.; Kadner, K.; Aerts, A.; Dehal, P.; Detter, C.; Glavina, T.; Lucas, S.; Richardson, P.; Larimer, F.; Hauser, L.; Land, M.; Cavicchioli, R. Genome Res. 2003, 13, 1580-1588.
(20) Ferry, J. G. In Methanogenesis: Ecology, Physiology, Biochemistry and Genetics; Ferry, J. G., Ed.; Chapman and Hall: New York, 1993; pp 304-334.
(21) Cavicchioli, R.; Goodchild, A.; Raftery, M. In Microbial Proteomics: Functional Biology of Whole Organisms; Humphery-Smith, I., Hecker, M., Eds.; Wiley: New York, 2006 (in press).
(22) Bradford, M. M. Anal. Biochem. 1976, 72, 248-254.
(23) Laemmli, U. K. Nature 1970, 227, 680-685.
(24) Diezel, W.; Kopperschlager, G.; Hofmann, E. Anal. Biochem. 1972, 48, 617-620.
(25) Blum, H. Electrophoresis 1987, 8, 93-99.
(26) Ertan, H. Arch. Microbiol. 1992, 158, 35-41.
(27) Lantz, M. S.; Ciborowski, P. Methods Enzymol. 1994, 235, 563-594.
(28) Sarath, G. In Proteolytic Enzymes: A Practical Approach; Beynon, R. J., Bond, J. S., Eds.; Oxford University Press: Oxford, 1989; pp 25-55.
(29) McGinnis, S.; Madden, T. L. Nucleic Acids Res. 2004, 32, W20-25.
(30) Quevillon, E.; Silventoinen, V.; Pillai, S.; Harte, N.; Mulder, N.; Apweiler, R.; Lopez, R. Nucleic Acids Res. 2005, 33, W116-120.
(31) Mulder, N. J.; Apweiler, R.; Attwood, T. K.; Bairoch, A.; Bateman, A.; Binns, D.; Bradley, P.; Bork, P.; Bucher, L.; Cerutti, R.; Copley, E.; Courcelle, U.; Das, R.; Durbin, W.; Fleischmann, P.; Gough, J.; Haft, D.; Harte, N.; Hulo, N.; Kahn, D.; Kanapin, A.; Krestyaninova, M.; Lonsdale, D.; Lopez, R.; Letunic, I.; Madera, M.; Maslen, J.; McDowall, J.; Mitchell, A.; Nikolskaya, A. N.; Orchard, S.; Pagni, M.; Ponting, C. P.; Quevillon, E.; Selengut, J.; Sigrist, C. J.; Silventoinen, V.; Studholme, D. J.; Vaughan, R.; Wu, C. H. Nucleic Acids Res. 2005, 33, D201-205.
(32) Elsztein, C.; Herrera Seitz, M. K.; Sánchez, J. J.; de Castro, R. E. J. Basic Microbiol. 2001, 41, 319-327.
(33) Gimenez, M. I.; Studdert, C. A.; Sanchez, J. J.; de Castro, R. E. Extremophiles 2000, 4, 181-188.
(34) Cavicchioli, R.; Siddiqui, K. S.; Andrews, D.; Sowers, K. R. Curr. Opin. Biotechnol. 2002, 13, 253-261.
(35) Siddiqui, K. S.; Cavicchioli, R. Annu. Rev. Biochem. 2006, 75, 403-433.
(36) Eichler, J. Eur. J. Biochem. 2001, 268, 4366-4373.

Journal of Proteome Research • Vol. 5, No. 9, 2006 2463

(37) Reid, I. N.; Sparks, W. B.; Lubow, S.; McGrath, M.; Livio, M.; Valenti, J.; Sowers, K. R.; Shukla, H. D.; MacAuley, S.; Miller, T.; Suvanasuthi, R.; Belas, R.; Colman, A.; Robb, F. T.; DasSarma, P.; Müller, J. A.; Coker, J. A.; Cavicchioli, R.; Chen, F.; DasSarma, S. Int. J. Astrobiol. 2006, in press.
(38) Jing, H.; Takagi, J.; Liu, J. H.; Lindgren, S.; Zhang, R. G.; Joachimiak, A.; Wang, J. H.; Springer, T. A. Structure 2002, 10, 1453-1464.
(39) De Macario, E. C.; Macario, A. J.; Mok, T.; Beveridge, T. J. J. Bacteriol. 1993, 175, 3115-3120.
(40) Xun, L. Y.; Mah, R. A.; Boone, D. R. Appl. Environ. Microbiol. 1990, 56, 3693-3698.


(41) Jenkins, J.; Mayans, O.; Pickersgill, R. J. Struct. Biol. 1998, 122, 236-246.
(42) Tan, F.; Rehli, M.; Krause, S. W.; Skidgel, R. A. Biochem. J. 1997, 327, 81-87.
(43) Fairchild, C. D.; Glazer, A. N. J. Biol. Chem. 1994, 269, 8686-8694.
(44) Yen, M. R.; Tseng, Y. H.; Nguyen, E. H.; Wu, L. F.; Saier, M. H., Jr. Arch. Microbiol. 2002, 177, 441-450.
(45) Rose, R. W.; Bruser, T.; Kissinger, J. C.; Pohlschroder, M. Mol. Microbiol. 2002, 45, 943-950.

PR060220X