Identification of a Novel Family of Snake Venom Proteins Veficolins from Cerberus rynchops Using a Venom Gland Transcriptomics and Proteomics Approach

G. OmPraba,†,‡ Alex Chapeaurouge,†,§ Robin Doley,†,# K. Rama Devi,⊥ P. Padmanaban,⊥ C. Venkatraman,⊥ D. Velmurugan,‡ Qingsong Lin,| and R. Manjunatha Kini*,†,∇

Protein Sciences Laboratory, Department of Biological Sciences, 14 Science Drive 4, National University of Singapore, Singapore 117543, Centre for Advanced Studies in Crystallography and Biophysics, University of Madras, Guindy Campus, Chennai 600 025, India, Laboratório de Toxinologia, Instituto Oswaldo Cruz, Fiocruz, Rio de Janeiro, RJ 21045-900, Brazil, Zoological Survey of India, Marine Biological Station, Chennai 600 028, India, Department of Biological Sciences, 14 Science Drive 4, National University of Singapore, Singapore 117543, and Department of Biochemistry and Molecular Biology, Medical College of Virginia, Virginia Commonwealth University, Richmond, Virginia 23298-0614

Received November 13, 2009

Cerberus rynchops (dog-faced water snake) belongs to Homalopsidae of Colubroidea (rear-fanged snakes). So far, venom compositions of snakes of the Homalopsidae family are not known. To determine the venom composition of C. rynchops, we have used both transcriptomics and proteomics approaches. The venom gland transcriptome revealed 104 ESTs and the presence of three known snake protein families, namely, metalloprotease, CRISP, and C-type lectin. In addition, we identified two proteins that showed sequence homology to ficolin, a mammalian protein with collagen-like and fibrinogen-like domains. We named them ryncolin 1 and ryncolin 2 (rynchops ficolin) and this new family of snake venom proteins veficolins (venom ficolins). On the basis of their structural similarity to ficolin, we speculate that ryncolins may induce platelet aggregation and/or initiate complement activation. To determine the proteome, the whole C. rynchops venom was trypsinized and fractionated by reverse phase HPLC followed by MALDI-MS/MS analysis of the tryptic peptides. Analysis of the tandem mass spectrometric data indicated the presence of all protein families found in the translated cDNA library. Overall, our combined approach of transcriptomics and proteomics revealed that C. rynchops venom is among the least complex snake venoms characterized to date despite the presence of a new family of snake venom proteins.

Keywords: Venom transcriptome • venomics • Venom proteome • snake venom toxicity • innate immunity

* To whom correspondence should be addressed. R. Manjunatha Kini, Protein Sciences Laboratory, Department of Biological Sciences, 14 Science Drive 4, National University of Singapore, Singapore 117543. E-mail: [email protected].
† Protein Sciences Laboratory, National University of Singapore.
‡ University of Madras.
§ Laboratório de Toxinologia, Instituto Oswaldo Cruz.
# Current address: Department of Molecular Biology and Biotechnology, Tezpur University, Assam, India.
⊥ Zoological Survey of India, Marine Biological Station.
| Department of Biological Sciences, National University of Singapore.
∇ Virginia Commonwealth University.

Journal of Proteome Research 2010, 9, 1882–1893. Published on Web 02/16/2010. DOI: 10.1021/pr901044x. © 2010 American Chemical Society

Introduction

Snake venoms are complex mixtures of biologically active peptides and proteins which bind specifically to exogenous targets, such as receptors and ion channels. Thus, snake venom proteins have not only significantly contributed to the understanding of basic physiological systems, like the characterization of acetylcholine receptors by α-bungarotoxin1,2 and the complement system by cobra venom factor (CVF),3 but also led to the development of original diagnostic agents4 as well as therapeutic agents (for example, the antihypertensive drug captopril).5,6 With the advent of new technologies, we are now able to identify new toxins from smaller amounts of venoms or venom gland tissues. Such studies will foster our understanding of the function and evolution of venomous systems and accelerate developments in drug discovery.

Cerberus rynchops is a nocturnal colubrid snake encountered in mangroves and brackish rivers in Asia and Australia.7 It belongs to the Homalopsidae family, and to date, venom compositions of snakes of the Homalopsidae family are not well characterized. Homalopsids feed on frogs, fish, and tadpoles and are phylogenetically of particular interest since recent studies indicate that they represent a basal colubrid family.8,9 To identify toxin components and understand the possible pharmacological effects, we have investigated the venom composition of C. rynchops using a combined transcriptomic and proteomic approach. Our results indicate that the venom of C. rynchops contains only four different protein families, which can be classified as metalloprotease, C-type lectin, CRISP, and a novel snake venom family referred to as veficolins (venom ficolins). On the basis of their sequence homology to ficolin, we predict that they initiate platelet aggregation and/or complement activation. Interestingly, this is one of the least complex snake venoms reported so far10 and might shed light on the initial stages of snake venom evolution.

Materials and Methods

Venom Extraction. C. rynchops snakes were collected from the backwaters of the Madras coastal area, Chennai, India. The species was identified with the help of the Zoological Survey of India, Marine Biological Station, Chennai, India. Venom was collected from the rear fangs by allowing the specimen to bite down on a sterilized watch glass. Before venom extraction, pilocarpine (2%; 3-5 drops) was applied in the mouth of the snake to induce venom gland secretion. The venom was collected twice, with a 2-week interval between milkings. The venom was dissolved in HPLC grade H2O for sample transfer, frozen, lyophilized, and stored at -80 °C prior to use. Polyethylene materials (pipet tips, Eppendorf tubes) were used exclusively to handle and store the venom because of the strong affinity of some peptides for glass and polystyrene. On the third day after the second venom collection, the snake was anesthetized with ketamine hydrochloride (10 mg/kg). Subsequently, the snake was decapitated and the venom gland was carefully dissected from the surrounding tissues.

Construction of cDNA Library from C. rynchops Venom Gland Tissue. The venom glands were placed in RNAlater solution immediately after dissection and stored at -80 °C until further use. Total RNA was extracted from 30 mg of venom gland tissue using the RNeasy Mini kit (Qiagen, Hilden, Germany). The integrity of the total RNA was examined by 1% agarose gel electrophoresis, while purity and sample concentration were determined spectrophotometrically. Enriched full-length, double-stranded (ds) cDNA was prepared from 150 ng of mRNA using the SMART cDNA library construction kit (Clontech, Palo Alto, CA). The double-stranded cDNA was ligated into the pGEM-T Easy vector using the TA cloning approach. The ligation products were transformed into Escherichia coli DH5α competent cells and plated on LB/Amp/IPTG/X-gal plates for blue/white screening. Positive colonies were selected and plasmids were isolated using the QIAprep spin Miniprep kit (Qiagen, Hilden, Germany). Purified plasmids were subjected to restriction enzyme analysis using EcoRI. Clones containing cDNA inserts of more than 200 bp were selected for sequencing.

DNA Sequencing and Analysis. DNA sequencing was performed using T7 and SP6 sequencing primers. The sequencing reaction was carried out using the BigDye terminator v3.1 cycle sequencing kit (Applied Biosystems, Foster City, CA), with an automated DNA sequencer (Model 3100A; Applied Biosystems). The adaptor sequences were trimmed and the open reading frame was identified using GENE RUNNER software. Signal peptides were identified using the online SignalP 3.0 server. The sequences were compared to those in the GenBank and SwissProt databases using the online BLASTp and BLASTn programs. Multiple sequence alignments were carried out using ClustalW and DNAMAN and, if necessary, manually edited. Sequences were clustered according to their putative functions as observed in the BLAST results. Sequences that did not match any known toxins were annotated as unknown sequences.

Amplification of Metalloproteinase. On the basis of the 5′ and 3′ end sequences of the metalloproteinase (obtained during sequencing of cDNA clones), the upstream primer with start codon (M.P1-F, 5′-GGTAACTATATGCTTAGTGGGTTTCCC-3′) and the downstream primer with stop codon (M.P1-R, 5′-GGAAGCGTTGCCTTTGAACACACCAGCTC-3′) were designed. The gene-specific cDNA of the metalloproteinase was amplified from the pool of venom gland cDNAs by PCR using Taq DNA Polymerase and these primers. The thermal cycling parameters were as follows: 35 cycles of denaturation at 96 °C for 30 s, annealing at 65 °C for 30 s, and extension at 68 °C for 3 min. The amplified metalloproteinase cDNA was separated by electrophoresis on a 1% agarose gel. The metalloproteinase cDNA (2.4 kb) was extracted from the gel using the QIAquick gel extraction kit. T-A cloning, heat-shock transformation, plasmid isolation, and DNA sequencing were done as mentioned above. Ten clones were picked randomly, checked for insertion using EcoRI digestion, and subjected to DNA sequencing from both ends. To complete the sequence, another pair of primers (M.P2-F, 5′-GGTTGTGGACAACAGTATG-3′ and M.P2-R, 5′-CATCTTGAGATACTGTTGC-3′), complementary to the middle region, was designed. The full-length metalloproteinase gene was compiled by overlapping nucleotide sequences using the program DNAMAN.

Amplification of Ryncolins. To determine the various isoforms of ryncolins, forward and reverse primers (R.P1-F, 5′-CCTGATTTTCCTGGTTGCTAGTTC-3′ and R.P1-R, 5′-CTGGTCATGTTGACGAATCTATG-3′) were designed based on the 5′ and 3′ ends of the ORF of ryncolins. PCR amplification was performed using Taq DNA Polymerase to amplify the gene-specific cDNA of ryncolin using the venom gland cDNA as a template. The thermal cycling parameters were as follows: 35 cycles of denaturation at 96 °C for 30 s, annealing at 56 °C for 30 s, and extension at 68 °C for 2 min. The amplified ryncolin cDNA was separated by electrophoresis on a 1% agarose gel. The ryncolin cDNA (1.2 kb) was extracted from the gel using the QIAquick gel extraction kit.
T-A cloning, heat-shock transformation, and finally plasmid isolation were carried out. Nearly 400 clones were randomly picked and 60 clones showing ryncolin gene inserts were subjected to DNA sequencing using SP6 and T7 primers.

In-Solution Tryptic Digestion. Lyophilized crude venom (500 µg) was initially dissolved in 500 µL of ammonium bicarbonate and precipitated with 4 volumes of ice-cold acetone for 3 h at -15 °C. Afterward, the pellet was resuspended in 1 mL of 50 mM ammonium bicarbonate containing 20 µL of 1% ProteasMAX surfactant, followed by reduction with 10 µL of 0.5 M DTT (at 56 °C for 20 min) and alkylation with 27 µL of 0.55 M iodoacetamide at room temperature for 15 min in the dark. Finally, the venom was digested (at 37 °C for 3 h) by adding 18 µL of trypsin (1 µg/µL) and 10 µL of 1% ProteasMAX surfactant to improve tryptic cleavage activity.

HPLC Separation and Mass Spectrometric Analysis. After reducing the volume of the tryptic digest to 200 µL using a Speedvac (Savant SC 110A), the solution was directly applied onto a C-18 reversed phase column (Agilent Zorbax 300SB-C18 1.0 × 150 mm × 3.5 µm). Elution of the tryptic peptides was performed by running a linear gradient from 0% acetonitrile to 72% acetonitrile in 120 min at a flow rate of 40 µL/min (solvent A contains 0.1% TFA, while solvent B contains 80% acetonitrile in 0.1% TFA). During the chromatographic run, 120 samples were manually collected and spotted onto the MALDI sample plate. Approximately 0.3 µL of the sample solution was mixed with the same volume of a saturated matrix solution (α-cyano-4-hydroxycinnamic acid (Aldrich, Milwaukee, WI), 10 mg/mL in 50% acetonitrile/0.1% trifluoroacetic acid) on the target plate and allowed to dry at room temperature. Raw data for protein identification were obtained on the 4700 Proteomics Analyzer (Applied Biosystems, Foster City, CA). Both MS and MS/MS data were acquired with a neodymium-doped yttrium aluminum garnet (Nd:YAG) laser with a 200-Hz repetition rate. Typically, 1600 shots were accumulated for spectra in MS mode, while 3500 shots were accumulated for spectra in MS/MS mode. Up to 20 of the most intense ion signals with a signal-to-noise ratio above 30 were selected as precursors for MS/MS acquisition, excluding common trypsin autolysis peaks and matrix ion signals. External calibration in MS mode was performed using a mixture of four peptides: des-Arg1-bradykinin (m/z = 904.468), angiotensin I (m/z = 1296.685), Glu1-fibrinopeptide B (m/z = 1570.677), and ACTH (18-39) (m/z = 2465.199). MS/MS spectra were externally calibrated using known fragment ion masses observed in the MS/MS spectrum of angiotensin I. Tandem mass spectra were searched using the MASCOT software against an in-house database containing the IPI human database with all deduced protein sequences obtained from the cDNA library of the C. rynchops venom gland appended. The search parameters were as follows: no restrictions on species of origin or protein molecular weight, two tryptic missed cleavages allowed, and nonfixed modifications of methionine (oxidation), cysteine (carbamidomethylation), proline (hydroxy), and pyro-formation at N-terminal glutamine and glutamic acid of peptides.

Figure 1. Experimental approaches for the characterization of the venom of C. rynchops.
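The "two tryptic missed cleavages allowed" search parameter can be illustrated with a minimal in silico digest. This is only a sketch under stated assumptions: the cleavage rule used here (cut C-terminal to K or R unless the next residue is P) is the commonly used trypsin rule and is not spelled out in the text, the function names `tryptic_digest` and `within_length` are hypothetical, and the toy sequence is invented for illustration rather than taken from the C. rynchops library. It requires Python 3.7 or later, where `re.split` accepts zero-width matches.

```python
import re


def tryptic_digest(sequence, missed_cleavages=2):
    """In silico tryptic digest (illustrative sketch).

    Assumed rule, not stated in the text: cleave C-terminal to K or R,
    except when the next residue is P.
    """
    # Zero-width split points after K/R that are not followed by P.
    fragments = [f for f in re.split(r"(?<=[KR])(?!P)", sequence) if f]
    peptides = []
    for start in range(len(fragments)):
        # Join up to `missed_cleavages` additional neighbouring fragments.
        last = min(start + missed_cleavages, len(fragments) - 1)
        for end in range(start, last + 1):
            peptides.append("".join(fragments[start:end + 1]))
    return peptides


def within_length(peptides, min_len, max_len):
    """Keep peptides whose length falls inside a chosen window."""
    return [p for p in peptides if min_len <= len(p) <= max_len]


# Hypothetical toy sequence, not from the C. rynchops cDNA library.
toy = "AAKGGRPLKCC"
print(tryptic_digest(toy, missed_cleavages=0))
print(tryptic_digest(toy, missed_cleavages=2))
```

Allowing missed cleavages enlarges the set of candidate peptides that a search engine has to consider, and a length window such as the one in `within_length` gives a rough way to see which predicted peptides fall outside a practically observable size range.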

Results and Discussion

Figure 2. Distribution of gene transcripts identified in the venom gland of C. rynchops. (A) Distribution of cDNA clones in the library. (B) Distribution of genes coding for toxins.

Composition of the cDNA Library. A partial cDNA library was constructed from the venom gland tissue of C. rynchops (Figure 1). The amplified double-stranded cDNA showed prominent bands around 1 kb and above. The double-stranded cDNAs were ligated and cloned. A total of 808 clones were picked randomly, and 345 clones containing inserts of more than 200 bp were sequenced to create the cDNA library.

Of these, 41% of clones showed similarity to toxin genes (including the new family, see below), 38% were housekeeping genes, and 21% did not show matches to any known sequences (Figure 2A). Sequences belonging to putative toxins were completed and submitted to the NCBI database (accession numbers: GU065316, GU065317, GU065318, GU065319, GU065320, GU065321, GU065322, GU065323, GU065324, GU065325, and GU065326). These toxin ESTs were grouped into different families on the basis of their sequence similarities. Overall, only three families of known toxins were identified in this library: metalloproteinases (30%), CRISPs (22%), and lectins (22%) (Figure 2B). In addition, a new family of proteins with a signal peptide and eight cysteine residues was encoded by 26% of the toxin clones. As described below, this group of proteins was also found in the venom proteome. Thus, C. rynchops venom appears to be one of the least complex snake venoms characterized to date, with only four protein families and twelve protein isoforms (see below).

Metalloproteinase. Snake venom metalloproteinases (SVMPs) belong to a protein family of strikingly different molecular weights and functions.11 They are evolutionarily closely related to mammalian matrix metalloproteinases and proteins of the ADAM (a disintegrin and metalloproteinase)/reprolysin subfamily. They are classified into three major groups, PI, PII, and PIII, depending on the domains included after post-translational modifications.12 The multidomain precursors undergo proteolytic cleavages leading to diverse products (for details, see refs 12 and 13). Functionally, full-length SVMPs as well as processed ‘mature’ domains cover a broad spectrum of biological activities. In the C. rynchops cDNA library, metalloproteinase clones represent the largest group of toxin clones. Only a partial sequence of the metalloproteinase was obtained during the initial stage of this transcriptome study because of its large size (2.4 kb). Hence, gene-specific primers were used to obtain the complete sequence of the metalloproteinase gene (see Materials and Methods). Ten randomly picked clones were sequenced completely.
All clones (from the cDNA library as well as clones obtained by gene-specific PCR) encoded a single isoform of metalloproteinase. The ORF of the metalloproteinase isolated from C. rynchops is 1848 nucleotides in length (GU065316). The deduced amino acid sequence of the C. rynchops metalloproteinase showed 80% sequence similarity with asrin, an SVMP from Austrelaps superbus (Figure 3B). It belongs to the PIII class and has 615 amino acid residues. It has a pro-domain composed of 200 amino acid residues, which modulates the enzymatic activity through interactions with the catalytic domain.12 The metalloproteinase domain, or catalytic domain, consists of nearly 200 amino acid residues (201-398). It has a Zn2+ binding motif, HELGHNLGMNHD, similar to the conserved HEBxHxBGBxHD motif (where B represents bulky hydrophobic residues and x any residue) commonly found in the ADAM (a disintegrin and metalloproteinase) protein family. The disintegrin-like domain is located between residues 406 and 491. It carries a DCD motif instead of the typical RGD motif found in disintegrins. The C-terminal region (residues 492-615) shows a rather high density of conserved cysteine residues (Figure 3A), consistent with the cysteine-rich domain of SVMPs. Kamiguti and co-workers have noted that four synthetic peptides based on the cysteine-rich domains of jararhagin and atrolysin A specifically inhibited platelet aggregation.14 PIII SVMPs in particular exert effects on endothelial cells that line the interior of blood vessels and inhibit collagen-dependent platelet aggregation.11,15 In addition, PIII SVMPs induce pro-inflammatory16 and hemorrhagic effects.17 The presence of the KCGRLFCVQSTSTV sequence in the C-terminal region of the cysteine-rich domain of the C. rynchops metalloproteinase might indicate that the protein interferes with platelet aggregation (Figure 3A).
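The correspondence between the observed Zn2+ binding sequence HELGHNLGMNHD and the consensus HEBxHxBGBxHD can be checked with a short pattern match. The sketch below is illustrative only: the text defines B as "bulky hydrophobic residues" without listing them, so the residue set `LIVMFYW` is an assumption, and the function name `find_zinc_motif` and the flanking residues in the usage example are hypothetical.

```python
import re

# Assumption: the text does not enumerate the "bulky hydrophobic" residues (B);
# this set is one plausible choice made for illustration.
BULKY_HYDROPHOBIC = "LIVMFYW"

# Consensus HEBxHxBGBxHD, with B -> bulky hydrophobic, x -> any residue.
ZINC_MOTIF = re.compile(
    rf"HE[{BULKY_HYDROPHOBIC}].H.[{BULKY_HYDROPHOBIC}]G[{BULKY_HYDROPHOBIC}].HD"
)


def find_zinc_motif(sequence):
    """Return (0-based start, matched motif) for the first match, or None."""
    match = ZINC_MOTIF.search(sequence)
    return (match.start(), match.group()) if match else None


# The motif reported in the text, with hypothetical flanking residues added.
print(find_zinc_motif("AAHELGHNLGMNHDKK"))
```

With this residue set, the reported motif satisfies every position of the consensus: L, L, and M occupy the three B positions, and G, N, and N occupy the three x positions.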

Cysteine-Rich Secretory Protein. The cysteine-rich secretory proteins (CRISPs) are widely distributed in mammals, reptiles, amphibians, and secernenteans and are involved in a variety of biological functions. Acidic epididymis glycoprotein (AEG, also known as CRISP-1), the first discovered CRISP, was isolated from mammalian epididymis and granules.18 CRISPs appear to be important for the immune system and in sperm maturation, although the actual functions of these proteins are virtually unknown.19 A variety of CRISP family proteins have been found in snake venoms and their molecular functions have been characterized: they inhibit cyclic nucleotide-gated ion channels as well as L-type Ca2+ and BKCa K+ channels.20-22 Snake venom CRISP family proteins are widely distributed in Viperidae and Elapidae. The ORF encoding CRISP from C. rynchops is 720 nucleotides in length. Overall, five isoforms were found in the cDNA library (Figure 4B). The main cluster had 18 clones, while four isoforms were found as singletons with single nucleotide substitutions. These substitutions lead to one-residue replacements in clones 160, 337, and 349, and to premature termination in clone 197. It is not clear whether these substitutions are real or experimental artifacts of PCR.23,24 C. rynchops CRISP has 238 amino acid residues. Similar to other CRISPs, it shows two principal domains, a plant pathogenesis related domain and a cysteine-rich domain (Figure 4A). C. rynchops CRISP shows a high sequence identity (92%) to CRISPs from Macleay’s water snake (Enhydris polylepis),25 indicating a closer phylogenetic proximity of the two water snakes. Generally, CRISPs are thought to modulate the activity of various ion channels. Some of these proteins, like ablomin and triflin, block L-type Ca2+ ion channels, while others, like tigrin, are devoid of this biological activity.20 Comparative sequence analysis revealed that CRISP from C. rynchops and triflin share highly homologous amino acid sequences, particularly in the active site region. The functional residues 186ENVF189 in triflin are replaced by the NNVF sequence in C. rynchops. In addition, other hydrophobic residues like Leu195, Tyr205, and Phe215 are also well conserved among Ca2+ channel binding CRISPs (Figure 4C). These residues lie on the concave surface, which is mainly involved in the interaction with the receptor site of the ion channels.26 The three-dimensional structure of C. rynchops CRISP was modeled using triflin as a template (Figure 4E). The structural similarity between these two proteins suggests that C. rynchops CRISP may also inhibit the L-type Ca2+ channel in the same manner as triflin.

C-Type Lectin. A number of C-type lectins and related proteins (together grouped as snaclecs27) with various biological activities have been purified and characterized from Viperidae and Elapidae snake venoms.28 Some of them exhibit carbohydrate-binding activity (C-type lectins) and induce agglutination of erythrocytes,29 while others affect vertebrate blood coagulation and the platelet aggregation system.30 Unlike animal membrane lectins, which contain additional domains, the venom lectins/related proteins consist of a single C-terminal carbohydrate-recognition domain (CRD) of animal membrane lectins.31 There were three clusters of C-type lectins in the C. rynchops cDNA library with a 477 bp nucleotide sequence. The main cluster had 18 clones (represented by clone 5), whereas the other two clusters had two clones each, differing by only one amino acid residue each from the major isoform. The protein has 158 amino acid residues including a 23 residue signal peptide (Figure 5A). It also has the canonical CRD. Phylogenetic analysis based on the neighbor joining method revealed that the C-type


Figure 3. The metalloproteinase from C. rynchops. (A) Deduced amino acid sequence. The signal peptide is in bold type and the interdomain segments are underlined. Various domains of the precursor protein are identified as follows: pro-domain, light gray; metalloprotease domain, italics; disintegrin domain, dark gray; and cysteine-rich domain, boxed. The Zn2+ binding region in the metalloprotease domain is shown in gray background. The conserved loop (shown in black letters) is also found in ADAMs. (B) Multiple alignment of the metalloproteinase from C. rynchops with homologous metalloproteinases.

lectin from C. rynchops is closer to various isoforms isolated from the water snake E. polylepis. It shows 65-95% sequence identity with various isoforms (Enh1-Enh7) of C-type lectin from E. polylepis and 79% sequence identity with the C-type lectin from the elapid snake Bungarus multicinctus. It has a conserved calcium-binding region and the triplet sequence “EPN”, indicating mannose specificity.32 All snaclecs form oligomeric quaternary structures. In general, they are disulfide-linked homo- or heterodimers.33 In C-type lectin-related proteins, the heterodimers are formed by domain swapping,34 whereas in C-type lectins, such domain


swaps do not seem to occur. Further, the subunits are held together by a single interchain disulfide bond. To date, only rhodocetin from Protobothrops (formerly, Trimeresurus) purpureomaculatus venom is a heterodimeric C-type lectin-related protein in which the two subunits are held together by noncovalent interactions.35,36 By analyses of sequences, we proposed the possible reasons for domain swapping in C-type lectin-related proteins.28 In these proteins, the interaction region is linked to the core of the proteins through a shorter ‘arm’, whereas in C-type lectins, the interaction region is linked through a longer arm to enable it to fold back (for details, see


Figure 4. Cysteine-rich secretory proteins from C. rynchops. (A) Deduced amino acid sequence of the major isoform (clone 49). The signal peptide is underlined, while the plant pathogenesis related domain is boxed and the cysteine-rich domain is shaded in black. (B) Different isoforms of the CRISP. (C) Multiple alignment of C. rynchops CRISP with homologous proteins. The region marked in gray background represents the functional residues and the residues in black shade refer to the binding regions. Residues in gray background and italics indicate the interdomain regions. (D) Molecular model of C. rynchops CRISP. (E) Crystal structure of triflin, a CRISP from Trimeresurus flavoviridis (PDB ID: 1WVR) that blocks the L-type calcium channel. The functional residues present in the concave region of the proteins, which are important for interactions with target molecules, are shown as a stick model. See text for details.


Figure 5. C-type lectin from C. rynchops. (A) Deduced amino acid sequence of the major isoform (clone 5). The signal peptide is underlined and the putative mannose-binding region is shaded in black. (B) Different isoforms of the C-type lectin from C. rynchops. (C) Multiple alignment of the C-type lectin from C. rynchops with homologous C-type lectin proteins. The signal sequence is underlined and the mannose-specific “EPN” as well as the galactose-binding “QPD” regions are marked in dark background. The residues “E” and “ND” labeled in gray shade indicate the Ca2+ binding region. (D) Molecular model of C. rynchops C-type lectin. (E) Superimposition of structures of C-type lectins from rattlesnake (Crotalus atrox) (blue) and C. rynchops (red). The binding sites for galactose and Ca2+ are shown.

ref 28). So far, all C-type lectins described in the literature are homodimers. Since we found only one major isoform and the other isoforms differ by only one amino acid residue, we speculate that the C. rynchops lectin is most likely a C-type lectin. The three-dimensional structure of the C. rynchops lectin was predicted based on the crystal structure of rattlesnake C-type lectin (RSL, C. atrox, PDB id: 1JZNa) (Figure 5D). The rmsd between these two structures is 0.322 Å based on their backbones (Figure 5E). The model reveals three intrachain disulfide bonds between C3-C14, C31-C131 and C106-C123,


respectively. When compared to RSL, it lacks one disulfide bond (C38-C133), as these cysteine residues are substituted (C38G and C133Y, respectively). In addition, the most critical residue, C86, which is involved in the interchain disulfide bond, is replaced by serine. Further, the C. rynchops protein has a longer arm, and hence, domain swapping also may not occur. We therefore speculate that the C. rynchops lectin remains a monomer, similar to maltose-binding protein. CRDs in C-type lectins have a rather weak affinity (in the low millimolar range) for small carbohydrates. Oligomerization appears to be an evolutionary strategy to overcome this shortcoming as well as to provide multivalency. At least bivalency is required for the lectin function. Therefore, despite the high structural similarity with RSL, the C. rynchops protein may not be able to cross-link erythrocytes and platelets and hence may not induce either erythrocyte agglutination or platelet aggregation.

New Family of Toxins: Veficolins. During the analyses of cDNA sequences, we found that 26% of clones could encode proteins with a signal peptide and eight cysteine residues. These proteins showed significant similarity to ficolins (50%) and were identified for the first time in a snake venom transcriptome. Therefore, we named this family veficolins (venom ficolins) and the proteins isolated from C. rynchops venom ryncolins (rynchops ficolin). Two major isoforms of ryncolins, namely, ryncolin 1 (represented by clone 222) and ryncolin 2 (represented by clone 289), were encountered 14 and 11 times, respectively (Figure 6B). To detect and identify possible isoforms of ryncolin, we used gene-specific primers to amplify all related genes. Out of 60 random clones, we obtained the complete sequence with significant overlap for 45 clones using T7 and SP6 primers. These clones coded for the same two major isoforms (23 clones similar to clone 222 and 20 clones similar to clone 289) and two singletons. The sequences from the other 15 clones did not have sufficient overlap to complete the sequences. The complete coding region of ryncolin encodes 347 (ryncolin 1) or 345 (ryncolin 2) amino acid residues. Ryncolin 1 and 2 share 79% identity in the mature protein region (Figure 6B). Both have all eight conserved cysteine residues. Mature ryncolin 1 has an additional cysteine at position 37. They show 55% and 53% sequence identity with Monodelphis domestica (gray short-tailed opossum) and human M-ficolin, respectively.
Ficolins are a group of oligomeric proteins found in different types of tissues like lung, liver, heart, spleen, and peripheral blood leukocytes.37 They have a collagen-like domain in the N-terminal region and a C-terminal fibrinogen-like domain, hence their name.38 Structurally, ficolins are believed to form dodecamers in which the collagen-like domains form trimers and the short N-terminal regions assemble four of these trimers into the final complex by interchain disulfide formation. Ryncolins have a stretch of 63 amino acid residues (G-X-X repeats similar to collagen) (Figure 6A) close to the N-terminus. Their C-terminal segment (residues 117-345) is similar to the C-terminal globular domain of fibrinogen, as evidenced by significant homology (ryncolin 1, E-value 1e-75; ryncolin 2, E-value 3e-76) to the fibrinogen-related domains (FreDs), including the calcium binding site, the polymerization pocket, and the γ-γ dimer interface. Functionally, ficolins are related to the innate defense system, where they initiate the lectin pathway of the complement system.39 Ryncolins may also be involved in complement activation. Alternatively, they may interfere with platelet aggregation and/or blood coagulation. Collagen is a potent inducer of platelet aggregation. Ryncolins may induce platelet aggregation in the prey due to the presence of the N-terminal collagen-like domain. Interestingly, a potent platelet aggregation inducer called trimucytin from the Taiwan habu snake (Trimeresurus mucrosquamatus) also has collagen-like GPX-repeats at the N-terminus.40 Unfortunately, the detailed sequence of trimucytin has not been published. Fibrinogen is involved in blood clotting and is activated by thrombin to assemble into fibrin clots. The C-terminal globular domains of the γ chains41 (C-γ) dimerize and bind to the GPR motif of the N-terminal domain of the α chain, while the GHR motif of the N-terminal domain of the β chain binds to the C-terminal globular domains of another β chain (C-β),42 which finally leads to lattice formation. Thus, dimerization of the globular domains of two γ subunits is important in fibrin fiber formation. The presence of the γ-γ interface site, the polymerization pocket, and the calcium binding domain (Figure 6A) indicates that ryncolins could mimic the C-terminal globular domain of fibrinogen and might interfere with fibrin formation.

Proteome of the Venom Gland of C. rynchops. To confirm the presence of ryncolins and to search for other toxins, we used a proteomic approach on the C. rynchops venom. Tandem mass spectral data of tryptic peptide fractions were searched against an in-house database containing the IPI human database and all deduced protein sequences obtained from the cDNA library of the C. rynchops venom gland. We were able to identify proteins from all four families described above. However, no toxins from other families were identified. Sequence coverage using the proteomics approach for the various families of toxins ranged from 39% (metalloproteinase) to 61% (CRISP). The lower sequence coverage of the metalloproteinase is probably due to the inclusion of the prodomain (residues 1-192, Figure 3A) in the calculation of the coverage. In the proteome, however, we were not able to detect any peptide sequence covering this specific domain (Figure 7). Either the peptides related to this domain are absent in the venom or their ionization is suppressed in MALDI. It is, however, important to note that, despite a fairly significant number of efforts, there are only a few reports43,44 in which sequences related to the prodomain of SVMPs were identified in the venom. It is possible that this prodomain is proteolytically cleaved before its secretion.
In such a situation, the sequence coverage of the mature metalloproteinase would increase to 52%. In the case of CRISP and the C-type lectin, the sequence coverage of their respective N-terminal regions is excellent, but the C-terminal regions have rather little coverage (Figure 7). Closer inspection of the amino acid sequences of these proteins (Figures 4A and 5A) indicates that the tryptic peptides in these regions are either too small (<10 residues) or too large (95- and 33-residue fragments in CRISP and C-type lectin, respectively). This lack of optimally sized tryptic fragments explains the poorer coverage of their C-terminal regions. The translated cDNA sequence of ryncolins (Figure 6A) points to an N-terminal collagen-like domain of the protein. Human collagen, for example, is characterized by extensive hydroxyproline content as a post-translational modification.45 Therefore, we included hydroxylation of proline residues as one of the post-translational modifications during the database search. We found hydroxyproline (GDPGPQGLPGETGFDGIPGVAGPK, ryncolin 1; and GDPGPQGPPGIR, ryncolin 2, Supplementary Table 1) in the veficolin sequence, consistent with the notion of a collagen-like domain in this protein. As expected from the results of the transcriptomic studies, we identified one major isoform each of the metalloproteinase, CRISP, and C-type lectin and two major isoforms of ryncolins. We were not able to identify proteins coded by singletons of CRISP or ryncolins, or two isoforms of C-type lectin encoded by two clones each. These are either artifacts of PCR or sequencing reactions, or these proteins are present in very low quantities in the C. rynchops venom. We could not identify any proteins of other known toxin families or other nontoxic proteins from the tandem mass analyses of tryptic peptides. Thus, C. rynchops venom appears to contain a total of five major proteins: one isoform each of metalloproteinase, CRISP, and C-type lectin and two major isoforms of ryncolins. Hence, it represents a very low complexity snake venom.

Figure 6. Veficolins from C. rynchops. (A) Deduced amino acid sequence of ryncolin 1. The signal peptide is underlined and the collagen-like domain is shaded in black. The sequence in gray background refers to the γ-γ dimer interface region. The calcium binding region is boxed, while the polymerization pocket is shown in bold. (B) Different isoforms of ryncolins from C. rynchops. (C) Multiple alignment of ryncolins with homologous proteins. The γ-γ dimer interface (dark gray background), the calcium binding region (light gray background), and the polymerization site (dark background) are indicated.

Figure 7. Sequence coverage of the proteins identified in the venom of C. rynchops. In each panel, the domain structure of the protein is shown at the top. Black bars refer to the part of the sequence covered by tandem mass spectrometry, while blue bars indicate individual tryptic peptides of the corresponding proteins. In addition, red bars indicate tryptic fragments that are too short to be detected by the experimental approach and green bars refer to tryptic fragments that are larger than 3800 Da.
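The coverage reasoning above (tryptic fragments that fall outside a detectable size window, and coverage recomputed without the prodomain) can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The function names, the toy sequence, and the 6-residue "prodomain" are hypothetical. The 10-residue lower cutoff and the treatment of masses as neutral monoisotopic values are assumptions. Only the 3800 Da upper limit is taken from the Figure 7 caption, and the peptide GDPGPQGPPGIR from the text.

```python
import re

# Standard monoisotopic residue masses (Da); one water is added per peptide.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056

MIN_LEN = 10       # assumed lower length cutoff (residues)
MAX_MASS = 3800.0  # upper mass cutoff (Da), as stated in the Figure 7 caption


def tryptic_digest(seq):
    """Cleave after K or R unless the next residue is P (no missed cleavages).

    Splitting on a zero-width pattern requires Python 3.7 or later.
    """
    return [p for p in re.split(r"(?<=[KR])(?!P)", seq) if p]


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(MONO[aa] for aa in peptide) + WATER


def classify(peptide):
    """Label a fragment as too short, too large, or within the assumed window."""
    if len(peptide) < MIN_LEN:
        return "too short"
    if peptide_mass(peptide) > MAX_MASS:
        return "too large"
    return "detectable"


def coverage(seq, peptides, region_start=0):
    """Fraction of residues from region_start onward covered by the peptides."""
    covered = [False] * len(seq)
    for pep in peptides:
        pos = seq.find(pep)
        while pos != -1:
            for i in range(pos, pos + len(pep)):
                covered[i] = True
            pos = seq.find(pep, pos + 1)
    region = covered[region_start:]
    return sum(region) / len(region)


# Hypothetical toy precursor: 6-residue "prodomain" + a ryncolin 2 peptide + "AK".
toy = "MKTLLR" + "GDPGPQGPPGIR" + "AK"
identified = ["GDPGPQGPPGIR"]

for fragment in tryptic_digest(toy):
    print(fragment, classify(fragment))
print("coverage, full precursor:", coverage(toy, identified))
print("coverage, prodomain excluded:", coverage(toy, identified, region_start=6))
```

In this toy case, excluding the undetected N-terminal region raises the computed coverage, the same effect the text describes for the metalloproteinase prodomain. Real spectra report protonated ions and modified residues such as hydroxyproline, which this sketch deliberately ignores.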

Conclusions

Colubridae is a paraphyletic family, and Homalopsidae is generally recognized as a monophyletic clade within Colubridae.46-48 The Homalopsidae, a basal taxon, may represent an adaptive radiation that occurred during the early evolution of colubroid snakes in Southeast Asia.49 This is the first detailed description of the venom composition of a snake from the Homalopsidae family. The transcriptomic and proteomic studies of the venom composition of C. rynchops indicate that its venom contains five proteins belonging to four protein families. Among them, two proteins, ryncolin 1 and ryncolin 2, belong to a new family of venom proteins (veficolins). We predict that their activity will be to induce platelet aggregation and/or initiate complement activation. Experiments to verify this assumption are currently underway.

Supporting Information Available: Supplementary Table 1, sequences of the proteins of the venom gland of C. rynchops identified by tandem mass spectrometry. Supplementary Figure 1, gene-specific amplifications of (A) metalloproteinase and (B) ryncolins. Supplementary Figure 2, tandem mass spectra of ryncolin 1 and ryncolin 2. This material is available free of charge via the Internet at http://pubs.acs.org.

Acknowledgment. This work was supported by a grant to R. M. Kini from the Biomedical Research Council, Agency for Science, Technology and Research, Singapore.

References

(1) Lukas, R. J.; Morimoto, H.; Hanley, M. R.; Bennett, E. L. Radiolabeled alpha-bungarotoxin derivatives: kinetic interaction with nicotinic acetylcholine receptors. Biochemistry 1981, 20 (26), 7373–7378.
(2) Servent, D.; Winckler-Dietrich, V.; Hu, H. Y.; Kessler, P.; Drevet, P.; Bertrand, D.; Menez, A. Only snake curaremimetic toxins with a fifth disulfide bond have high affinity for the neuronal alpha7 nicotinic receptor. J. Biol. Chem. 1997, 272 (39), 24279–24286.
(3) Vogel, C. W. The complement-activating protein of cobra venom. In Handbook of Natural Toxins; Tu, A. T., Ed.; Marcel Dekker Ltd.: New York, 1991; pp 147–188.
(4) Menez, A.; Gillet, D.; Grishin, E. Toxins: threats and benefits. In Recent Research Developments on Toxins from Bacteria and Other Organisms; Gillet, D.; Johannes, L., Eds.; Research Signpost: Trivandrum, India, 2005.
(5) Ferreira, S. H.; Vane, J. R. The disappearance of bradykinin and eledoisin in the circulation and vascular beds of the cat. Br. J. Pharmacol. Chemother. 1967, 30 (2), 417–424.
(6) Opie, L. H.; Kowolik, H. The discovery of captopril: from large animals to small molecules. Cardiovasc. Res. 1995, 30 (1), 18–25.
(7) Cox, M. J.; van Dijk, P. P.; Nabhitabhata, J.; Thirakhupt, K. A Photographic Guide to Snakes and Other Reptiles of Peninsular Malaysia; New Holland Publishers Ltd.: Singapore and Thailand, 1998.
(8) Lawson, R.; Slowinski, J. B.; Crother, B. I.; Burbrink, F. T. Phylogeny of the Colubroidea (Serpentes): new evidence from mitochondrial and nuclear genes. Mol. Phylogenet. Evol. 2005, 37 (2), 581–601.
(9) Vidal, N.; Delmas, A. S.; David, P.; Cruaud, C.; Couloux, A.; Hedges, S. B. The phylogeny and classification of caenophidian snakes inferred from seven nuclear protein-coding genes. C. R. Biol. 2007, 330 (2), 182–187.
(10) Fry, B. G.; Scheib, H.; van der Weerd, L.; Young, B.; McNaughtan, J.; Ramjan, S. F.; Vidal, N.; Poelmann, R. E.; Norman, J. A. Evolution of an arsenal: structural and functional diversification of the venom system in the advanced snakes (Caenophidia). Mol. Cell. Proteomics 2008, 7 (2), 215–246.
(11) Moura-da-Silva, A. M.; Butera, D.; Tanjoni, I. Importance of snake venom metalloproteinases in cell biology: effects on platelets, inflammatory and endothelial cells. Curr. Pharm. Des. 2007, 13 (28), 2893–2905.
(12) Fox, J. W.; Serrano, S. M. Insights into and speculations about snake venom metalloproteinase (SVMP) synthesis, folding and disulfide bond formation and their contribution to venom complexity. FEBS J. 2008, 275 (12), 3016–3030.
(13) Kini, R. M.; Evans, H. J. Structural domains in venom proteins: evidence that metalloproteinases and nonenzymatic platelet aggregation inhibitors (disintegrins) from snake venoms are derived by proteolysis from a common precursor. Toxicon 1992, 30 (3), 265–293.
(14) Kamiguti, A. S.; Gallagher, P.; Marcinkiewicz, C.; Theakston, R. D.; Zuzel, M.; Fox, J. W. Identification of sites in the cysteine-rich domain of the class P-III snake venom metalloproteinases responsible for inhibition of platelet function. FEBS Lett. 2003, 549 (13), 129–134.
(15) Moura-da-Silva, A. M.; Ramos, O. H.; Baldo, C.; Niland, S.; Hansen, U.; Ventura, J. S.; Furlan, S.; Butera, D.; Della-Casa, M. S.; Tanjoni, I.; Clissa, P. B.; Fernandes, I.; Chudzinski-Tavassi, A. M.; Eble, J. A. Collagen binding is a key factor for the hemorrhagic activity of snake venom metalloproteinases. Biochimie 2008, 90 (3), 484–492.


(16) Costa, E. P.; Clissa, P. B.; Teixeira, C. F.; Moura-da-Silva, A. M. Importance of metalloproteinases and macrophages in viper snake envenomation-induced local inflammation. Inflammation 2002, 26 (1), 13–17.
(17) Laing, G. D.; Moura-da-Silva, A. M. Jararhagin and its multiple effects on hemostasis. Toxicon 2005, 45 (8), 987–996.
(18) Kierszenbaum, A. L.; Lea, O.; Petrusz, P.; French, F. S.; Tres, L. L. Isolation, culture, and immunocytochemical characterization of epididymal epithelial cells from pubertal and adult rats. Proc. Natl. Acad. Sci. U.S.A. 1981, 78 (3), 1675–1679.
(19) Yamazaki, Y.; Morita, T. Structure and function of snake venom cysteine-rich secretory proteins. Toxicon 2004, 44 (3), 227–231.
(20) Yamazaki, Y.; Koike, H.; Sugiyama, Y.; Motoyoshi, K.; Wada, T.; Hishinuma, S.; Mita, M.; Morita, T. Cloning and characterization of novel snake venom proteins that block smooth muscle contraction. Eur. J. Biochem. 2002, 269 (11), 2708–2715.
(21) Nobile, M.; Magnelli, V.; Lagostena, L.; Mochca-Morales, J.; Possani, L. D.; Prestipino, G. The toxin helothermine affects potassium currents in newborn rat cerebellar granule cells. J. Membr. Biol. 1994, 139 (1), 49–55.
(22) Nobile, M.; Noceti, F.; Prestipino, G.; Possani, L. D. Helothermine, a lizard venom toxin, inhibits calcium current in cerebellar granules. Exp. Brain Res. 1996, 110 (1), 15–20.
(23) Roberts, L. Finding DNA sequencing errors. Science 1991, 252 (5010), 1255–1256.
(24) States, D. J.; Botstein, D. Molecular sequence accuracy and the analysis of protein coding regions. Proc. Natl. Acad. Sci. U.S.A. 1991, 88 (13), 5518–5522.
(25) Fry, B. G.; Vidal, N.; Norman, J. A.; Vonk, F. J.; Scheib, H.; Ramjan, S. F.; Kuruppu, S.; Fung, K.; Hedges, S. B.; Richardson, M. K.; Hodgson, W. C.; Ignjatovic, V.; Summerhayes, R.; Kochva, E. Early evolution of the venom system in lizards and snakes. Nature 2006, 439 (7076), 584–588.
(26) Shikamoto, Y.; Suto, K.; Yamazaki, Y.; Morita, T.; Mizuno, H. Crystal structure of a CRISP family Ca2+-channel blocker derived from snake venom. J. Mol. Biol. 2005, 350 (4), 735–743.
(27) Clemetson, K. J.; Morita, T.; Kini, R. M. Classification and nomenclature of snake venom C-type lectins and related proteins. Toxicon 2009, 54 (1), 83.
(28) Doley, R.; Kini, R. M. Protein complexes in snake venom. Cell. Mol. Life Sci. 2009, 66 (17), 2851–2871.
(29) Gartner, T. K.; Agin, P. P. Plasma fibronectin binds glucose. Biochem. Biophys. Res. Commun. 1980, 96 (4), 1747–1754.
(30) Lu, Q.; Navdaev, A.; Clemetson, J. M.; Clemetson, K. J. Snake venom C-type lectins interacting with platelet receptors. Structure-function relationships and effects on haemostasis. Toxicon 2005, 45 (8), 1089–1098.
(31) Giga, Y.; Ikai, A.; Takahashi, K. The complete amino acid sequence of echinoidin, a lectin from the coelomic fluid of the sea urchin Anthocidaris crassispina. Homologies with mammalian and insect lectins. J. Biol. Chem. 1987, 262 (13), 6197–6203.
(32) Drickamer, K. Two distinct classes of carbohydrate-recognition domains in animal lectins. J. Biol. Chem. 1988, 263 (20), 9557–9560.
(33) Hirabayashi, J.; Kusunoki, T.; Kasai, K. Complete primary structure of a galactose-specific lectin from the venom of the rattlesnake Crotalus atrox. Homologies with Ca2(+)-dependent-type lectins. J. Biol. Chem. 1991, 266 (4), 2320–2326.
(34) Mizuno, H.; Fujimoto, Z.; Koizumi, M.; Kano, H.; Atoda, H.; Morita, T. Structure of coagulation factors IX/X-binding protein, a heterodimer of C-type lectin domains. Nat. Struct. Biol. 1997, 4 (6), 438–441.
(35) Wang, R.; Kini, R. M.; Chung, M. C. Rhodocetin, a novel platelet aggregation inhibitor from the venom of Calloselasma rhodostoma (Malayan pit viper): synergistic and noncovalent interaction between its subunits. Biochemistry 1999, 38 (23), 7584–7593.
(36) Paaventhan, P.; Kong, C.; Joseph, J. S.; Chung, M. C.; Kolatkar, P. R. Structure of rhodocetin reveals noncovalently bound heterodimer interface. Protein Sci. 2005, 14 (1), 169–175.
(37) Kakinuma, Y.; Endo, Y.; Takahashi, M.; Nakata, M.; Matsushita, M.; Takenoshita, S.; Fujita, T. Molecular cloning and characterization of novel ficolins from Xenopus laevis. Immunogenetics 2003, 55 (1), 29–37.
(38) Ichijo, H.; Hellman, U.; Wernstedt, C.; Gonez, L. J.; Claesson-Welsh, L.; Heldin, C. H.; Miyazono, K. Molecular cloning and characterization of ficolin, a multimeric protein with fibrinogen- and collagen-like domains. J. Biol. Chem. 1993, 268 (19), 14505–14513.
(39) Runza, V. L.; Schwaeble, W.; Mannel, D. N. Ficolins: novel pattern recognition molecules of the innate immune response. Immunobiology 2008, 213 (3-4), 297–306.


(40) Teng, C. M.; Ko, F. N.; Tsai, I. H.; Hung, M. L.; Huang, T. F. Trimucytin: a collagen-like aggregating inducer isolated from Trimeresurus mucrosquamatus snake venom. Thromb. Haemost. 1993, 69 (3), 286–292.
(41) Yee, V. C.; Pratt, K. P.; Cote, H. C.; Trong, I. L.; Chung, D. W.; Davie, E. W.; Stenkamp, R. E.; Teller, D. C. Crystal structure of a 30 kDa C-terminal fragment from the gamma chain of human fibrinogen. Structure 1997, 5 (1), 125–138.
(42) Spraggon, G.; Everse, S. J.; Doolittle, R. F. Crystal structures of fragment D from human fibrinogen and its crosslinked counterpart from fibrin. Nature 1997, 389 (6650), 455–462.
(43) Cominetti, M. R.; Ribeiro, J. U.; Fox, J. W.; Selistre-de-Araujo, H. S. BaG, a new dimeric metalloproteinase/disintegrin from the Bothrops alternatus snake venom that interacts with alpha5beta1 integrin. Arch. Biochem. Biophys. 2003, 416 (2), 171–179.
(44) Valente, R. H.; Guimaraes, P. R.; Junqueira, M.; Neves-Ferreira, A. G.; Soares, M. R.; Chapeaurouge, A.; Trugilho, M. R.; Leon, I. R.; Rocha, S. L.; Oliveira-Carvalho, A. L.; Wermelinger, L. S.; Dutra, D. L.; Leao, L. I.; Junqueira-de-Azevedo, I. L.; Ho, P. L.; Zingali, R. B.; Perales, J.; Domont, G. B. Bothrops insularis venomics: A proteomic analysis supported by transcriptomic-generated sequence data. J. Proteomics 2009, 72 (2), 241–255.
(45) Hulmes, D. J. The collagen superfamily--diverse structures and assemblies. Essays Biochem. 1992, 27, 49–67.
(46) Greene, H. W. Snakes: The Evolution of Mystery in Nature; University of California Press: Berkeley, CA, 1997.
(47) Pough, F. H.; Andrews, R. M.; Cadle, J. E.; Crump, M. L.; Savitsky, A. H.; Wells, K. D. Herpetology, 3rd ed.; Benjamin Cummings: Upper Saddle River, NJ, 2003.
(48) Zug, G. R.; Vitt, L. J.; Caldwell, J. P. Herpetology: An Introductory Biology of Amphibians and Reptiles, 2nd ed.; Academic Press: San Diego, CA, 2001.
(49) Voris, H. K.; Alfaro, M. E.; Karns, D. R.; Starnes, G. L.; Thompson, E.; Murphy, J. C. Phylogenetic relationships of the oriental-australian rear-fanged water snakes (Colubridae: Homalopsinae) based on mitochondrial DNA sequences. Copeia 2002, (4), 906–915.

PR901044X

Journal of Proteome Research • Vol. 9, No. 4, 2010 1893