Molecular Diversification of Peptide Toxins from the Tarantula

Molecular Diversification of Peptide Toxins from the Tarantula Haplopelma hainanum (Ornithoctonus hainana) Venom Based on Transcriptomic, Peptidomic, and Genomic Analyses

Xing Tang,# Yongqun Zhang,# Weijun Hu, Dehong Xu, Huai Tao, Xiaoxu Yang, Yan Li, Liping Jiang, and Songping Liang*

The Key Laboratory of Protein Chemistry and Developmental Biology of Ministry of Education, College of Life Sciences, Hunan Normal University, Changsha 410081, China

Received January 1, 2010

The tarantula Haplopelma hainanum (Ornithoctonus hainana) is a highly venomous spider found widely in the hilly areas of Hainan province in southern China. Its venom contains a variety of toxic components with different pharmacological properties. In the present study, we used a venomic strategy for high-throughput identification of tarantula-venom peptides from H. hainanum. This strategy includes three different approaches: (i) transcriptomics, that is, EST-based cloning and PCR-based cloning plus DNA sequencing; (ii) peptidomics, that is, off-line multidimensional liquid chromatography coupled with mass spectrometry (MDLC-MS) plus peptide sequencing (direct Edman sequencing and bottom-up mass spectrometric sequencing); (iii) genomics, that is, genomic DNA cloning plus DNA sequencing. About 420 peptide toxins were detected by mass spectrometry, and 272 peptide precursors were deduced from cDNA and genomic DNA sequences. After redundancy removal, 192 mature sequences were identified by the three approaches. This is the largest number of peptide toxin sequences identified from a single spider species so far. On the basis of precursor sequence identity, peptide toxins from the tarantula H. hainanum venom can be classified into 11 superfamilies (and related families). Our results suggest that gene duplication and focal hypermutation may be responsible for the enormous molecular diversity of spider peptide toxins. The current work is an initial overview of tarantula-venom peptides based on parallel transcriptomic, peptidomic, and genomic analyses. It is hoped that this work will also provide an effective guide for high-throughput identification of peptide toxins from other spider species, especially tarantula species.

Keywords: Haplopelma hainanum • venomic strategy • high-throughput • tarantula-venom • transcriptomics • peptidomics • genomics • superfamilies • gene duplication • hypermutation • molecular diversity

* Corresponding author: Professor Songping Liang. Tel: +86-731-88872556. Fax: +86-731-88861304. E-mail: [email protected]. # These authors contributed equally to this paper.

Journal of Proteome Research 2010, 9, 2550–2564. Published on Web 03/02/2010. 10.1021/pr1000016. © 2010 American Chemical Society

Introduction

Spider venoms are complex mixtures of low molecular mass organic molecules (10 kDa).1 A vast majority of spider toxins are polypeptide toxins with 3-5 disulfide bonds that display various structures and biological activities. Spider peptide toxins have proved to be powerful tools for the study of voltage-sensitive and ligand-gated ion channels and have potential applications as novel pharmaceutical drugs.2-4 The largest spiders (tarantulas, family Theraphosidae) belong to the mygalomorphs, and they possess two fangs that inject venom into prey tissues.3-6 Venoms from tarantulas are heterogeneous, and their specific composition varies significantly from species to species. Previous studies reveal that tarantula venoms contain peptide toxins as major constituents. By 2009, 928 tarantula species (in 116 genera) had been reported (http://research.amnh.org/iz/spiders/catalog/COUNTS.html), and based on a conservative estimate of ca. 50 peptides per venom,6 it has been calculated that there are approximately 46 400 molecules from the entire Theraphosidae group. Tarantula venoms provide a good model to study toxin selectivity, structure-activity relationships, and the molecular evolution of peptide toxins.6 However, to date, only ca. 200 tarantula-venom peptides in total have been reported (http://www.arachnoserver.org/mainMenu.html)7 and peptide profiles have been reported for only 5 tarantula species.8-14 Indeed, only the venoms of the Chinese tarantulas Chilobrachys guangxiensis (Chilobrachys jingzhao) and Haplopelma schmidti (Ornithoctonus huwena) have been systematically investigated by high-throughput methods for peptide toxin identification.9-12,15 The Chinese tarantula Haplopelma hainanum, which is similar to the spider H. schmidti in morphology, is a venomous spider distributed in the hilly areas of Hainan province in southern China.16 In our previous work, several neurotoxins that are the major components of the venom from H. hainanum were purified and characterized.17-22 These results provide a tantalizing glimpse of the pharmaceutical components in this venom and stir up new interest in systematic research on venom peptides. In this study, we describe the large-scale identification and analysis of peptide toxins from H. hainanum venom using a venomic strategy that includes the following three approaches: (i) transcriptomics, namely, EST-based cloning and PCR-based cloning plus DNA sequencing; (ii) peptidomics, namely, off-line MDLC-MS plus peptide sequencing (direct Edman sequencing and bottom-up sequencing); and (iii) genomics, namely, genomic DNA cloning plus DNA sequencing. About 420 peptide toxins were detected by mass spectrometry, 272 peptide precursors were deduced from cDNA and genomic DNA sequences, and 192 nonredundant mature sequences were identified by the three approaches. On the basis of precursor sequence identity, peptide toxins from the tarantula H. hainanum venom can be classified into 11 superfamilies (and related families). These results also show the enormous molecular diversity of peptide toxins from the tarantula venom.
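The figure of approximately 46 400 molecules quoted in the Introduction is a simple product of the species count and the assumed number of peptides per venom. The short sketch below reproduces that arithmetic; the function name is illustrative only, and the inputs (928 species, ca. 50 peptides per venom) are the values stated in the text.

```python
def estimate_peptide_pool(species_count: int, peptides_per_venom: int) -> int:
    """Rough size of the peptide pool: every species contributes the same number of peptides."""
    return species_count * peptides_per_venom


# Values quoted in the Introduction: 928 tarantula species, ca. 50 peptides per venom.
total_peptides = estimate_peptide_pool(928, 50)
print(total_peptides)

# Share of that pool covered by the ca. 200 tarantula-venom peptides reported to date.
reported_fraction = 200 / total_peptides
print(f"{reported_fraction:.2%}")
```

Under these assumptions the ca. 200 reported peptides cover well under 1% of the estimated pool, which is the gap the venomic strategy is meant to narrow.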

Materials and Methods

Materials. Trizol Reagent was purchased from Invitrogen. The Creator SMART cDNA Library Construction Kit and pMD18-T vector were from Takara. Sephadex G-75 and Coomassie Blue dye were purchased from Amersham Pharmacia-Biotech (Uppsala, Sweden). Dithiothreitol (DTT), iodoacetamide, and trifluoroacetic acid (TFA) were obtained from Sigma (St. Louis, MO). The Wizard SV Genomic DNA purification system and trypsin were from Promega. Tricine and SDS were purchased from Amresco (Solon, OH). Acetonitrile (ACN) was a domestic product (chromatographic grade). Deionized water was prepared with a tandem Milli-Q system and used for the preparation of all buffers. Other chemicals were of analytical grade. Adult female H. hainanum spiders were collected in Hainan province of China.

cDNA Library Construction, DNA Sequencing, and Sequence Analysis. The venom glands of 20 adult female spiders were dissected and immediately homogenized in liquid nitrogen. The total RNA was extracted with Trizol Reagent, and 1.0 µg of total RNA was used for library construction. The full-length cDNA library was made following the instructions for the Creator SMART cDNA Library Construction Kit. Colony PCR was performed with the M13 forward and reverse primers to rapidly screen recombinant clones. The PCR products were resolved by agarose gel electrophoresis to determine the size of each product. Then, 1049 clones with cDNA inserts >400 bp were sequenced using an automated ABI PRISM 3700 sequencer (Perkin-Elmer). The clustering and assembly of ESTs were carried out with EGassembler (http://egassembler.hgc.jp/). cDNA sequences were translated and processed with the MEGA software.23 According to sequence identity, peptide toxin precursors can be classified into different superfamilies (and related families).

Screening of cDNA. Specific primer pairs were designed according to the sequences of the 5′ and 3′ untranslated regions (UTRs) in toxin superfamilies (and related families) from the cDNA library of H. hainanum venom glands. The sequences of the specific primer pairs are listed in Supplementary Table S1. The cDNA from the venom gland cDNA library of H. hainanum was used as the template in the PCR reaction. The DNA polymerase was Advantage 2 Polymerase Mix from Clontech. The PCR conditions were: preincubation at 94 °C for 5 min, followed by 30 cycles of 1 min at 94 °C, 1 min at 55 °C, and 1 min at 72 °C, and then a final extension step at 72 °C for 10 min. The resulting PCR fragments were purified and ligated into the pMD18-T vector. After transformation into Escherichia coli DH10B, 50 clones per superfamily were sequenced using an automated ABI PRISM 3700 sequencer.

Venom Sample Preparation. The venom from adult female H. hainanum spiders was collected by electrical milking as described previously by our laboratory24 and immediately freeze-dried.

Gel Filtration. A venom sample (420 mg in total) of H. hainanum was solubilized in 10 mL of 50 mM NH4HCO3 (pH 6.8), and 42 mg aliquots (10 aliquots in total) were loaded onto a Sephadex G-75 column (10 × 600 mm) pre-equilibrated with 50 mM NH4HCO3 (pH 6.8). Elution of venom was carried out using the same buffer with a flow rate of 1.0 mL/min at room temperature (25 °C). The eluate was monitored at 215 nm. The eluted peaks were further analyzed by Tricine-SDS-PAGE to check the molecular weight (MW) range. After electrophoresis, the gels were stained with Coomassie Brilliant Blue G-250. On the basis of the results from Tricine-SDS-PAGE, the peak containing peptides with MW less than 10 kDa was pooled for HPLC separation.

Separation of Venom Peptides by HPLC. The peak containing peptides (MW < 10 kDa) was loaded onto a self-assembled column (Acell plus CM cation-exchange media, 10 × 200 mm) initially equilibrated with 0.02 M sodium phosphate buffer (pH 6.25). The column was eluted using a gradient of 0-75% of 1.0 M sodium chloride (pH 6.25) over 50 min at a constant flow rate of 2.5 mL/min. The fractions eluted from cation-exchange HPLC were then applied to an analytical Phenomenex C18 reverse phase HPLC (RP-HPLC) column (100 Å, 250 × 4.6 mm) and eluted at a flow rate of 1.0 mL/min using a gradient of 5-20% buffer B (0.1% (v/v) TFA in ACN) over 10 min, followed by a gradient of 20-45% buffer B over 35 min (buffer A was 0.1% (v/v) TFA in water).
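The two-step RP-HPLC gradient described above can be read as a piecewise-linear function of time. The sketch below is only an illustration of that reading: it assumes the gradient starts at injection (t = 0) with no hold or dwell time, which the text does not state, and the names `RP_HPLC_GRADIENT` and `percent_b` are hypothetical.

```python
from typing import List, Tuple

# Each segment: (start_min, end_min, start_percent_B, end_percent_B).
# Values from the text: 5-20% buffer B over 10 min, then 20-45% buffer B over 35 min.
# Assumption: the first segment begins at t = 0 with no isocratic hold.
RP_HPLC_GRADIENT: List[Tuple[float, float, float, float]] = [
    (0.0, 10.0, 5.0, 20.0),
    (10.0, 45.0, 20.0, 45.0),
]


def percent_b(t_min: float, gradient=RP_HPLC_GRADIENT) -> float:
    """Return % buffer B at time t_min by linear interpolation within its segment."""
    for start, end, b_start, b_end in gradient:
        if start <= t_min <= end:
            fraction = (t_min - start) / (end - start)
            return b_start + (b_end - b_start) * fraction
    raise ValueError(f"time {t_min} min is outside the programmed gradient")


# Composition at a few time points; % buffer A is simply 100 - % buffer B.
for t in (0.0, 5.0, 10.0, 27.5, 45.0):
    b = percent_b(t)
    print(f"t = {t:5.1f} min: {b:5.1f}% B, {100.0 - b:5.1f}% A")
```

A peptide's retention time can be mapped to an approximate ACN percentage in this way, which is convenient when comparing fractions across runs; a real instrument adds a system-dependent delay that this sketch ignores.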
The effluents were monitored at 215 nm for both cation-exchange HPLC and RP-HPLC.

MALDI-TOF-TOF MS Analysis. The peptide eluates from the first RP-HPLC separation were collected and analyzed by MALDI-TOF-TOF mass spectrometry (UltraFlex I, Bruker Daltonics). A 1 µL aliquot of each eluate was spotted onto a 384-well target plate along with an equal volume of a matrix solution containing 20 mg/mL α-cyano-4-hydroxycinnamic acid (CHCA), 50% ACN, and 0.1% TFA. The mixture was allowed to dry at room temperature. Calibration of the instrument was performed externally with peptide calibration standard II (Bruker, Germany). Mass spectrometry was performed using an acceleration voltage of 25 kV.

Edman Sequencing. After multiple RP-HPLC separations, native peptides with a purity of more than 90% (on the basis of HPLC and MS analysis) were selected and submitted to automated N-terminal sequencing on a Procise 491A protein sequencer (Applied Biosystems).

Bottom-up Sequencing. Native peptides with high purity were also sequenced by MS/MS. About 1 pmol of freeze-dried sample was dissolved in 10 µL of 25 mM NH4HCO3 and then boiled for 5 min. The sample solutions were reduced with 1 µL of 0.1 M DTT in 25 mM NH4HCO3 for 1 h at 57 °C and then alkylated with 1 µL of 0.55 M iodoacetamide in 25 mM NH4HCO3 in the dark for 45 min at room temperature, followed by digestion with 1 µL of trypsin (0.4 µg/µL) for 18 h at 37 °C. The digestions were stopped by acidification and freeze-dried. Tryptic peptides were then redissolved in 3-5 µL of 0.1% (v/v) TFA in water. The digested peptides were analyzed by an UltraFlex I MALDI-TOF-TOF mass spectrometer operated in reflector mode with full automation. The spotting process was the same as described above in the section MALDI-TOF-TOF MS Analysis. Calibration of the instrument was performed externally with peptide calibration standard II. An accelerating voltage of 25 kV was used for peptide mass fingerprinting (PMF). The peaks with S/N ≥ 5 and resolution ≥ 2500 were selected and used for LIFT-TOF-TOF MS/MS from the same target. LIFT spectra were interpreted manually or using BioTools 3.0 from Bruker Daltonics.

Figure 1. Schematic diagram of the combination strategy for high-throughput identification of peptide toxins from H. hainanum venom.

Preparation of Total Genomic DNA and PCR Amplification. Total genomic DNA was extracted from the muscle tissue of adult female H. hainanum spiders with the Wizard SV Genomic DNA purification system and then used as the template for PCR. The specific primer pairs used for screening of cDNA (Supplementary Table S1) and Advantage 2 Polymerase Mix were also used in genomic DNA PCR. PCR amplification was performed under the following conditions: preincubation at 94 °C for 5 min, followed by 30 cycles of 1 min at 94 °C, 1 min at 53 °C, and 2 min at 72 °C, and then a final extension step at 72 °C for 10 min. If the specific product was not generated by a single PCR, 1 µL of this PCR product was used as the template for a second round of PCR. The subsequent steps were the same as described above in the section Screening of cDNA.

Sequence, Structure, and Evolutionary Analyses. All cDNA and genomic DNA sequences were submitted to the GenBank database of NCBI (http://www.ncbi.nlm.nih.gov/, accession numbers GU292853-GU293141). The signal peptide was predicted with the SignalP 3.0 program (http://www.cbs.dtu.dk/services/SignalP/). The propeptide cleavage site was confirmed by the N-terminal sequence of the mature toxin.
All peptide sequences from the precursor or mature region of hainantoxins (HNTXs) were searched in public databases (http://blast.ncbi.nlm.nih.gov/Blast.cgi). All precursor sequences were aligned using ClustalW2 (http://www.ebi.ac.uk/Tools/clustalw2/index.html). The resulting alignment was imported into MEGA software23 to construct a phylogenetic tree by the neighbor-joining method, and bootstrap values were estimated from 500 replicates. The structure coordinates of huwentoxins (HWTXs) and HNTXs (1I25 of HWTX-II, 1Y29 of HWTX-X, 2JOT of HWTX-XI, 1NIX of HNTX-I, 2JTB of HNTX-III, and 1NIY of HNTX-IV) were downloaded from the PDB database.25 The predicted structures were modeled based on template structures (HNTX-II from HWTX-II; HNTX-X from HWTX-X; HNTX-XI from HWTX-XI). Molecular modeling was performed via submission to the fully automated protein structure homology-modeling server (http://swissmodel.expasy.org/).26 Structure visualization was done with VMD software.27
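Bottom-up sequencing as described in the Methods depends on matching measured m/z values to masses computed from candidate sequences. As a minimal sketch (not the authors' software, which was BioTools 3.0 or manual interpretation), the code below computes the singly protonated monoisotopic mass and the y-ion series of a tryptic peptide. It assumes that every cysteine is carbamidomethylated by iodoacetamide (+57.02146 Da) and uses standard monoisotopic residue masses; the function names are illustrative. The example sequence WYLGGCSQDGDCCK and the parent ion m/z 1705.644 are those given in the caption of Figure 4.

```python
# Standard monoisotopic residue masses in Da.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056            # H2O added for the free N- and C-termini
PROTON = 1.00728            # charge carrier for [M + H]+
CARBAMIDOMETHYL = 57.02146  # assumed modification on every Cys after iodoacetamide


def residue_mass(aa: str) -> float:
    """Residue mass, with Cys treated as carbamidomethyl-Cys (an assumption)."""
    return MONO[aa] + (CARBAMIDOMETHYL if aa == "C" else 0.0)


def mh_plus(sequence: str) -> float:
    """Monoisotopic m/z of the singly protonated peptide, [M + H]+."""
    return sum(residue_mass(aa) for aa in sequence) + WATER + PROTON


def y_ions(sequence: str) -> list:
    """Singly charged y-ion m/z values, y1 first (C-terminal fragments)."""
    masses, running = [], WATER + PROTON
    for aa in reversed(sequence):
        running += residue_mass(aa)
        masses.append(running)
    return masses


peptide = "WYLGGCSQDGDCCK"
calculated = mh_plus(peptide)
observed = 1705.644  # parent ion reported in Figure 4
print(f"calculated [M+H]+ = {calculated:.3f}, observed = {observed:.3f}")
print(f"difference = {abs(calculated - observed):.3f} Da")
```

The calculated value agrees with the reported parent ion to within a few millidaltons, which supports the carbamidomethyl assumption for this fragment. Note that leucine and isoleucine share one mass in the table, which is why the caption of Figure 4 relies on Edman sequencing to distinguish them.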

Results

Venomic Strategy: Parallel Transcriptomic, Peptidomic, and Genomic Analyses of Peptide Toxins from the Tarantula H. hainanum Venom. Spider peptide toxins can be identified at three levels: the transcriptomic, peptidomic, and genomic levels. However, no single approach is sufficient to identify all peptide toxins in a given spider species.9-12,15 To obtain an overall identification of peptide toxins from the tarantula H. hainanum venom, we used a venomic strategy in this study. As shown in Figure 1, this strategy consists of three different approaches: (i) transcriptomics, that is, EST-based cloning and PCR-based cloning plus DNA sequencing; (ii) peptidomics, that is, off-line MDLC-MS plus peptide sequencing (direct Edman sequencing and bottom-up sequencing); (iii) genomics, that is, genomic DNA cloning plus DNA sequencing.

Transcriptomic Analysis of Venom Peptides from Chinese Tarantula H. hainanum. The original H. hainanum venom gland library contained 3.3 × 10^6 independent clones according to the supplier's instructions. After discarding the poor-quality sequences, 88 peptide toxins were deduced from the high-quality ESTs (Figure 2). The length of the complete cDNAs encoding peptide toxins ranges from 0.46 to 0.6 kb, and the length of the deduced precursors ranges from 65 to 117 amino acid residues. In addition, PCR-based cloning was used to generate transcripts encoding peptide toxins, as it has the advantage of capturing low-abundance transcripts.28 A further 30 novel peptide toxins were obtained by PCR-based cloning (Figure 2) using specific primer pairs (Supplementary Table S1). In short, 207 precursors encoding 118 peptide toxins were characterized by the transcriptomic approach (GenBank accession numbers GU292853-GU293059).

Figure 2. Sequence alignment of representative venom peptide precursors in superfamilies A-K from H. hainanum. (T1), peptide precursors identified by EST-based cloning at the transcriptomic level; (T2), novel peptide precursors identified by PCR-based cloning at the transcriptomic level; (P), mature peptides identified by the peptidomic approach; (G), peptide precursors identified by the genomic approach. The peptides from other spider species, which have highly homologous sequences or an identical cysteine arrangement with HNTXs in all superfamilies from H. hainanum, are designated with asterisks. Signal peptides and propeptides are boxed, and cysteines of mature peptides are shaded. Rectangles denote additional C-terminal residues. Gaps (-) are introduced to optimize the sequence homology. Hydrophobic residues are shown in red, polar uncharged residues in green, basic residues in pink, and acidic residues in blue.

Peptidomic Analysis of Venom Peptides from Chinese Tarantula H. hainanum. The venom of the Chinese tarantula H. hainanum was separated into two peaks (named P1 and P2) by gel filtration (Figure 3A). On the basis of results from Tricine-SDS-PAGE (Figure 3B), the venom could be divided into two major parts: the protein components (MW > 10 kDa) and the peptide components (MW < 10 kDa). The peptide components were then pooled and subjected to HPLC separation. The venom peptides (MW < 10 kDa) from gel filtration were loaded onto a cation-exchange HPLC column, and eight fractions (named F1-F8) were present in the elution profile (Figure 3C). The fractions were further separated on an analytical C18 RP-HPLC column (Figure 3D-K). The peptide eluates from the first RP-HPLC separation were collected and analyzed by MALDI-TOF-TOF mass spectrometry. As shown in Supplementary Table S2, there are generally two or more distinct components per eluate at a given retention time (RT), and a specific component may be present in different eluates. In total, about 420 peptide toxins were detected, a vast majority of which fall in the 3000-5000 Da mass range. After multiple RP-HPLC separations, 49 venom peptides (Table 1) with high purity were selected for Edman degradation sequencing and/or bottom-up sequencing (Figure 4). Seventeen venom peptides were fully sequenced, and 32 venom peptides were only partially sequenced because of insufficient amounts.

Figure 3. Fractionation of venom peptides from H. hainanum. (A) Venom sample was solubilized in 50 mM NH4HCO3 (pH 6.8) and then loaded onto a Sephadex G-75 column (10 × 600 mm) pre-equilibrated with 50 mM NH4HCO3 (pH 6.8). Elution of venom was carried out using the same buffer with a flow rate of 1.0 mL/min at room temperature (25 °C). The eluate was monitored at 215 nm. (B) The peaks (P1 and P2) eluted from panel A were further analyzed by Tricine-SDS-PAGE to check the molecular weight (MW) range. (C) On the basis of the results from panel B, the peak (P2) containing peptides (MW < 10 kDa) was pooled and then loaded onto a self-assembled column (Acell plus CM cation-exchange media, 10 × 200 mm) initially equilibrated with 0.02 M sodium phosphate buffer (pH 6.25). The column was eluted using a gradient of 0-75% of 1.0 M sodium chloride (pH 6.25) over 50 min at a constant flow rate of 2.5 mL/min. The effluents were monitored at 215 nm. (D-K) The fractions (F1-F8) eluted from panel C were applied to an analytical Phenomenex C18 RP-HPLC column (100 Å, 250 × 4.6 mm) and eluted at a flow rate of 1.0 mL/min using a gradient of 5-20% buffer B (0.1% (v/v) TFA in ACN) over 10 min, followed by a gradient of 20-45% buffer B over 35 min (buffer A was 0.1% (v/v) TFA in water). The eluate was monitored at 215 nm.

Table 1. Full or Partial Venom Peptide Sequences from H. hainanum by Peptidomic Analysis^a,b

a RT, retention time; MW, molecular weight ([M + H+]); ..., partial sequence. b The sequence determined by Edman degradation is underlined, and the sequence determined by bottom-up is given in the gray box.

Genomic Analysis of Venom Peptides from Chinese Tarantula H. hainanum. Using the same specific primer pairs (Supplementary Table S1), 82 precursors encoding 52 peptide toxins were characterized by the genomic approach (Figure 2) (GenBank accession numbers GU293060-GU293141). Unexpectedly, no introns were found in the genomic DNA of these toxins. The result is in
agreement with those of previous reports for HWTXs from H. schmidti,15,29 which is closely related to H. hainanum.

Comprehensive Analysis of All Identified Peptide Toxins from the Tarantula H. hainanum Venom. The venom peptide (or precursor) sequences were identified by three different approaches. In total, 272 peptide precursors deduced from cDNA and genomic DNA sequences were characterized. Moreover, 192 nonredundant mature sequences were identified by the three approaches (Figure 2 and Table 1). According to precursor sequence identity, venom peptide precursors can be categorized into 11 different superfamilies (superfamilies A-K), several of which are further divided into distinct families (Figure 2). Almost all precursors from the toxin superfamilies contain a signal peptide, a propeptide, and a mature peptide. Molecular characters of the representative toxins in the superfamilies (and related families) from H. hainanum are summarized in Table 2. A phylogenetic tree was constructed with MEGA 3.1 using the neighbor-joining method (Figure 5), and the result showed that all peptide precursors originated from two different clades. Members of superfamilies A-I belong to one clade, and those of the remaining superfamilies belong to the other clade. On the basis of mature sequence identity (Table 1), peptide toxins identified by peptidomics were clustered into different superfamilies.

Table 2. Molecular Characters of the Representative Toxins in the Superfamilies (And Related Families) from the Spider H. hainanum

| superfamily | representative toxin | precursor | signal peptide | propeptide | propeptide processing signal | mature peptide | extra C-terminal residue | structure | function or sequence identity^a,b |
|---|---|---|---|---|---|---|---|---|---|
| A | HNTX-I | 83 aa | 21 aa | 27 aa | GEER | 33 aa | GK | ICK | insect Na+ channel inhibitor |
| A | HNTX-III | 83 aa | 21 aa | 27 aa | GEER | 33 aa | GK | ICK | mammalian Na+ channel inhibitor |
| A | HNTX-IV | 86 aa | 21 aa | 28 aa | GEER | 35 aa | GK | ICK | mammalian Na+ channel inhibitor |
| A | HNTX-VIII | 87 aa | 24 aa | 28 aa | QEER | 33 aa | RR | unknown | identical sequence to HWTX-III |
| B | HNTX-IX | 86 aa | 21 aa | 29 aa | AEER | 35 aa | K | unknown | mammalian and insect Ca2+ channels inhibitor |
| C | HNTX-VII | 85 aa | 21 aa | 28 aa | PQER | 33 aa | FRK | unknown | mammalian and insect Na+ channels inhibitor |
| C | HNTX-XII | 97 aa | 33 aa | 28 aa | PQER | 34 aa | GK | unknown | 54% identity to JZTX-33 |
| D | HNTX-XIII | 90 aa | 19 aa | 30 aa | TEAR | 41 aa | No | unknown | 76% identity to HmTx2 |
| E | HNTX-X | 65 aa | 20 aa | 17 aa | IQER | 28 aa | No | ICK | 96% identity to HWTX-X |
| F | HNTX-XVII | 98 aa | 40 aa | 24 aa | SKER | 34 aa | No | unknown | 91% identity to HWTX-XVIIa |
| F | HNTX-XX | 95 aa | 40 aa | 24 aa | LEQR | 31 aa | No | unknown | identical sequence to HWTX-XVIIb1 |
| G | HNTX-XIX | 117 aa | 21 aa | 54 aa | RQKR | 42 aa | No | unknown | 70% identity to HWTX-XVIa1 |
| G | HNTX-XVI | 113 aa | 21 aa | 53 aa | RQKR | 39 aa | No | unknown | identical sequence to HWTX-XVIa1 |
| H | HNTX-XI | 88 aa | 27 aa | 6 aa | HDGR | 55 aa | No | Kunitz | 69% identity to HWTX-XI |
| I | HNTX-XIV | 84 aa | 21 aa | No | No | 63 aa | No | unknown | 93% identity to HWTX-XIVa1 |
| J | HNTX-XV | 117 aa | 20 aa | 36 aa | SEER | 61 aa | No | unknown | 98% identity to HWTX-XVIIId |
| J | HNTX-XVIII | 109 aa | 18 aa | 27 aa | ETAR | 64 aa | No | unknown | 81% identity to HWTX-XVIIIb |
| K | HNTX-II | 85 aa | 22 aa | 26 aa | DEER | 37 aa | No | DDH | mammalian and insect neurotoxin |

cysteine arrangement (in row order): -C-C-CC-C-C-C-C-CC-C-C-C-C-CC-C-C-C-C-CC-C-C-C-C-CC-C-C-C-C-CC-C-C-C-C-CC-C-C-C-C-CC-C-C-C-C-CC-C-C-C-C-CC-C-C-C-C-CC-C-C-C-C-C-C-CC-C-C-C-C-C-C-CC-C-C-C-C-CC-C-C-C-C-C-C-C-C-C-CC-C-C-C-C-C-C-C-C-C-CC-C-C-C-C-C-C-CC-C-C-C-C-C-C-C-C-C-

a Sequence identity is for the mature peptide sequence only. b The matched peptide toxins with known function: HWTX-III, mammalian and insect neurotoxin; HmTx2, Kv2.1 potassium channel inhibitor; HWTX-X, mammalian Ca2+ channel blocker; HWTX-XVIIb1, mammalian Ca2+ channel inhibitor; HWTX-XVIa1, mammalian neurotoxin; HWTX-XI, trypsin inhibitor and Kv1.1 channel inhibitor.

Figure 4. MS/MS spectrum for parent ion m/z 1705.644 (from an enzymatic fragment of HNTX-IX-2). The amino acid sequence, WYLGGCSQDGDCCK, was derived from the series of y-ions. Leucine/isoleucine (isobaric amino acids) and glutamine/lysine (quasi-isobaric ions) were confirmed by Edman sequencing.

Figure 5. Phylogenetic tree of all venom peptide precursors in the HNTX superfamily from H. hainanum (A, superfamily A; B, superfamily B; C, superfamily C; D, superfamily D; E, superfamily E; F, superfamily F; G, superfamily G; H, superfamily H; I, superfamily I; J, superfamily J; K, superfamily K).

Superfamily A. Superfamily A includes four families (HNTX-I family, HNTX-III family, HNTX-IV family, HNTX-VIII family). These families exhibit a variety of biological activities, and most mature peptides have a consensus “-C-C-CC-C-C-” cysteine arrangement. Interestingly, the precursors of these families have additional residues at the C-terminus of the mature peptides, which are removed during post-translational processing. The prepropeptides of the HNTX-I, HNTX-III, and HNTX-IV families show high sequence identity; their signal peptides are particularly highly conserved. Furthermore, the precursors of the three families contain a consensus “GEER” cleavage signal for the propeptide processing enzyme and a consensus “GK”, which serves both as the extra C-terminal residues and as an amidation signal. In our previous work, the three-dimensional structures of HNTX-I, HNTX-III, and HNTX-IV (the representative toxins from the HNTX-I, HNTX-III, and HNTX-IV families, respectively) were elucidated by the 1H nuclear magnetic resonance (NMR) technique,17,20 and they display the inhibitor cystine knot (ICK) motif commonly formed by three disulfide bridges with a linkage pattern of I-IV, II-V, and III-VI (Figure 6A-C). The electrophysiological results indicate that HNTX-I selectively blocks rNav1.2/β1 and para/tipE channels expressed in Xenopus laevis oocytes and is a novel insect sodium channel inhibitor.17 Both HNTX-III and HNTX-IV affect tetrodotoxin-sensitive (TTX-S) Na+ currents in rat dorsal root ganglion neurons and are mammalian neural Na+ channel inhibitors. However, they differ in their effects on the repriming kinetics of voltage-gated sodium channels (VGSCs).18 The precursors of the HNTX-VIII family show extremely high identity with that of HWTX-III (over 87%) from H. schmidti,12 and their precursors contain the additional residues “RR” at the C-terminus of the mature peptides. HNTX-VIII is the representative toxin in the HNTX-VIII family, and its mature peptide is identical in sequence to that of HWTX-III. Previous results show that HWTX-III has a disulfide bonding pattern of I-IV, II-V, and III-VI. It reversibly paralyzes cockroaches for several
Journal of Proteome Research • Vol. 9, No. 5, 2010

hours, and can also enhance the muscular contractions elicited by stimulating the nerve of the isolated rat vas deferens.30 Superfamily B. The precursors of superfamily B exhibit extremely high sequence similarity (over 96%) with that of HWTX-V from H. schmidti,31 and their precursors have an additional residue “K” at the C-terminus of mature peptides. Electrophysiological test shows that HNTX-IX can reversibly block N-type and P/Q-type voltage-gated calcium channels (VGCCs) in rat dorsal root ganglion neurons, and it can also reversibly inhibit VGCCs in cockroach dorsal unpaired median neurons (unpublished data). Another important toxin from superfamily B is HNTX-IX-2, the mature peptide of which has the same sequence with that of HWTX-V. Toxicity assays indicate that HWTX-V can reversibly paralyze locusts and cockroaches, and causes death at high doses. It has no effect on mice by intra-abdominal, nor intracerebroventricular injection.32 Whole-cell patch-clamp configuration indicates that HWTX-V specifically inhibits high-voltage-activated calcium channels in adult cockroach dorsal unpaired median neurons

Molecular Diversification of Peptide Toxins from Tarantula

research articles

Figure 6. Three dimensional structures from six HNTXs (A, HNTX-I; B, HNTX-III; C, HNTX-IV; D, HNTX-X; E, HNTX-XI; F, HNTX-II). Richardson-style diagrams of the backbone folding of these peptide toxins were performed by VMD software. The turns and random coils are colored in blue, the β-sheet is shown in yellow, and the R-helix is indicated in purple. The disulfide bonds are shown in cyan.

while having no evident effect on voltage-gated potassium and sodium channels.33 Previous results also show that HWTX-V has three disulfide bridges with a linkage pattern of I-IV, II-V, and III-VI.

Superfamily C. This superfamily contains the HNTX-XII family and the HNTX-VII family. In the precursor of HNTX-XII, we found the dual-residue “GK” at the C-terminus of the mature peptide, which is also present in the precursors of the HNTX-I, HNTX-III, and HNTX-IV families. “GK” is therefore considered to be the extra C-terminal residues (and amidation signal) in HNTX-XII. The signal peptide and mature peptide of HNTX-VII differ in sequence from those of HNTX-XII, and their additional C-terminal residues are also dissimilar. However, their propeptides have identical length and high sequence similarity (about 86%), and their precursors contain a consensus propeptide processing signal (PQER). Electrophysiological tests show that HNTX-VII is an inhibitor of mammalian and insect Na+ channels (unpublished data).

Superfamily D. Mature peptides of superfamily D have high similarity (over 68%) to the known toxin Heteroscodratoxin-2 (HmTx2), a specific inhibitor of Kv2.1 voltage-gated K+ channels from the tarantula Heteroscodra maculata,34 which has a disulfide bonding pattern of I-IV, II-V, and III-VI. All 12 members of superfamily D were obtained only by transcriptomics, and these peptide toxins (or their isoforms) may have low abundance at the peptidomic and genomic levels. The signal peptide cleavage site is at the Ser19 residue, which differs from the conserved Ala residue in most superfamilies from H. hainanum.

Superfamily E. The precursors (or mature peptides) of superfamily E are the shortest of all the HNTXs. Their precursors share extremely high sequence identity (over 80%) with that of HWTX-X,12 which is a mammalian Ca2+ channel blocker from H. schmidti and can reversibly block N-type calcium channels in rat dorsal root ganglion neurons. The mature peptide of HWTX-X adopts an ICK structural motif with an I-IV, II-V, and III-VI disulfide pairing. Moreover, its structure contains a functional motif with a binding surface formed by the critical residue Tyr10 and several basic residues (Lys1, Lys7, Lys15, and Lys26).35 Apart from a neutral residue (Asn26) substituting for Lys26, the critical residue Tyr10 and the other three basic residues (Lys1, Lys7, and Lys15) are all present in the mature peptide of HNTX-X, the representative toxin of superfamily E. Structure modeling (Figure 6D) indicates that HNTX-X has a structure similar to that of HWTX-X (an ICK structural motif).

Superfamily F. Superfamily F contains the HNTX-XVII family and the HNTX-XX family. The prepropeptides of the two families have identical length, whereas their propeptide processing signals are different and the mature peptide of the HNTX-XX family is shorter than that of the HNTX-XVII family. HNTX-XVII and HNTX-XX are the representative toxins in superfamily F, and their mature peptides have a consensus cysteine arrangement (-C-C-CC-C-C-C-C-). In addition, there is a pair of CXC fragments in the C-region; that is, cysteines 5 and 6, as well as cysteines 7 and 8, are each separated by a single amino acid residue. This Extra Structural Motif (ESM)36 has also been found in the HWTX-XVII superfamily from H. schmidti.12,15 Furthermore, the mature peptide of HNTX-XX is identical in sequence to that of HWTX-XVIIb1, a mammalian Ca2+ channel inhibitor (unpublished data).

Superfamily G. The HNTX-XVI family and the HNTX-XIX family belong to superfamily G. The two families have an identical propeptide processing signal (RQKR), and their signal peptides are highly conserved. Nevertheless, the two families differ markedly in size: the HNTX-XIX family has only one member. The precursors of the HNTX-XVI family have high sequence similarity with those of the HWTX-XVI superfamily from H. schmidti.12 HNTX-XVI is the representative toxin in the HNTX-XVI family, and its mature peptide has the same sequence as that of HWTX-XVIa1, a mammalian neurotoxin from H. schmidti. Moreover, two other members of the HNTX-XVI family (HNTX-XVI-5 and HNTX-XVI-16) have mature sequences identical to those of HWTX-XVIa5 and HWTX-XVIa10, respectively.

Superfamily H. The precursors of superfamily H exhibit high sequence similarity with those of the HWTX-XI superfamily from H. schmidti.15,37 Previous results indicate that, according to the number of disulfide bonds, the HWTX-XI superfamily can be categorized into two groups: the first group (e.g., HWTX-XI) follows the classical Kunitz architecture formed by three disulfide bridges with a linkage pattern of I-VI, II-IV, and III-V, whereas the second has lost the II-IV disulfide bond through replacement of cysteine IV by tyrosine. Members of the latter group have been designated as sub-Kunitz type toxins.37 In superfamily H, almost all members are sub-Kunitz type toxins. By homology modeling of HNTX-XI (Figure 6E), the representative toxin of superfamily H from H. hainanum, we found that this sub-Kunitz type toxin still has a Kunitz motif formed by the remaining two disulfide bridges. When the II-IV disulfide bond of bovine pancreatic trypsin inhibitor (BPTI), whose native Kunitz motif is stabilized by three disulfide bridges with the bonding pattern of I-VI, II-IV, and III-V, is reduced, the resulting product with the remaining two disulfide bonds still has a native-like conformation and trypsin inhibitor activity.38 In addition, native sub-Kunitz type peptides or proteins without the II-IV disulfide bond have also been reported: two cone snail toxins (conkunitzin-S1 and -S2),39,40 ixolaris from the salivary gland of the tick,41-43 and trophoblast Kunitz domain protein-3 (TK-3) from cow or sheep.40,44 Two phylogenetic trees were constructed for the trophoblast Kunitz domain proteins and tarantula toxins, respectively.
The results suggest that their ancestral protein had three disulfides.37,40 At the transcriptomic level, we previously cloned 11 isoforms of HNTX-XI from H. hainanum by RACE-PCR using a 3′RACE primer (gene-specific primer) and a 5′RACE primer (nonspecific primer).37 In this study, we also obtained these peptide toxins by PCR/RT-PCR using a pair of 3′ and 5′ gene-specific primers. In addition, at the genomic level, we identified some novel peptide toxins by genomic DNA PCR.

Superfamily I. Mature peptides of superfamily I, except HNTX-XIV-2, share the same “-C-C-CC-C-C-C-C-C-C-” cysteine arrangement, which contains the largest number of cysteine residues among the HNTXs. This cysteine arrangement is identical to those of HWTX-XIVa1 and HWTX-XIVa2 in the HWTX-XIV superfamily from H. schmidti.12 The precursors of superfamily I show high sequence similarity with those of the two HWTXs. Moreover, PRTx16C0 and PNTx16C1, two nontoxic peptides from Phoneutria reidyi, also contain this cysteine arrangement.45 Curiously, superfamily I lacks propeptide sequences, and the signal peptide cleavage site is at the Cys21 residue. This feature has also been found in the HWTX-XIV superfamily.

Superfamily J. Superfamily J includes two families (the HNTX-XVIII and HNTX-XV families), most mature peptides of which have a consensus “-C-C-C-CC-C-C-C-” arrangement. This cysteine framework has been found in peptide toxins from other spider species: HWTX-XVIIIb and HWTX-XVIIId (H. schmidti);12,15 jingzhaotoxin-62 (JZTX-62), JZTX-63, JZTX-64, and JZTX-65 (C. guangxiensis);10 magi-15 and magi-16 (Macrothele gigas);46 and LSTX-R1 (Lycosa singoriensis).47 The precursors of the HNTX-XVIII family show extremely high sequence identity (over 86%) with that of HWTX-XVIIIb from H. schmidti,12,15 and even their propeptide processing signals (ETAR) are the same. Among all HNTXs, the HNTX-XVIII family has the shortest signal peptide and the longest mature peptide. The precursors of the HNTX-XV family share extremely high sequence similarity (over 94%) with that of HWTX-XVIIId from H. schmidti.12 The precursors differ at only a few residues, and the “SEER” cleavage signal for the propeptide processing enzyme is identical. The precursors of the HNTX-XV family have a Ser20 residue at the signal peptide cleavage site, similar to the Ser19 residue in superfamily D.

Superfamily K. The precursors of superfamily K have extremely high sequence identity with those of the HWTX-II superfamily from H. schmidti.12 Most mature peptides of superfamily K have six cysteine residues and display the “-C-C-C-C-C-C-” cysteine arrangement. HNTX-II, the representative toxin of superfamily K, can reversibly paralyze cockroaches by intra-abdominal injection and can also kill mice by intracerebroventricular injection. Structure modeling (Figure 6F) indicates that HNTX-II has a structure similar to that of HWTX-II48 and contains the disulfide-directed β-hairpin (DDH) motif with a disulfide bonding pattern of I-III, II-V, and IV-VI, an unusual motif that differs from the typical ICK motif with its I-IV, II-V, and III-VI disulfide bonding pattern. Moreover, peptide toxins with a DDH-derived fold have also been identified in other tarantula species.10,12
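The cysteine arrangements and disulfide linkage patterns quoted for the superfamilies above follow a simple notation that can be reproduced programmatically. The following Python sketch is illustrative only and is not part of the original analysis. The function names and the toy sequence are hypothetical; the only facts taken from the text are the notation (a hyphen between cysteines separated by at least one residue, none between adjacent cysteines) and the three linkage patterns (ICK: I-IV, II-V, III-VI; DDH: I-III, II-V, IV-VI; Kunitz: I-VI, II-IV, III-V).

```python
# Illustrative sketch; all names and the toy sequence are hypothetical.

# Disulfide linkage patterns as stated in the text (cysteines numbered 1..6).
SCAFFOLDS = {
    "ICK": {(1, 4), (2, 5), (3, 6)},
    "DDH": {(1, 3), (2, 5), (4, 6)},
    "Kunitz": {(1, 6), (2, 4), (3, 5)},
}


def cysteine_framework(sequence):
    """Return the cysteine arrangement of a peptide, e.g. '-C-C-CC-C-C-'."""
    pieces = []
    previous = None  # index of the previous cysteine, if any
    for index, residue in enumerate(sequence.upper()):
        if residue != "C":
            continue
        adjacent = previous is not None and index == previous + 1
        # Adjacent cysteines are written without a separating hyphen.
        pieces.append("C" if adjacent else "-C")
        previous = index
    return "".join(pieces) + "-" if pieces else ""


def classify_linkage(pairs):
    """Match a set of disulfide pairs against the three patterns above."""
    normalized = {tuple(sorted(pair)) for pair in pairs}
    for name, pattern in SCAFFOLDS.items():
        if normalized == pattern:
            return name
    return "unclassified"


# Toy sequence (not a real toxin) with six cysteines.
toy = "GCAAAAAACGGGGGGCCAAAACGGGGC"
print(cysteine_framework(toy))
print(classify_linkage([(1, 3), (2, 5), (4, 6)]))
```

A framework string alone does not determine the fold: as the text notes, the linkage pattern has to come from structural data or modeling, which is why the two functions are kept separate.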

Discussion

As previously described,9-12,15 spider peptide toxins can be identified at three levels (transcriptomic, peptidomic, and genomic). Despite extensive development of these methods, no single method can identify all peptide toxins in a given spider species. Moreover, because of the inherent limitations of current techniques (e.g., EST-based cloning, Edman degradation, and mass spectrometry),28,36,49-51 a combination strategy is necessary for high-throughput identification of venom peptides. First, at the transcriptomic level, EST-based cloning is a universal and practical approach for exploring non-normalized cDNA libraries from spider venom glands,10,12,36 but the results are biased toward highly abundant transcripts, and lower-abundance molecular species are likely to escape detection.28,36,49 The application of PCR-based cloning directly complements the EST-based approach.
PCR-based cloning using specific primers is a useful technique for discovering and identifying novel members of gene families, and it has the advantage of recovering low-abundance transcripts.28 Second, at the peptidomic level, some investigators believe that conventional two-dimensional gel electrophoresis (2-DE) is not suitable for peptidomic profiling of spider venom because of the enormous diversity of spider venom peptides and the limited number of sequences in the database.49 Moreover, direct MALDI-TOF MS of the unfractionated sample cannot detect the total number of peptide components or the full complexity of the peptidomic profile, owing to ion suppression and insufficient resolution.49-51 These difficulties have been resolved by the off-line MDLC-MS strategy.9,11,49 Subsequently, Edman degradation sequencing combined with bottom-up mass spectrometric sequencing improves both the efficiency and the sensitivity of sequence determination for the purified peptide toxins. Edman degradation sequencing reveals the ordered amino acid sequence of a peptide or protein, from which the proteolytic cleavage sites of peptide precursors can be recognized. Although Edman degradation sequencing is a very important technique for N-terminal sequence analysis of purified peptide toxins, it is limited by slow speed, the need for rather large amounts of material (at least 50-100 pmol), and its failure on peptides with blocked N-terminal amino acids. Bottom-up sequencing, a newer technology for analyzing peptide or protein sequences by tandem mass spectrometry, has the advantages of speed and sensitivity (picomole to subpicomole range). It can also detect modified amino acids. Nevertheless, it is difficult to generate the whole sequence of a peptide toxin via bottom-up sequencing, and leucine/isoleucine (isobaric amino acids) and glutamine/lysine (quasi-isobaric ions) cannot be distinguished by this method.51 Finally, at the genomic level, the genomic DNA of spider toxins can be successfully cloned using specific primer pairs based on 5′ and 3′ UTR sequences from cDNA. Some spider toxin genes contain introns,52,53 and others do not.15,29,54 In particular, introns are lacking in the genomic DNA of three toxin superfamilies from the spider H. schmidti,15 which is closely related to H. hainanum. Thus, the intronless feature of genomic DNA encoding peptide toxins reduces the complexity of the DNA template and facilitates the cloning of novel peptide toxins. However, as a template, total genomic DNA contains the full genomic information of the source organism, not just the genes expressed in the venom gland, so nonspecific products may easily be generated by genomic DNA PCR. To the best of our knowledge, the present report is the first to employ a combination strategy to systematically elucidate the venom peptide profile of a tarantula species.
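The remark about isobaric and quasi-isobaric residues can be made concrete with a short calculation. The Python sketch below is an illustration added here, not part of the original work: the elemental compositions of the amino acid residues and the monoisotopic atomic masses are standard reference values rather than data from this paper, and the function names are hypothetical. It shows that Leu and Ile residues have exactly the same composition, whereas Gln and Lys residues differ by only a few hundredths of a dalton.

```python
# Illustrative sketch; function names are hypothetical.
# Monoisotopic atomic masses (Da), standard reference values.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}

# Elemental composition of each residue (amino acid minus one water).
RESIDUE_FORMULA = {
    "Leu": {"C": 6, "H": 11, "N": 1, "O": 1},
    "Ile": {"C": 6, "H": 11, "N": 1, "O": 1},
    "Gln": {"C": 5, "H": 8, "N": 2, "O": 2},
    "Lys": {"C": 6, "H": 12, "N": 2, "O": 1},
}


def residue_mass(name):
    """Monoisotopic mass of a residue, summed over its elements."""
    formula = RESIDUE_FORMULA[name]
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


def mass_gap(first, second):
    """Absolute monoisotopic mass difference between two residues (Da)."""
    return abs(residue_mass(first) - residue_mass(second))


for pair in (("Leu", "Ile"), ("Gln", "Lys")):
    print(pair, round(mass_gap(*pair), 5))
```

Whether the Gln/Lys gap can be resolved in practice depends on the mass accuracy of the instrument and on the fragment ion being measured, which this sketch does not model.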
The results indicated that 118 peptide toxins were identified by the transcriptomic approach, 49 by the peptidomic approach, and 52 by the genomic approach. After redundancy removal, 192 peptide toxins in total were obtained by the three approaches. Only 20 peptide toxins were identified by two or three approaches; most were identified by a single approach, probably because of their lower abundance at the other two levels. It is also possible that the identification of most peptide toxins was biased toward a certain approach, and that these peptides were difficult to identify by the other two approaches because of the limitations of current identification techniques. It should also be noted that post-translational modifications in spider peptide toxins may be more complex than previously appreciated and might lower the degree of agreement in venom composition between these approaches. Further work is needed to clarify these reasons. The same observation has been made in other tarantula venoms. Previous peptidomic and transcriptomic analyses of the H. schmidti venom showed that few sequences were identified by both approaches.11,12 The genomic DNA information of three toxin superfamilies from this spider has also been reported,15 and most of the novel peptide toxins were not found in the peptidome or transcriptome. Nevertheless, the combination strategy greatly enhances identification throughput, and the approaches compensate for one another's detection range. On the other hand, there are six superfamilies (superfamilies A, B, G, H, J, and K), each of which displays isoform sequence information from the transcriptomic, peptidomic, and genomic levels; four superfamilies (superfamilies C, E, F, and I), each of which shows isoform sequence information from two of the three levels; and one superfamily (superfamily D), which exhibits isoform sequence information only from the transcriptomic level. On the whole, the combination strategy provides more useful sequence information from three different levels for the HNTX superfamilies and will help us better understand the complexity of the H. hainanum venom.

In the representative precursors of all superfamilies from H. hainanum (Figure 2), the Processing Quadruplet Motif (PQM)36 lies just before the mature peptide, except in superfamily I, which lacks a propeptide. All mature peptides in the superfamilies contain 5-10 cysteine residues, and their cysteine arrangements were classified into six types (Table 2). These cysteine arrangements have been found in peptide toxins from other tarantula or spider species. Three cysteine arrangements that correlate with known structural scaffolds are discussed in this report.

(1) -C-C-CC-C-C-, which is present in the mature peptides of superfamilies A-E and superfamily G, is the main cysteine arrangement of the HNTXs. More than half of the HNTXs have this cysteine arrangement, which has been found in spider toxins and conotoxins with the ICK motif. Besides the PQM, most precursors of these superfamilies have the primary structure motif36 known as the Principal Structural Motif (PSM): the first two Cys residues are invariably separated by six amino acids, and there are no amino acids between the third and fourth cysteine residues. In addition, there is a distance of 5-10 amino acid residues between the Cys-Cys sequence and the second Cys. Furthermore, the three-dimensional structures of HNTX-I, HNTX-III, HNTX-IV, and HNTX-X, determined by 1H NMR or homology modeling, display the ICK motif with a disulfide bonding pattern of I-IV, II-V, and III-VI17,20 (Figure 6A-D).
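The spacing rules of the PSM stated above can be written as a sequence pattern. The Python sketch below is one possible reading and should not be taken as the implementation used by the authors or in reference 36. It assumes that the "distance of 5-10 amino acid residues" refers to 5-10 non-cysteine residues between the second Cys and the adjacent third and fourth Cys, and that the pattern must begin at the first cysteine of the mature peptide. The function name and the sequences are hypothetical toy examples.

```python
import re

# Assumed reading of the PSM: Cys, six residues, Cys, 5-10 residues, Cys-Cys,
# with no additional cysteine inside either spacer.
PSM_PATTERN = re.compile(r"C[^C]{6}C[^C]{5,10}CC")


def has_psm(sequence):
    """Return True if the pattern starts at the first cysteine of the sequence."""
    seq = sequence.upper()
    first_cys = seq.find("C")
    if first_cys < 0:
        return False  # no cysteine at all
    # Pattern.match with a start position anchors the match at that index.
    return PSM_PATTERN.match(seq, first_cys) is not None


# Toy sequences (not real toxins).
print(has_psm("GCAAAAAACGGGGGGCCAAAACGGGGC"))  # six-residue first spacer
print(has_psm("GCAAAACGGGGGGCCAAAAC"))         # four-residue first spacer
```

Anchoring at the first cysteine keeps the check faithful to the wording "the first two Cys residues"; a plain search would also accept the pattern starting at a later cysteine.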
Hitherto, all tarantula toxins with the cysteine arrangement “-C-C-CC-C-C-” whose three-dimensional structures have been elucidated25 are ICK toxins. In this motif, an antiparallel β-sheet backbone is cross-linked by two intramolecular disulfide bridges to form a ring that is penetrated by a third disulfide bridge.55 Tarantula toxins with this molecular scaffold have various pharmacological activities and can modulate the currents in different ion channels, such as potassium, calcium, and sodium channels. The functional diversity of this molecular scaffold has been thoroughly reviewed.6,55-57

(2) -C-C-C-C-C-, which is found in the mature peptides of superfamily H, is the cysteine arrangement of sub-Kunitz type toxins. In this case, the mature peptide has lost the fourth cysteine residue of classical Kunitz-type toxins such as HWTX-XI, which has a disulfide linkage pattern of I-VI, II-IV, and III-V,12,37 and as a result the II-IV disulfide bridge is absent. However, structure modeling showed that HNTX-XI still has a Kunitz motif formed by the remaining two disulfide bridges (Figure 6E). Kunitz-type proteins or peptides display a variety of bioactivities as enzyme (chymotrypsin or trypsin) inhibitors, potassium channel blockers, and dual-function toxins.37 Furthermore, these proteins or peptides have been widely found in the venoms of animals such as snakes, cone snails, and sea anemones. So far, Kunitz-type peptides from spiders have been identified in the venoms of the tarantulas H. schmidti and H. hainanum. HNTX-XI exhibits high sequence identity with HWTX-XI, a bifunctional peptide toxin that is a very potent trypsin inhibitor as well as a weak potassium channel blocker. We therefore presume that HNTX-XI has bioactivity as an enzyme inhibitor or ion channel blocker, but further bioactivity experiments are needed to confirm this speculation.

(3) -C-C-C-C-C-C-, which is present in the mature peptides of superfamily K, is different from the “-C-C-CC-C-C-” cysteine framework of superfamilies A-E and superfamily G. On the basis of homology modeling, HNTX-II adopts a DDH motif formed by three disulfide bonds with a linkage pattern of I-III, II-V, and IV-VI (Figure 6F). Because of this distinct disulfide linkage, the DDH motif lacks the cystine knot. Nevertheless, according to the original definition of the β-hairpin stabilized by two disulfide bridges, all ICK toxins contain a DDH fold, and therefore the ICK scaffold should be considered to have evolved from the DDH motif.6,48,58 HWTX-II, which is a neurotoxin affecting both mammals and insects,59 represents the first example of a DDH toxin in tarantula venoms. HNTX-II shares extremely high sequence identity (over 90%) with HWTX-II, and their bioactivities are similar. The insect toxicity of HNTX-II is about 6-fold stronger than that of HWTX-II, but its mammalian toxicity is about 4-fold lower (unpublished data). To date, DDH toxins with the “-C-C-C-C-C-C-” cysteine arrangement have been found in the venoms of three tarantula species.10,12

In studies of conotoxins, investigators have found that conotoxins originate from a limited set of gene superfamilies and molecular scaffolds, and that gene duplication accompanied by selective hypermutation of the residues encoded in the mature peptide sequence has resulted in the enormous molecular diversity of conotoxins.60,61 It has been proposed that this mechanism is common to spider species.55,62 Our results support this theory. The application of a venomic strategy combining transcriptomics, peptidomics, and genomics has demonstrated the enormous molecular diversity of tarantula-venom peptides.
Multiple isoforms with high homology have been found in the superfamilies from the tarantula H. hainanum venom, suggesting that, on the basis of a limited set of gene superfamilies and molecular scaffolds, gene duplication and hypermutation may be responsible for the molecular diversity. The ratios of nonsynonymous to synonymous substitutions also seem to support this hypothesis (unpublished data). In addition, the precursors of the HNTX-I, HNTX-III, and HNTX-IV families (from superfamily A) and the HNTX-XII family (from superfamily C) have a consensus “GK” (Figure 2 and Table 2), which constitutes not only the extra C-terminal residues but also an amidation signal. Mass spectrometric experiments have shown that their mature peptides are amidated.17,19 The additional residues and C-terminal amidation signal have also been reported for other tarantula-venom peptides, the HWTXs31 and JZTXs.9,10 Post-translational modifications such as C-terminal amidation may be another potential mechanism of toxin diversification. Using BLAST, we found that a large number of precursors or mature peptides of HNTXs from H. hainanum show high sequence similarity with those of HWTXs from H. schmidti.11,12,15 Several venom peptides from these two related but distinct Chinese tarantulas even have identical mature sequences. Moreover, peptide toxins with identical mature sequences have been found in other tarantula or spider species (Supplementary Table S3), and even in other venomous animals (Supplementary Table S4), for example, cone snails, soapfish, scorpions, and snakes (http://protchem.hunnu.edu.cn/toxin/).63 Almost all of these peptide toxins were identified in 2-3 different species. Unexpectedly, a bradykinin-potentiating peptide was found in eight different snake venoms. This feature may provide additional evidence for the taxonomy of venomous animals.
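The identification counts reported in the Discussion (118 transcriptomic, 49 peptidomic, 52 genomic; 192 after redundancy removal; 20 found by two or three approaches) permit a simple consistency check by inclusion-exclusion. The Python sketch below rests on assumptions that the paper does not state explicitly: that each per-approach total counts distinct mature sequences, that the 192 toxins are the union of the three sets, and that the 20 shared toxins are exactly those found by at least two approaches. The function and variable names are hypothetical.

```python
def split_shared(per_approach, union_size, shared):
    """Split a union of three sets into items found by exactly 1, 2, or 3 sets.

    With n1, n2, n3 the numbers of items found by exactly one, two, and three
    approaches: sum of set sizes = n1 + 2*n2 + 3*n3, union = n1 + n2 + n3,
    and shared = n2 + n3.
    """
    total = sum(per_approach)    # n1 + 2*n2 + 3*n3
    excess = total - union_size  # n2 + 2*n3
    n3 = excess - shared
    n2 = shared - n3
    n1 = union_size - shared
    if min(n1, n2, n3) < 0:
        raise ValueError("counts are mutually inconsistent")
    return n1, n2, n3


print(split_shared((118, 49, 52), 192, 20))  # -> (172, 13, 7)
```

Under these assumptions the reported numbers are mutually consistent and would correspond to 172 toxins identified by a single approach, 13 by exactly two approaches, and 7 by all three.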

Concluding Remarks

In conclusion, by combining transcriptomics, peptidomics, and genomics, this study provides an overview of the tarantula-venom peptides from H. hainanum. The combination strategy greatly enhances identification throughput, so that a large number of peptide toxins were identified. The results present evidence that the mechanism underlying the molecular diversity of spider peptide toxins is similar to that of conotoxins. Our investigation increases knowledge of venom complexity and of the properties of venom components, and provides a better understanding of the molecular diversity, structure and function, and evolutionary relationships of spider peptide toxins. The venom peptides identified from H. hainanum may serve as a reference for further case-by-case investigation of these peptide toxins. We hope that more novel components with unique bioactivities will be discovered after detailed characterization.

Abbreviations: MDLC-MS, multiple dimensional liquid chromatography coupled with mass spectrometry; ACN, acetonitrile; UTR, untranslated region; MW, molecular weight; RP, reverse phase; CHCA, α-cyano-4-hydroxycinnamic acid; PMF, peptide mass fingerprinting; HNTX, hainantoxin; HWTX, huwentoxin; RT, retention time; NMR, nuclear magnetic resonance; ICK, inhibitor cystine knot; TTX-S, tetrodotoxin-sensitive; VGSC, voltage-gated sodium channel; VGCC, voltage-gated calcium channel; ESM, Extra Structural Motif; BPTI, bovine pancreatic trypsin inhibitor; TK-3, trophoblast Kunitz domain protein-3; JZTX, jingzhaotoxin; DDH, disulfide-directed β-hairpin; 2-DE, two-dimensional gel electrophoresis; PQM, Processing Quadruplet Motif; PSM, Principal Structural Motif.

Acknowledgment. This work was supported by grants from the National Natural Science Foundation of China (No. 30430170) and the National 973 Project of China (Nos. 2007CB914203 and 2010CB529800). We are grateful to Jinjun Chen for expert assistance with the transcriptomic experiments. We also thank Jixian Xiong for expert assistance with the MALDI-TOF-TOF MS/MS analysis.

Supporting Information Available: Tables of specific primer pairs for cDNA screening and genomic DNA PCR for peptide toxin superfamilies (or families) from H. hainanum; mass fingerprinting of venom peptides from H. hainanum; identical mature sequences of peptide toxins from different tarantula or spider species and other venomous animals. This material is available free of charge via the Internet at http://pubs.acs.org.

References
(1) Escoubas, P.; Diochot, S.; Corzo, G. Structure and pharmacology of spider venom neurotoxins. Biochimie 2000, 82 (9-10), 893–907.
(2) Rash, L. D.; Hodgson, W. C. Pharmacology and biochemistry of spider venoms. Toxicon 2002, 40 (3), 225–254.
(3) Corzo, G.; Escoubas, P. Pharmacologically active spider peptide toxins. Cell. Mol. Life Sci. 2003, 60 (11), 2409–2426.
(4) Estrada, G.; Villegas, E.; Corzo, G. Spider venoms: a rich source of acylpolyamines and peptides as new leads for CNS drugs. Nat. Prod. Rep. 2007, 24 (1), 145–161.
(5) King, G. F. The wonderful world of spiders: preface to the special Toxicon issue on spider venoms. Toxicon 2004, 43 (5), 471–475.
(6) Escoubas, P.; Rash, L. Tarantulas: eight-legged pharmacists and combinatorial chemists. Toxicon 2004, 43 (5), 555–574.

(7) Wood, D. L.; Miljenović, T.; Cai, S.; Raven, R. J.; Kaas, Q.; Escoubas, P.; Herzig, V.; Wilson, D.; King, G. F. ArachnoServer: a database of protein toxins from spiders. BMC Genomics 2009, 10, 375.
(8) Legros, C.; Celerier, M.-L.; Henry, M.; Guette, C. Nanospray analysis of the venom of the tarantula Theraphosa leblondi: a powerful method for direct venom mass fingerprinting and toxin sequencing. Rapid Commun. Mass Spectrom. 2004, 18 (10), 1024–1032.
(9) Liao, Z.; Cao, J.; Li, S. M.; Yan, X. J.; Hu, W. J.; He, Q. Y.; Chen, J. J.; Tang, J. Z.; Xie, J. Z.; Liang, S. P. Proteomic and peptidomic analysis of the venom from Chinese tarantula Chilobrachys jingzhao. Proteomics 2007, 7 (11), 1892–1907.
(10) Chen, J. J.; Deng, M. C.; He, Q. Y.; Meng, E.; Jiang, L. P.; Liao, Z.; Rong, M. Q.; Liang, S. P. Molecular diversity and evolution of cystine knot toxins of the tarantula Chilobrachys jingzhao. Cell. Mol. Life Sci. 2008, 65 (15), 2431–2444.
(11) Yuan, C. H.; Jin, Q. H.; Tang, X.; Hu, W. J.; Cao, R.; Yang, S. Q.; Xiong, J. X.; Xie, C. L.; Xie, J. Y.; Liang, S. P. Proteomic and peptidomic characterization of the venom from the Chinese bird spider, Ornithoctonus huwena Wang. J. Proteome Res. 2007, 6 (7), 2792–2801.
(12) Jiang, L. P.; Peng, L.; Chen, J. J.; Zhang, Y. Q.; Xiong, X.; Liang, S. P. Molecular diversification based on analysis of expressed sequence tags from the venom glands of the Chinese bird spider Ornithoctonus huwena. Toxicon 2008, 51 (8), 1479–1489.
(13) Guette, C.; Legros, C.; Tournois, G.; Goyffon, M.; Célérier, M.-L. Peptide profiling by matrix-assisted laser desorption/ionisation time-of-flight mass spectrometry of the Lasiodora parahybana tarantula venom gland. Toxicon 2006, 47 (6), 640–649.
(14) Herzig, V.; Hodgson, W. C. Intersexual variations in the pharmacological properties of Coremiocnemis tropix (Araneae, Theraphosidae) spider venom. Toxicon 2009, 53 (2), 196–205.
(15) Jiang, L. P.; Chen, J. J.; Peng, L.; Zhang, Y. Q.; Xiong, X.; Liang, S. P. Genomic organization and cloning of novel genes encoding toxin-like peptides of three superfamilies from the spider Orinithoctonus huwena. Peptides 2008, 29 (10), 1679–1684.
(16) Liang, S. P.; Peng, X. J.; Huang, R. H.; Chen, P. Biochemical identification of Selenocosmia hainana sp. nov. from south China [Araneae Theraphosidae]. Life Sci. Res. 1999, 3, 299–303.
(17) Li, D. L.; Xiao, Y. C.; Hu, W. J.; Xie, J. Y.; Bosmans, F.; Tytgat, J.; Liang, S. P. Function and solution structure of hainantoxin-I, a novel insect sodium channel inhibitor from the Chinese bird spider Selenocosmia hainana. FEBS Lett. 2003, 555 (3), 616–622.
(18) Xiao, Y. C.; Liang, S. P. Inhibition of neuronal tetrodotoxin-sensitive Na+ channels by two spider toxins: hainantoxin-III and hainantoxin-IV. Eur. J. Pharmacol. 2003, 477 (1), 1–7.
(19) Liu, Z. H.; Dai, J.; Chen, Z.; Hu, W. J.; Xiao, Y. C.; Liang, S. P. Isolation and characterization of hainantoxin-IV, a novel antagonist of tetrodotoxin-sensitive sodium channels from the Chinese bird spider Selenocosmia hainana. Cell. Mol. Life Sci. 2003, 60 (5), 972–978.
(20) Li, D. L.; Xiao, Y. C.; Xu, X.; Xiong, X.; Lu, S. Y.; Liu, Z. H.; Zhu, Q.; Wang, M. C.; Gu, X. C.; Liang, S. P. Structure-activity relationships of hainantoxin-IV and structure determination of active and inactive sodium channel blockers. J. Biol. Chem. 2004, 279 (36), 37734–37740.
(21) Xiao, Y. C.; Liang, S. P. Purification and characterization of Hainantoxin-V, a tetrodotoxin-sensitive sodium channel inhibitor from the venom of the spider Selenocosmia hainana. Toxicon 2003, 41 (6), 643–650.
(22) Pan, J. Y.; Hu, W. J.; Liang, S. P. Purification, sequencing and characterization of hainantoxin-VI, a neurotoxin from the chinese bird spider Selenocosmia hainana. Zool. Res. 2002, 23 (4), 280–283.
(23) Kumar, S.; Nei, M.; Dudley, J.; Tamura, K. MEGA: a biologist-centric software for evolutionary analysis of DNA and protein sequences. Briefings Bioinform. 2008, 9 (4), 299–306.
(24) Liang, S. P.; Qin, Y. B.; Zhang, D. Y.; Pan, X.; Chen, X. D.; Xie, J. Y. Biological characterization of spider (Selenosmia huwena) crude venom. Zool. Res. 1993, 14 (1), 60–65.
(25) Sussman, J. L.; Lin, D.; Jiang, J.; Manning, N. O.; Prilusky, J.; Ritter, O.; Abola, E. E. Protein Data Bank (PDB): Database of three-dimensional structural information of biological macromolecules. Acta Crystallogr., D: Biol. Crystallogr. 1998, 54 (Pt. 6 Pt. 1), 1078–1084.
(26) Arnold, K.; Bordoli, L.; Kopp, J.; Schwede, T. The SWISS-MODEL Workspace: A web-based environment for protein structure homology modelling. Bioinformatics 2006, 22 (2), 195–201.
(27) Humphrey, W.; Dalke, A.; Schulten, K. VMD: visual molecular dynamics. J. Mol. Graphics 1996, 14 (1), 33–38, 27–28.
(28) Pan, Z. S.; Barry, R.; Lipkin, A.; Soloviev, M. Selection strategy and the design of hybrid oligonucleotide primers for RACE-PCR: cloning a family of toxin-like sequences from Agelena orientalis. BMC Mol. Biol. 2007, 8, 32.
(29) Qiao, P.; Zuo, X. P.; Chai, Z. F.; Ji, Y. H. The cDNA and genomic DNA organization of a novel toxin SHT-I from spider Ornithoctonus huwena. Acta Biochim. Biophys. Sin. 2004, 36 (10), 656–660.
(30) Huang, H. R.; Liu, Z. H.; Liang, S. P. Purification and characterization of a neurotoxic peptide Huwentoxin-III and a natural inactive mutant from the venom of the spider Selenocosmia huwena Wang (Ornithoctonus huwena Wang). Acta Biochim. Biophys. Sin. 2003, 35 (11), 976–980.
(31) Diao, J. B.; Lin, Y.; Tang, J. Z.; Liang, S. P. cDNA sequence analysis of seven peptide toxins from the spider Selenocosmia huwena. Toxicon 2003, 42 (7), 715–723.
(32) Zhang, P. F.; Chen, P.; Hu, W. J.; Liang, S. P. Huwentoxin-V, a novel insecticidal peptide toxin from the spider Selenocosmia huwena, and a natural mutant of the toxin: indicates the key amino acid residues related to the biological activity. Toxicon 2003, 42 (1), 15–20.
(33) Deng, M. C.; Luo, X.; Meng, E.; Xiao, Y. C.; Liang, S. P. Inhibition of insect calcium channels by huwentoxin-V, a neurotoxin from Chinese tarantula Ornithoctonus huwena venom. Eur. J. Pharmacol. 2008, 582 (1-3), 12–16.
(34) Escoubas, P.; Diochot, S.; Célérier, M. L.; Nakajima, T.; Lazdunski, M. Novel tarantula toxins for subtypes of voltage-dependent potassium channels in the Kv2 and Kv4 subfamilies. Mol. Pharmacol. 2002, 62 (1), 48–57.
(35) Liu, Z. H.; Dai, J.; Dai, L. J.; Deng, M. C.; Hu, Z.; Hu, W. J.; Liang, S. P. Function and solution structure of Huwentoxin-X, a specific blocker of N-type calcium channels, from the Chinese bird spider Ornithoctonus huwena. J. Biol. Chem. 2006, 281 (13), 8628–8635.
(36) Kozlov, S.; Malyavka, A.; McCutchen, B.; Lu, A.; Schepers, E.; Herrmann, R.; Grishin, E. A novel strategy for the identification of toxin-like structures in spider venom. Proteins 2005, 59 (1), 131–140.
(37) Yuan, C. H.; He, Q. Y.; Peng, K.; Diao, J. B.; Jiang, L. P.; Tang, X.; Liang, S. P. Discovery of a distinct superfamily of kunitz-type toxin (KTT) from tarantulas. PLoS One 2008, 3 (10), e3414.
(38) Kress, L. F.; Laskowski, M., Sr. The basic trypsin inhibitor of bovine pancreas: VII. Reduction with borohydride of disulfide bond linking half-cystine residues 14 and 38. J. Biol. Chem. 1967, 242 (21), 4925–4929.
(39) Bayrhuber, M.; Vijayan, V.; Ferber, M.; Graf, R.; Korukottu, J.; Imperial, J.; Garrett, J. E.; Olivera, B. M.; Terlau, H.; Zweckstetter, M.; Becker, S. Conkunitzin-S1 is the first member of a new Kunitz-type neurotoxin family. J. Biol. Chem. 2005, 280 (25), 23766–23770.
(40) Dy, C. Y.; Buczek, P.; Imperial, J. S.; Bulaj, G.; Horvath, M. P. Structure of conkunitzin-S1, a neurotoxin and Kunitz-fold disulfide variant from cone snail. Acta Crystallogr., D: Biol. Crystallogr. 2006, 62 (Pt. 9), 980–990.
(41) Francischetti, I. M.; Valenzuela, J. G.; Andersen, J. F.; Mather, T. N.; Ribeiro, J. M. Ixolaris, a novel recombinant tissue factor pathway inhibitor (TFPI) from the salivary gland of the tick, Ixodes scapularis: identification of factor X and factor Xa as scaffolds for the inhibition of factor VIIa/tissue factor complex. Blood 2002, 99 (10), 3602–3612.
(42) Francischetti, I. M.; Mather, T. N.; Ribeiro, J. M. Penthalaris, a novel recombinant five-Kunitz tissue factor pathway inhibitor (TFPI) from the salivary gland of the tick vector of Lyme disease, Ixodes scapularis. Thromb. Haemostasis 2004, 91 (5), 886–898.
(43) Monteiro, R. Q.; Rezaie, A. R.; Ribeiro, J. M.; Francischetti, I. M. Ixolaris: a Factor Xa heparin-binding exosite inhibitor. Biochem. J. 2005, 387 (Pt 3), 871–877.
(44) MacLean, J. A., II; Roberts, R. M.; Green, J. A. Atypical Kunitz-type serine proteinase inhibitors produced by the ruminant placenta. Biol. Reprod. 2004, 71 (2), 455–463.
(45) Richardson, M.; Pimenta, A. M. C.; Bemquerer, M. P.; Santoro, M. M.; Beirao, P. S. L.; Lima, M. E.; Figueiredo, S. G.; Bloch, C., Jr.; Vasconcelos, E. A. R.; Campos, F. A.; Gomes, P. C.; Cordeiro, M. N. Comparison of the partial proteomes of the venoms of Brazilian spiders of the genus Phoneutria. Comp Biochem Physiol C. 2006, 142 (3-4), 173–187.
(46) Satake, H.; Villegas, E.; Oshiro, N.; Terada, K.; Shinada, T.; Corzo, G. Rapid and efficient identification of cysteine-rich peptides by random screening of a venom gland cDNA library from the hexathelid spider Macrothele gigas. Toxicon 2004, 44 (2), 149–156.
(47) Zhang, Y. Q.; Chen, J. J.; Tang, X.; Wang, F.; Jiang, L. P.; Xiong, X.; Wang, M. C.; Rong, M. Q.; Liu, Z. H.; Liang, S. P. Transcriptome analysis of the venom glands of the Chinese wolf spider Lycosa singoriensis. Zoology (Jena) 2009, 113 (1), 10–18.

Journal of Proteome Research • Vol. 9, No. 5, 2010 2563

(48) Shu, Q.; Lu, S. Y.; Gu, X. C.; Liang, S. P. The structure of spider toxin huwentoxin-II with unique disulfide linkage: evidence for structural evolution. Protein Sci. 2002, 11 (2), 245–252.
(49) Liang, S. P. Proteome and peptidome profiling of spider venoms. Expert Rev. Proteomics 2008, 5 (5), 731–746.
(50) Escoubas, P.; Sollod, B.; King, G. F. Venom landscapes: mining the complexity of spider venoms via a combined cDNA and mass spectrometric approach. Toxicon 2006, 47 (6), 650–663.
(51) Escoubas, P.; Quinton, L.; Nicholson, G. M. Venomics: unravelling the complexity of animal venoms with mass spectrometry. J. Mass Spectrom. 2008, 43 (3), 279–295.
(52) Krapcho, K. J.; Kral, R. M., Jr.; Vanwagenen, B. C.; Eppler, K. G.; Morgan, T. K. Characterization and cloning of insecticidal peptides from the primitive weaving spider Diguetia canities. Insect Biochem. Mol. Biol. 1995, 25 (9), 991–1000.
(53) Binford, G. J.; Cordes, M. H.; Wells, M. A. Sphingomyelinase D from venoms of Loxosceles spiders: evolutionary insights from cDNA sequences and gene structure. Toxicon 2005, 45 (5), 547–560.
(54) Danilevich, V. N.; Grishin, E. V. The chromosomal genes for black widow spider neurotoxins do not contain introns. [Article in Russian]. Bioorg. Khim. 2000, 26 (12), 933–939.
(55) Escoubas, P. Molecular diversification in spider venoms: a web of combinatorial peptide libraries. Mol. Diversity 2006, 10 (4), 545–554.
(56) Norton, R. S.; Pallaghy, P. K. The cystine knot structure of ion channel toxins and related polypeptides. Toxicon 1998, 36 (11), 1573–1583.
(57) Craik, D. J.; Daly, N. L.; Waine, C. The cystine knot motif in toxins and implications for drug design. Toxicon 2001, 39 (1), 43–60.
(58) Wang, X.; Connor, M.; Smith, R.; Maciejewski, M. W.; Howden, M. E.; Nicholson, G. M.; Christie, M. J.; King, G. F. Discovery and characterization of a family of insecticidal neurotoxins with a rare vicinal disulfide bridge. Nat. Struct. Biol. 2000, 7 (6), 505–513.
(59) Shu, Q.; Liang, S. P. Purification and characterization of huwentoxin-II, a neurotoxic peptide from the venom of the Chinese bird spider Selenocosmia huwena. J. Pept. Res. 1999, 53 (5), 486–491.


(60) Espiritu, D. J.; Watkins, M.; Dia-Monje, V.; Cartier, G. E.; Cruz, L. J.; Olivera, B. M. Venomous cone snails: molecular phylogeny and the generation of toxin diversity. Toxicon 2001, 39 (12), 1899–1916.
(61) Conticello, S. G.; Gilad, Y.; Avidan, N.; Ben-Asher, E.; Levy, Z.; Fainzilber, M. Mechanisms for evolving hypervariability: the case of conopeptides. Mol. Biol. Evol. 2001, 18 (2), 120–131.
(62) Sollod, B. L.; Wilson, D.; Zhaxybayeva, O.; Gogarten, J. P.; Drinkwater, R.; King, G. Were arachnids the first to use combinatorial peptide libraries? Peptides 2005, 26 (1), 131–139.
(63) He, Q. Y.; He, Q. Z.; Deng, X. C.; Yao, L.; Meng, E.; Liu, Z. H.; Liang, S. P. ATDB: a uni-database platform for animal toxins. Nucleic Acids Res. 2008, 36 (Database issue), D293–297.
(64) Kaiser, I. I.; Griffin, P. R.; Aird, S. D.; Hudiburg, S.; Shabanowitz, J.; Francis, B.; John, T. R.; Hunt, D. F.; Odell, G. V. Primary structures of two proteins from the venom of the Mexican red knee tarantula (Brachypelma smithii). Toxicon 1994, 32 (9), 1083–1093.
(65) Savel-Niemann, A. Tarantula (Eurypelma californicum) venom, a multicomponent system. Biol. Chem. Hoppe-Seyler 1989, 370 (5), 485–498.
(66) Oswald, R. E.; Suchyna, T. M.; McFeeters, R.; Gottlieb, P.; Sachs, F. Solution structure of peptide toxins that block mechanosensitive ion channels. J. Biol. Chem. 2002, 277 (37), 34443–34450.
(67) Diochot, S.; Drici, M.-D.; Moinier, D.; Fink, M.; Lazdunski, M. Effects of phrixotoxins on the Kv4 family of potassium channels and implications for the role of Ito1 in cardiac electrogenesis. Br. J. Pharmacol. 1999, 126 (1), 251–263.
(68) Stapleton, A.; Blankenship, D. T.; Ackermann, D. L.; Chen, T.-M.; Gorder, G. W.; Manley, G. D.; Palfreyman, M. G.; Coutant, J. E.; Cardin, A. D. Curtatoxins. Neurotoxic insecticidal polypeptides isolated from the funnel-web spider Hololena curta. J. Biol. Chem. 1990, 265 (4), 2054–2059.
(69) Skinner, W. S.; Adams, M. E.; Quistad, G. B.; Kataoka, H.; Cesarin, B. J.; Enderlin, F. E.; Schooley, D. A. Purification and characterization of two classes of neurotoxins from the funnel web spider, Agelenopsis aperta. J. Biol. Chem. 1989, 264 (4), 2150–2155.
