Article pubs.acs.org/jpr

Mass Spectrometry-Based Sequencing and SRM-Based Quantitation of Two Novel Vitellogenin Isoforms in the Leatherback Sea Turtle (Dermochelys coriacea)

Marine I. Plumel,†,§,∥ Thierry Wasselin,†,§,∥ Virginie Plot,‡,§ Jean-Marc Strub,†,§ Alain Van Dorsselaer,†,§ Christine Carapito,†,§ Jean-Yves Georges,‡,§ and Fabrice Bertile†,§,*

† Département Sciences Analytiques, Université de Strasbourg, IPHC, 25 rue Becquerel, 67087 Strasbourg, France
‡ Département Ecologie, Physiologie et Ethologie, Université de Strasbourg, IPHC, 23 rue Becquerel, 67087 Strasbourg, France
§ CNRS, UMR7178, 67087 Strasbourg, France

S Supporting Information *

ABSTRACT: No biomarker has yet been discovered to identify the reproductive status of the endangered leatherback sea turtle (Dermochelys coriacea). Although vitellogenin (VTG) could be used for this, its sequence is not known in D. coriacea and no quantitative assay has been carried out in this species to date. Using de novo sequencing-based proteomics, we unambiguously characterized sequences of two different VTG isoforms that we named Dc-VTG1 and Dc-VTG2. To our knowledge, this is the first clear evidence of different VTG isoforms and the structural characterization of derived yolk proteins in reptiles. This work illustrates how massive de novo sequencing can characterize novel sequences when working on “exotic” nonmodel species in which even nucleotide sequences are not available. We developed assays for absolute quantitation of these two isoforms using selected reaction monitoring (SRM) mass spectrometry, thus providing the first SRM assays developed specifically for a nonsequenced species. Plasma levels of Dc-VTG1 and Dc-VTG2 decreased as the nesting season proceeded, and were closely related to the increased levels of reproductive effort. The SRM assays developed here therefore provide an original and efficient approach for the reliable monitoring of reproduction cycles not only in D. coriacea, but potentially in other turtle species.

KEYWORDS: de novo sequencing, selected reaction monitoring (SRM), targeted proteomics, vitellogenin, isoform identification, egg yolk proteins, sea turtle reproduction



INTRODUCTION

The survival of the critically endangered leatherback sea turtle (Dermochelys coriacea) depends on a better understanding of its reproductive biology and ultimately its population dynamics.1 Measurements taken during beach patrols make it possible to monitor the population during the nesting season, when breeding females come regularly ashore to lay eggs on sandy beaches. However, huge efforts are required in the field to individually and accurately monitor nesting activity in these species due to the long duration of the nesting season (up to several months) in sea turtles,2 their nocturnal nesting behavior, and the wide range of commonly unconnected nesting sites. As a consequence, it is virtually impossible to assess the degree of advancement of the nesting season for a given individual during a beach patrol. Yet this information is crucial for numerous players in turtle conservation such as scientists investigating the determinants of reproduction or targeting individuals at a given step of their reproduction, and managing bodies when specific conservation measures target individuals that are either beginning or ending their season.

Vitellogenin (VTG) is a large lipoglycophosphoprotein, which is produced and secreted by the liver of oviparous animals under estrogen stimulation then processed into yolk proteins that are incorporated in the growing oocytes.3,4 VTG has already been used to assess reproductive status and detect evidence of egg production in some oviparous animals. Seasonal fluctuations in plasma VTG levels have already been found in some birds5 and turtles (mostly terrestrial species),6−11 linking plasma VTG to plasma estradiol levels and/or indicating whether females are engaged in reproduction or not. Thus, although nothing is known regarding relationships between circulating VTG levels and reproductive status in sea turtles, we assumed that in D. coriacea, VTG could be a relevant biomarker of not only individual seasonal reproductive status, but also the level of reproductive effort as the nesting season proceeds. However, no quantitative method has been developed to unravel this relationship and investigate time-course changes in plasma VTG concentrations in D. coriacea. An additional major bottleneck is that there is no available protein sequence in this species, whose genome has yet to be sequenced.

Immunoassays have already been used to quantify VTG in some turtle species (mainly terrestrial).6,8−11 As evolutionary divergence would probably have caused poor interspecific cross reactivity of antibodies developed in these previous studies, we decided to base our analytical strategy on the use of current mass spectrometry-based proteomics approaches.

During the past decade, mass spectrometry has been established as the definitive tool for the study of primary structure in proteins.12 Typically, after protein extraction from a given biological source, proteins are separated by SDS−PAGE. Gel slices are excised from gel lanes and in-gel digestion is usually performed with trypsin. Numerous other enzymes generating different and often overlapping peptides are also available for this step, and the use of multiple enzymes often increases sequence coverage.13 After digestion, the peptide mixture is analyzed by liquid chromatography (LC) coupled with tandem mass spectrometry (MS/MS). MS/MS data are then searched against protein databases using database-searching algorithms. Such programs identify peptides on a non error tolerant basis, that is, when their experimentally measured mass corresponds to the mass of a theoretical peptide sequence generated in silico from existing databases. However, as mutations are the very base of species diversity, tolerating differences between experimental and theoretical peptide masses/sequences drastically improves sequence characterization of proteins from still nonsequenced species. Error tolerant searches should therefore be performed using tools like the Mascot program.14 Moreover, de novo sequencing is another particularly well-adapted strategy to characterize proteins from nonsequenced species.15 It consists of determining amino acid sequence tags directly from experimental MS/MS spectra without comparing to existing data. These sequence tags can then be used to identify homologous peptides by searching databases using sequence similarity search algorithms (BLAST type).

Quantitative targeted proteomics has recently taken front stage in the proteomics community. Within the context of hypothesis-driven proteomics, selected reaction monitoring (SRM) has emerged as a powerful MS technology for determining the absolute or relative abundance of a given protein.16,17 Very good correlation has been observed between SRM and different, well-established quantification methods (ELISA for instance).18 The specificity (multiple “epitope” peptides quantified per target protein), the sensitivity (∼hundreds of amols within a 4-log dynamic range) of the SRM approach and the possibility of simultaneous multiplexed analyte measurement all compare favorably to protein quantification by ELISA. Several proteotypic peptides, that is, specific to only one protein and suitable for MS analysis, are usually chosen for each target protein. Their chromatographic separation is optimized and parent/fragment ion couples (transitions) are measured on a QQQ mass spectrometer. Absolute quantitation is obtained by a combination of SRM with isotope dilution, isotopically labeled synthetic peptides being added to the samples as internal standard peptides prior to LC-SRM-MS analysis.

Our objectives were therefore to characterize the sequence of VTG in the leatherback sea turtle D. coriacea using LC-MS/MS, to set up a VTG-targeted SRM assay, and to use this SRM quantitative assay to establish the kinetics of circulating VTG abundance throughout the nesting season.

© 2013 American Chemical Society
Received: May 10, 2013 Published: July 10, 2013
dx.doi.org/10.1021/pr400444m | J. Proteome Res. 2013, 12, 4122−4135



MATERIALS AND METHODS

Chemicals

Modified porcine trypsin was obtained from Promega (Madison, WI), and chymotrypsin and Asp-N from Roche Diagnostics (Mannheim, Germany). High-quality isotopically labeled standard peptides (HeavyPeptide AQUA Ultimate; 5 pmol/μL, ± 5%) for LC-SRM assays were synthesized by Thermo Fisher Scientific (Bremen, Germany). C18 Sep-Pak cartridges (Sep-Pak Vac 1 cc, 50 mg, tC18) were obtained from Waters (Milford, MA). Methanol and phosphoric acid were purchased from Fisher Scientific (Loughborough, United Kingdom), and all other reagents and chemicals were purchased from Sigma Aldrich (St. Louis, MO). All buffers were prepared with Milli-Q water.

Analysis of Available VTG Sequences

All the VTG sequences in the NCBI protein database (www.ncbi.nlm.nih.gov) available in January 2011 were analyzed to ensure the best conditions for comparison of our mass spectrometry results for a still nonsequenced VTG with what is already known in any other species. The term “vitellogenin” was first used as a keyword to identify any entries in the NCBI protein database that corresponded to turtle vitellogenin (or vitellogenin fragment) sequences (hereafter referred to as “turtle VTGs”). To identify the protein sequences that were the most similar to those of turtle VTGs in the NCBInr database, we then used the NCBI BLAST interface (http://www.ncbi.nlm.nih.gov/BLAST/). The sequences of the first 99 BLAST hits for each turtle VTG were then downloaded and concatenated in a single file. Redundant entries were removed before multiple sequence alignment and construction of a phylogenetic tree using CLUSTAL W (v.2.0),19 then a phylogram was edited using TreeView software (v.1.6.6).20 A sequence identity matrix showing the proportion of identical residues between all sequences in the alignment was generated using BioEdit software (v.7.0.5.2).21 We then repeated the multiple alignment, tree construction, and identity matrix calculation by the same method, using only a subset of VTG sequences that shared a cluster with turtle VTGs. When comparing entries that corresponded to full and fragment sequences, sequence identities were calculated only from subsections of alignments that corresponded to the shorter sequence in order to avoid calculating erroneously low identity scores.

Animals and Sample Collection

This study benefitted from a long-term field protocol conducted at Awala-Yalimapo (5.7°N, 53.9°W), French Guiana, South America, the world’s major nesting site for leatherback turtles (D. coriacea).22 In French Guiana, where leatherback females breed every 2 or 3 years, individuals each lay an average of ∼8 clutches per season, typically at 10 day intervals, returning to sea between two successive laying events.23 This makes it possible to monitor each nesting event throughout the reproductive season of each individual turtle, provided that appropriate field protocols are implemented. As part of the MIRETTE project (http://projetmirette.fr), daily nocturnal beach patrols (from 18:00 to 07:00) are performed throughout the nesting season (from March to July) to individually identify all female leatherbacks coming ashore via the internal passive integrated transponder tags (PIT, Trovan Euroid) in their shoulder.22 Some of these


Figure 1. Experimental workflow. To sequence the leatherback turtle (Dermochelys coriacea) vitellogenin (VTG), mass-spectrometry-based experiments were conducted on protein extracts from the egg yolk. Proteotypic peptides were specified from the identified sequence tags. For each proteotypic peptide, three transitions were chosen to establish a quantitative method based on selected reaction monitoring (SRM). SRM was developed to enable high-throughput assays from plasma samples and determine time-course changes in VTG concentrations throughout the nesting season.

Departmental Direction of the Veterinary Services (Strasbourg, France) and Police Prefecture of Bas-Rhin.

females were selected for exhaustive individual monitoring that involved the performance of measurements and blood sampling at each nesting event (for details, see ref 24). In short, this individual monitoring consisted of (i) recording identity, date of oviposition and body morphometrics, (ii) counting yolked eggs and sampling three yolked eggs from each clutch, (iii) sampling blood during oviposition, and (iv) weighing females after oviposition when they began returning to the sea. The eggs were individually placed in a numbered hermetic plastic bag and stored in a refrigerated cool box until the patrol was finished before being frozen (−20 °C). Blood samples (6 mL) were collected from the femoral rete system, using a syringe. The blood was immediately transferred into heparinised polypropylene microtubes and placed in a refrigerated cool box until the patrol was finished. Blood samples were centrifuged no more than 4 h after collection to separate plasma and blood cells. Samples were then frozen (−20 °C) until they could be analyzed at the IPHC, Strasbourg, France. This sequence of manipulations ensured minimal disturbance of the animals, as confirmed by direct observations of all manipulated turtles completing their oviposition and returning to lay subsequent clutches. Thanks to our exhaustive night patrols, we defined the level of reproductive effort (LRE), adapted from Hamann et al.,25 as an index of the seasonal progression for each individual turtle. For a given clutch (x) laid by a given turtle, LRE was calculated as: LRE at clutch (x) = 100(x/total number of clutches laid by this turtle). LRE provides a better proxy of the relative reproductive effort through time compared to clutch rank (see ref 24 for details). In order to test time-course changes in plasma VTG throughout the nesting season, this study used data and samples from two individual leatherback turtles that were manipulated for almost all successive clutches over a long nesting season. 
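The LRE index defined above is a simple percentage, and a short Python sketch may make the definition concrete. The function name and the example clutch counts are illustrative assumptions, not part of the original protocol.

```python
def level_of_reproductive_effort(clutch_rank: int, total_clutches: int) -> float:
    """Return LRE (%) for clutch number `clutch_rank` out of `total_clutches`.

    Implements LRE at clutch (x) = 100 * (x / total number of clutches
    laid by this turtle), as stated in the text.
    """
    if total_clutches < 1:
        raise ValueError("total_clutches must be at least 1")
    if not 1 <= clutch_rank <= total_clutches:
        raise ValueError("clutch_rank must lie between 1 and total_clutches")
    return 100.0 * clutch_rank / total_clutches


if __name__ == "__main__":
    # Hypothetical female that laid 8 clutches during the season.
    for rank in range(1, 9):
        print(rank, level_of_reproductive_effort(rank, 8))
```

With these definitions, the fourth clutch corresponds to an LRE of 50% for a female laying 8 clutches but 80% for a female laying 5, which illustrates why LRE and clutch rank are not interchangeable.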
The field protocols respected the legal requirements of the country in which the work was carried out and followed all institutional guidelines. This study was carried out under CNRS-IPHC institutional license (B67 482 18) and under individual license to JYG (67-220), both of which were delivered by the

Experimental Workflow

The experimental workflow considered mass spectrometry-based experiments at both sequence determination and quantitative assay levels (Figure 1). NanoLC-MS/MS analysis was performed on egg yolk samples after protein extraction and digestion to precisely characterize VTG sequence tags. Then proteotypic peptides were selected and parent/fragment ion couples were determined to develop a high-throughput quantitative assay from plasma samples via selected reaction monitoring (SRM).

Mass Spectrometry-Based Experiments for Turtle VTG Sequence Determination

Egg Yolk Protein Extraction. The egg was thawed at room temperature prior to yolk collection. A 1 mL aliquot of yolk was delipidated using 20 mL of 2:1 chloroform:methanol (v/v). After 2 h at room temperature, 4.2 mL of KCl (0.37 M) were added to the sample, which was then maintained at 4 °C overnight. After centrifugation (10 min, 2000g, 4 °C), the protein phase was collected and proteins were precipitated using acetone, then dried and resuspended in lysis buffer (10 mM Tris pH 8.8, 1 mM EDTA, 3% SDS). After sonication, the protein concentration was determined using a DC protein assay kit purchased from Bio-Rad (Hercules, CA). Egg yolk proteins were then subjected to SDS-PAGE analysis.

1-D SDS-PAGE. SDS-PAGE was carried out using 8−12% SDS-gradient polyacrylamide gels in a Protean II cell (Bio-Rad, Hercules, CA). After boiling for 5 min in a denaturing buffer (10 mM Tris pH 8, 1 mM EDTA, 5% SDS, 5% β-mercaptoethanol, 10% glycerol, and 0.1% bromophenol blue), 11 μg of egg yolk proteins were loaded per lane and electrophoresed by increasing current steps (1 h at 10 mA followed by 18.5 h at 16 mA). The gels were stained by a colloidal blue method (G250, Fluka, Buchs, Switzerland). Gel images were scanned with a GS800 calibrated densitometer (400 dpi; Bio-Rad).


MS data were interpreted following three different search strategies. First, we used a local Mascot server (version 2.2., Matrix Science, London, UK). Spectra were searched with a mass tolerance of 30 ppm in MS and of 0.3 Da in MS/MS mode, allowing a maximum number of missed cleavages: 1 for trypsin, 3 for AspN and 4 for chymotrypsin. Carbamidomethylation of cysteine and oxidation of methionine residues were specified as variable modifications. Following the guidelines for proteomic data publication,26,27 and to avoid considering poor quality data, filtering criteria based on probability-based scoring of the identified peptides were taken into account for high confidence identifications and a decoy database was generated and searched to determine discovery rates in protein identification.28 The Scaffold software (v.3.00.07, Proteome software Inc., Portland, OR) was used to validate identifications using the following criteria: spectra were searched against a target-decoy version of the NCBInr database created using an in-house developed tool (https://msda.unistra.fr/). This database was restricted to Chordata (Taxonomy ID: 7711; sequences downloaded in January 2011; 2760408 target+decoy entries). Protein identifications after trypsin and AspN digestion were validated when MS/MS ion scores were greater than Mascot’s identity scores (95% confidence level). Protein identifications after chymotrypsin digestion were validated when MS/MS ion scores were greater than 0 and MS/MS ion scores minus Mascot’s identity scores (95% confidence level) were greater than −5. It is of note that common contaminants (such as keratins and trypsin) were removed from the identification tables. 
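The search tolerance and the enzyme-specific validation rules described above can be summarized in a few lines of Python. This is only a sketch of the stated thresholds: the function names are hypothetical, and the actual filtering was carried out in Mascot and Scaffold rather than with custom code.

```python
def ppm_error(measured_mz: float, theoretical_mz: float) -> float:
    """Relative mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def within_ms_tolerance(measured_mz: float, theoretical_mz: float,
                        tol_ppm: float = 30.0) -> bool:
    """True if the precursor mass error is within the 30 ppm MS tolerance."""
    return abs(ppm_error(measured_mz, theoretical_mz)) <= tol_ppm


def is_validated(enzyme: str, ion_score: float, identity_score: float) -> bool:
    """Apply the validation rule stated in the text for each digestion enzyme.

    identity_score is Mascot's identity score at the 95% confidence level.
    """
    key = enzyme.lower().replace("-", "")
    if key in ("trypsin", "aspn"):
        # Ion score must exceed the identity score.
        return ion_score > identity_score
    if key == "chymotrypsin":
        # Relaxed rule: positive ion score, and at most 5 units below identity.
        return ion_score > 0 and (ion_score - identity_score) > -5
    raise ValueError(f"no validation rule defined for enzyme {enzyme!r}")
```

The relaxed chymotrypsin rule accepts, for example, an ion score of 37 against an identity score of 40, whereas the same pair would be rejected for a tryptic peptide.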
On the basis of the list of VTG homologues obtained previously through the phylogenetic and sequence analysis (see above) all vitellogenin-related sequences were manually selected to be included in an additional Mascot error tolerant search, where search parameters were as follows: no enzyme and mass tolerance of 30 ppm in MS and of 0.3 Da in MS/MS mode. This was done to identify possible sequence variants which may originate from evolutionary changes. The peptides identified with any variable modification but carbamidomethylation of cysteine and oxidation of methionine residues were not considered. In addition to non error tolerant and error tolerant Mascot searches, de novo sequencing was performed. PEAKS Studio software (v.4.5, Bioinformatics Solutions, Waterloo, Canada) was used to determine complete or partial amino acid sequences from the MS/MS spectra. The deduced sequence tags with at least six amino acids were then submitted to the MS-BLAST (Basic Local Alignment Search Tool) program provided on the EMBL site (dove.embl-heidelberg.de/Blast2/msblast.html) in order to identify proteins by homology with the proteins present in a database downloaded from the NCBInr, and restricted to all entries retrieved when using the term vitellogenin as a keyword (sequences downloaded in January 2011; 1176 entries). For the resulting peptides satisfying the MS-BLAST procedure for statistical evaluation of MS-BLAST hits (dove.embl-heidelberg. de/Blast2/MS_BLAST_TIPS.html), all corresponding spectra were manually inspected prior to validation. In some cases (8%), discrepancies between some peptide sequence tags occurred. This was the case either when distinct identified sequence tags were assigned to the same region of the same entry or when alignments revealed differences between some sequence tags that were assigned to different entries. 
To discriminate between these tags, an additional Mascot search was performed against a newly generated protein database, which contained several copies of the sequence of the involved entry,

The relative molecular mass of protein bands was determined by comparison to 6.5−205 kDa molecular mass markers (Sigma Aldrich, St. Louis, MO). For a protein band determined at a given molecular mass, five slices (in five different lanes) were excised manually for subsequent in-gel digestions and mass spectrometry-based analyses. In-Gel Digestions. After in-gel reduction and alkylation using the MassPrep Station (Waters, Milford, MA), the excised protein bands were treated using one of five different digestion procedures: (1) trypsin overnight, (2) chymotrypsin for 6 h, (3) chymotrypsin overnight, (4) chymotrypsin for 4 h followed by trypsin for 3h, and (5) AspN overnight. All digestions were performed at 37 °C and the resulting peptides were extracted (60% acetonitrile, 0.1% formic acid) prior to mass spectrometry analyses. NanoLC-MS/MS Analyses. Digestion peptides were analyzed on a nanoACQUITY UltraPerformance LC (UPLC) system coupled to a Synapt HDMS G1 Q-Tof (Waters, Milford, MA) equipped with a Z-spray ion source and a lock mass system. The autosampler was maintained at a temperature of 4 °C for the duration of the analysis. The peptides were trapped on a nanoAcquity UPLC precolumn (Symmetry C18 Trap, 5 μm, 180 μm × 20 mm, Waters), then separated on a nanoAcquity UPLC column (BEH C18, 1.7 μm, 75 μm × 150 mm, Waters). The solvent system consisted of 0.1% formic acid in water (solvent A) and 0.1% formic acid in acetonitrile (solvent B). Trapping was performed for 3 min at a flow rate of 5 μL.min−1, with 99% of solvent A and 1% of solvent B. Elution was performed at a flow rate of 0.4 μL·min−1, using a linear gradient of 1−50% B over 35 min at 45 °C, followed by 50% B over 1 min and then 90% B over 5 min. For optimal UPLC-HDMS analysis, the source temperature was set at 80 °C, the desolvation gas temperature at 190 °C and desolvation gas flow at 500 L·h−1. 
The system was operated in positive mode using the following settings: the capillary voltage was set at 3.5 kV, the sample cone voltage at 35 V, and the extraction cone voltage at 4.0 V. The TOF was calibrated using Glu-fibrino-peptide B on the 50−2000 m/z range. Online correction of this calibration was performed with Glu-fibrinopeptide B as the lock-mass: the ion (M+2H)2+ at m/z 785.8426 was used to calibrate MS data and the ion (M+H)+ at m/z 684.3469 was used to calibrate MS/MS data. This was done via a lock spray interface, with a lock spray frequency set at 30 s. For tandem MS experiments, automatic switching between MS and MS/MS modes was set as follows: the three most abundant peptides (intensity threshold: 40 count·sec−1), preferably doubly and triply charged ions, were selected on each MS spectrum for further isolation. Collision-induced dissociation (CID) fragmentation using argon as collision gas was performed using 2 different energies (from m/z 300 to 500: 14 and 18 eV; from m/z 501 to 600: 19 and 24 eV; from m/z 601 to 700: 24 and 28 eV; from m/z 701 to 800: 28 and 32 eV; from m/z 801 to 900: 32 and 39 eV; from m/z 901 to 1000: 39 and 45 eV; from m/z 1001 to 1200: 45 and 55 eV; from m/z 1201 to 1700: 55 and 60 eV), which were set using collision energy profile. For MS and MS/ MS acquisitions, the scan range was respectively from m/z 250 to 1500 and m/z 50 to 2000, with a scan time of 0.5 s. in MS and 0.7 s. in MS/MS mode. The system was fully controlled by the MassLynx software (v.4.1., Waters). MS/MS Data Interpretation. Data from nanoLC-MS/MS analyses were processed using the ProteinLynx Global Server software (v.2.3, Waters) for background subtraction (normal type both for MS and MS/MS with 5% threshold and polynomial correction of order 5) and deisotoping. For identification, MS/ 4125


Linearity Study. Increasing amounts of peptides were spiked from the concentration-balanced mixture of IS peptides in a randomly chosen plasma sample. MicroLC-SRM analysis was carried out (see below for details on sample preparation and analysis parameters). Calibration curves were then established by plotting the mean chromatographic peak area (using the sum of peak areas of the three selected transitions) from triplicate injections against the injected amount of IS peptides (11.3 fmol to 0.4 pmol for QELTLVEVK; 22.6 fmol to 0.9 pmol for VASPDTLESVFK; 9.7 fmol to 0.4 pmol for IQLEIQAGSR; 143.6 fmol to 1.7 pmol for ISSEFVTGR; 3.2 fmol to 1.2 pmol for MTPVLLPEAVPDIMK). Linearity criteria required each point on the standard curves to show an average coefficient of variation (CV) below 20%, both among triplicate injections and between response factors (AUC normalized by concentration) for all transitions. Each point also had to fall within an average accuracy range of 80−120% when the expected injected amount was back-calculated from the regression equation. Limit of detection (LOD) was defined as the lowest point with transition signal-to-noise ratios over 4, and limit of quantitation (LOQ) as the lowest point satisfying all the above-reported criteria. Only the points satisfying all these criteria were used to calculate the linear regression equations and correlation coefficients.

Plasma Sample Preparation. The concentration-balanced mixture of IS peptides was spiked into crude plasma samples. Analyses were performed using 1.7 μL of plasma diluted in 200 μL of an 8 M Urea/0.1 M NH4HCO3 solution. Samples were reduced using dithiothreitol (37 °C, 30 min, 12 mM) and alkylation was carried out with iodoacetamide (25 °C, 60 min, 40 mM). They were then diluted to 1 M urea using a 0.1 M NH4HCO3 solution and digestion was performed overnight at 37 °C using a trypsin:protein ratio of 1:100. After acidification with HCOOH, desalting and further concentration was carried out using C18 Sep-Pak cartridges.
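The acceptance criteria of the linearity study (CV below 20% among triplicate injections, back-calculated accuracy within 80−120%) lend themselves to a small worked example. The Python sketch below is a simplified illustration under stated assumptions: it uses an ordinary least-squares fit, checks only the injection CV and accuracy for one summed-transition signal, and omits the response-factor and signal-to-noise criteria. The function names are hypothetical.

```python
from statistics import mean, stdev


def cv_percent(values):
    """Coefficient of variation (%) of replicate measurements."""
    return 100.0 * stdev(values) / mean(values)


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def point_passes(amount, areas, slope, intercept,
                 max_cv=20.0, accuracy_range=(80.0, 120.0)):
    """Check one calibration point against the CV and accuracy criteria.

    amount: injected amount of IS peptide (e.g., in fmol)
    areas:  peak areas from the triplicate injections
    """
    back_calculated = (mean(areas) - intercept) / slope
    accuracy = 100.0 * back_calculated / amount
    low, high = accuracy_range
    return cv_percent(areas) < max_cv and low <= accuracy <= high
```

In practice the regression would be refitted after discarding failing points, since the text states that only points satisfying all criteria entered the final regression equations.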
MicroLC-SRM Analyses. Digestion peptides were analyzed on an Agilent 1100 HPLC system coupled to a G6410B Triple Quadrupole. A 0.05 μL equivalent of plasma sample was injected, within which the injected amount of IS peptides was as follows: 33.9 fmol for QELTLVEVK, 67.7 fmol for VASPDTLESVFK, 29.1 fmol for IQLEIQAGSR, 130.7 fmol for ISSEFVTGR and 19.3 fmol for MTPVLLPEAVPDIMK. The peptides were trapped on a precolumn (Zorbax C18 stable bond, 5 μm, 1.0 × 17 mm, Agilent Technologies) then separated on a C18 column (Zorbax 300 SB C18, 3.5 μm, 150 × 0.3 mm, Agilent Technologies). Two solvents were used: 0.1% formic acid/2% acetonitrile in water (solvent A) and 0.1% formic acid/2% water in acetonitrile (solvent B). Trapping was performed for 5 min at a flow rate of 50 μL·min−1 with 0.1% formic acid in water. Elution was performed at a flow rate of 5 μL.min−1 using the following optimized gradient: after 5 min at 8% B: from 8% to 11% B in 4 min.; 4 min at 11%B; from 11% to 17% B in 5 min.; 1 min at 17% B; from 17% to 40% B in 25 min.; from 40% to 70% B in 1 min.; then 2 min at 70% B. For optimal microLC-SRM, the nitrogen drying gas temperature was set at 300 °C and gas flow at 360 L.h−1, and the nebulizer pressure was maintained at 15 psi. The system was operated in positive mode and the transmission capillary voltage was set at −4000 V. Four time segments were defined (0−5; 5−15; 15−20; 20−26.5 min). Unit 0.7/Wide 1.2 resolutions were used in Q1/Q3, respectively, and a dwell time of 160 ms was used for each transition. Detector amplification voltage was set at 400 V. The system was fully controlled by the MassHunter Workstation Data Acquisition software (v.B.03.01, Agilent Technologies).
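The elution gradient described above is a sequence of isocratic holds and linear ramps, which can be written as a piecewise-linear function of time. The Python sketch below encodes the steps as listed in the text; it assumes that time zero is the start of the initial 5 min hold at 8% B and that each ramp is strictly linear, neither of which is stated explicitly in the source.

```python
# (duration in min, %B at start, %B at end), in the order given in the text.
GRADIENT_STEPS = [
    (5, 8, 8),     # 5 min at 8% B
    (4, 8, 11),    # 8% to 11% B in 4 min
    (4, 11, 11),   # 4 min at 11% B
    (5, 11, 17),   # 11% to 17% B in 5 min
    (1, 17, 17),   # 1 min at 17% B
    (25, 17, 40),  # 17% to 40% B in 25 min
    (1, 40, 70),   # 40% to 70% B in 1 min
    (2, 70, 70),   # 2 min at 70% B
]


def total_gradient_time():
    """Total programmed gradient duration in minutes."""
    return sum(duration for duration, _, _ in GRADIENT_STEPS)


def percent_b(t_min):
    """Return the programmed %B at time t_min (minutes) by linear interpolation."""
    if t_min < 0:
        raise ValueError("time must be non-negative")
    start = 0.0
    for duration, b_start, b_end in GRADIENT_STEPS:
        end = start + duration
        if t_min <= end:
            return b_start + (b_end - b_start) * (t_min - start) / duration
        start = end
    raise ValueError("time is beyond the end of the gradient program")
```

Under these assumptions the program lasts 47 min in total, and percent_b(7) falls halfway up the first ramp at 9.5% B.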

that is, one in its native form and others in which we substituted the considered amino acids by the distinct identified sequence tags. We also added to the protein database all the sequence entries (native and substituted) to which at least one of the conflicting tags had previously been assigned. This allowed sequences to be corrected. To determine if these corrected sequences were more similar to a VTG1 or VTG2 type, they were then searched against the vitellogenin database previously used (see above) using MS-BLAST.

MicroLC-SRM Quantitation of Plasma Dc-VTG1 and Dc-VTG2

Target Peptide Selection. From the peptides identified in the first part of this study, only those that were identified through Mascot searches were retained for the microLC-SRM quantitation of Dc-VTG1 and Dc-VTG2. In addition, we only retained tryptic peptides that exhibited the most intense MS signals. As there was no available protein sequence database for D. coriacea, proteotypicity of these selected peptides was approximated by submitting their sequence to the NCBI BLAST interface. Searches were performed against a database restricted to chordata (Taxonomy ID: 7711; sequences downloaded in November 2012; 2 291 700 entries). The peptides that were finally retained for SRM were those for which the obtained blast hits corresponded, with strictly identical residues to the entire submitted sequence, either only to avian VTG1 homologues (for Dc-VTG1) or only to avian VTG2 and/or turtle VTG homologues (for Dc-VTG2). Hence, the SRM quantification of Dc-VTG1 and Dc-VTG2 was performed using 1 and 4 specific peptides (reported on sequence alignments in Supporting Information (SI) Figures 3−4), respectively. As the specific surrogate peptide retained for Dc-VTG1 bears methionines, we first verified whether oxidized forms were detectable in all plasma samples. After plasma protein extraction, digestion peptides were analyzed by microLC-SRM-MS (data not shown; see below for details on analysis parameters). Transition signal-to-noise ratios were always lower than 4 for oxidized forms while always higher than 100 for the nonmodified form. Consequently, endogenous oxidized peptides were not taken into account for quantitation. Transition Selection and Concentration Adjustments of the Internal Standard to Endogenous Levels. Only precursors and fragments with at least 5 amino acids were retained to ensure the specificity of peptide response. 
In order to select the three best transitions and optimize the instrument parameters (interface transmission voltages and collision energies), internal standard (IS) peptides were infused individually on an Agilent G6410B Triple Quad (Agilent Technologies, Palo Alto). Then, 2−200 pmol of peptides from an equimolar mix of IS peptides were spiked in a randomly chosen plasma sample, and heavy/light signal intensity ratios were calculated for each peptide using the sum of peak areas of the three selected transitions (see below for details on sample preparation and analysis parameters). This allowed us to determine retention times, verify coelution of heavy and light peptides, and estimate the concentration range of endogenous peptide targets. To adjust the concentration of IS peptides to that of their endogenous counterparts, a concentration-balanced mixture of heavy labeled standard peptides was prepared in an 8 M Urea/0.1 M NH4HCO3 solution and used to spike plasma samples for studying linear dynamic ranges, thus providing more accurate quantitation of endogenous peptide targets based on the peak area ratios between endogenous and IS peptides. 4126


MicroLC-SRM Data Treatment and Quality Controls. The Skyline open-source software package29 was used to integrate transition peak areas. To ensure the overall reproducibility of the experiment, light/heavy ratios were calculated from individual transition areas, and we verified that coefficients of variation were always less than 20% between triplicate preparations of spiked plasmas. The latter criterion was satisfied for all measurements, except for those concerning one transition of the QELTLVEVK peptide in two plasmas and one transition of the MTPVLLPEAVPDIMK peptide in one plasma, for which CVs were all between 21 and 23%. LC-SRM data for these three transitions were therefore removed from the experiment. Light/heavy ratios were then averaged for all the transitions of a given peptide in a given plasma and we controlled again for low values of coefficients of variation (CV < 20%). At this stage, all values met this criterion. These mean light/heavy ratios were finally multiplied by the known amount of injected IS peptides to calculate injected amounts of target (light) peptides. We were then able to determine initial concentrations of these endogenous peptides in plasma samples. Quality control of the experiment was performed by examining the stability of the SRM-MS signal over time. A randomly chosen sample spiked with a fixed amount of IS peptides was repeatedly injected (four times) over the course of the experiment. For each transition (heavy and light), coefficients of variation were calculated for areas obtained during the four repeated injections, and we set the acceptance level for CVs below 20%. Finally, we also checked that absolute areas obtained for heavy transitions in all spiked samples were similar (CVs < 20%) to those recorded in the spiked plasma used for linearity study, thus attesting that no global sensitivity loss occurred and that the stability of the system had been maintained over the course of the experiment.
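The isotope-dilution calculation described above (mean light/heavy ratio multiplied by the injected IS amount, subject to a CV check) can be sketched in Python as follows. The function name is hypothetical, and the conversion to a plasma concentration by dividing by the injected plasma equivalent (0.05 μL) is an assumption about how the "initial concentrations" were derived; the sketch also ignores any sample losses during preparation.

```python
from statistics import mean, stdev


def endogenous_concentration_fmol_per_ul(light_heavy_ratios, is_amount_fmol,
                                         plasma_equivalent_ul=0.05,
                                         max_cv=20.0):
    """Estimate the plasma concentration of an endogenous (light) peptide.

    light_heavy_ratios:   light/heavy area ratios, one per retained transition
    is_amount_fmol:       injected amount of the heavy IS peptide (fmol)
    plasma_equivalent_ul: plasma volume equivalent injected on column (uL)

    Returns fmol per uL of plasma, which is numerically equal to nmol/L.
    """
    ratio = mean(light_heavy_ratios)
    cv = 100.0 * stdev(light_heavy_ratios) / ratio
    if cv >= max_cv:
        raise ValueError(f"CV of {cv:.1f}% does not meet the {max_cv}% criterion")
    injected_light_fmol = ratio * is_amount_fmol
    return injected_light_fmol / plasma_equivalent_ul
```

For instance, three transitions with ratios of 2.0, 2.2, and 1.8 and 33.9 fmol of injected IS peptide give about 67.8 fmol of endogenous peptide on column, or roughly 1356 fmol per μL of plasma under the assumptions above.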



entries considered were sequence fragments, and were therefore much shorter than the sequences of other entries. As such differences in the length of entry sequences drastically reduce identity scores, sequence identities were calculated only from subsections of alignments corresponding to the shortest sequence. Intra- and interspecies comparison revealed that sequence identity was ∼80−100% between VTG entries corresponding to the same entry type, while it was usually less than 40% between VTG entries corresponding to different entry types. Interspecies comparison also revealed that the three known sequences (or fragment) of turtle VTGs were more similar to those from the avian VTG2 (∼55−70% identity) than avian VTG1 (∼35−40% identity) and amphibian VTG (∼50%) types.

Mass Spectrometry-Based Sequence Determination and Multiple Alignments Analysis

A dozen bands were distinguished from 1-D SDS-PAGE separation of egg yolk proteins (SI Figure 2), of which the eight most intense, ranging between ∼30 and 115 kDa, were excised for further analysis. From nanoLC-MS/MS analyses, a total of 332 unique peptides were identified (see complete information in Table 1 and SI Table 2). A false discovery rate of identification