Technical Note pubs.acs.org/jpr

A Simple Protocol To Routinely Assess the Uniformity of Proteomics Analyses

Sebastien Gallien,†,§ Adele Bourmaud,†,‡,§ and Bruno Domon*,†,‡

† Luxembourg Clinical Proteomics Center (LCP), CRP-Santé, L-1445 Strassen, Luxembourg
‡ Doctoral School in Systems and Molecular Biomedicine, University of Luxembourg, L-1511 Luxembourg, Luxembourg



Supporting Information

ABSTRACT: Mass-spectrometry-based proteomic approaches are increasingly applied to biological and clinical studies. Initially used by specialized laboratories, the technology has matured and gained acceptance by the community, using various analytical processes and platforms. To facilitate data comparison and integration across laboratories, there is a need to harmonize analytical processes to ensure the generation of reliable proteomic data sets. This is especially critical in the context of large initiatives, such as the Human Proteome Project promoted by the Human Proteome Organization (HUPO). Quality control is a first step toward the harmonization of proteomics data sets. We have developed a procedure to routinely assess the uniformity of proteomics analyses. It relies on a simple protocol based on three proteins and two sets of isotopically labeled peptides, one being added prior to tryptic digestion and the second one prior to liquid chromatography−mass spectrometry (LC−MS) analysis. The proposed method evaluates in a single step both the sample preparation, by measuring the relative amounts of endogenous peptides and their isotopically labeled counterparts, and the LC−MS platform performance, by monitoring the main LC−MS attributes for reference peptides. The procedure is simple and easy to implement in the routine workflows typically employed by the proteomics community.

KEYWORDS: quality control, LC−MS/MS, SRM, PRM, SIM, HR/AM, sample preparation



INTRODUCTION

Numerous proteomics studies are performed worldwide using various LC−MS platforms and a variety of laboratory-specific protocols. The sample preparation method, the operating conditions of the liquid chromatograph and the mass spectrometer, and the data analysis procedures are diverse, making data sharing across laboratories and data integration very challenging. A previous report from the HUPO Test Sample Working Group indicated the limited uniformity of proteomics analyses replicated across various laboratories.1 The proteomics community has recognized the necessity to harmonize the analysis protocols and, more specifically, the data collection, in order to fulfill preset standards and thus ensure reproducibility of results. While major progress was achieved regarding the standardization of data formats through the implementation of HUPO-PSI standards2 and the guidelines of proteomics journals,3,4 the harmonization of actual analytical processes, including the assessment of platform performance, remains a major challenge. A first step toward this objective is the definition of quality control (QC) procedures and acceptance metrics that can be routinely applied. In order to be broadly endorsed, such a procedure has to be simple and relatively fast to carry out. This calls for a simple, generic protocol that can assess the uniformity of an analytical process and then use the data generated to define a baseline for the qualification of future results, using rigorous quality control metrics.

This topic has been extensively discussed by the community, and the report of a HUPO workshop in 2009 (ref 5) stressed the need for a test to (self-)assess the performance of a platform prior to performing large proteomics experiments and, more specifically, to address quantitative analyses. Several studies were conducted over the past years using targeted proteomics methods (for instance LC−SRM) to determine the intra- and interlaboratory reproducibility of quantitative measurements.6−9 They showed that inconsistencies in the sample preparation7,8 or irregular performance of LC−MS platforms6 often resulted in data sets leading to erroneous quantification results. These studies are valuable resources for the community to benchmark the proteomics workflow and to define a baseline. However, they rely on fairly complex protocols, for both the sample preparation and the data acquisition, which need to be iteratively optimized, and on a large number of analyses, including the measurement of dilution series of reference peptides. Although very useful, their routine implementation (on a day-to-day or even a week-to-week basis) might be demanding for laboratories not performing quantitative measurements. Thus, there is a need for a simple method to assess the uniformity of proteomics experiments that can be implemented and used to systematically detect “drift” early on.

Received: November 27, 2013
Published: March 11, 2014
© 2014 American Chemical Society

dx.doi.org/10.1021/pr4011712 | J. Proteome Res. 2014, 13, 2688−2695

Journal of Proteome Research


Table 1. List of Reference Proteins/Peptides

a K = ¹³C₆H₁₄¹⁵N₂O₂; A = ¹³C₃H₇¹⁵NO₂; G = ¹³C₂H₅¹⁵NO₂; P = ¹³C₅H₉¹⁵NO₂; R = ¹³C₆H₁₄¹⁵N₄O₂.
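The footnote above gives the isotopic composition of each labeled residue, from which the expected mass offset between a labeled peptide and its endogenous form can be worked out. The following is a minimal sketch, assuming one fully labeled residue per peptide and standard isotope masses; the function names are illustrative and do not come from the original work.

```python
# Mass differences between heavy and light isotopes, in Da (standard values).
DELTA_13C = 13.0033548 - 12.0000000   # 13C minus 12C
DELTA_15N = 15.0001089 - 14.0030740   # 15N minus 14N

# (number of 13C atoms, number of 15N atoms) per labeled residue,
# read directly from the Table 1 footnote.
LABELED_RESIDUES = {
    "K": (6, 2),
    "A": (3, 1),
    "G": (2, 1),
    "P": (5, 1),
    "R": (6, 4),
}


def label_mass_shift(residue: str) -> float:
    """Mass shift (Da) introduced by one fully labeled residue."""
    n_carbon, n_nitrogen = LABELED_RESIDUES[residue]
    return n_carbon * DELTA_13C + n_nitrogen * DELTA_15N


def mz_shift(residue: str, charge: int) -> float:
    """Shift in m/z for a precursor ion of the given charge state."""
    return label_mass_shift(residue) / charge


if __name__ == "__main__":
    for residue in LABELED_RESIDUES:
        print(residue, round(label_mass_shift(residue), 4))
```

For a doubly charged precursor, the m/z offset is half the mass shift, which is what separates the members of a signal triplet in a SIM spectrum.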

Here, a simple procedure was designed to perform routine quality controls; it can be readily implemented in individual laboratories and used by any operator to systematically assess the uniformity of an analytical process. Based on a simple sample preparation method, it allows the assessment of the instrument performance and in turn the quality/reproducibility of the data generated over time. The protocol, using a simple set of proteins mixed in well-defined amounts, encompasses a tryptic digestion step, followed by the addition of calibrated amounts of isotopically labeled reference peptides at two distinct stages of the sample preparation procedure. The proposed protein mixture can be used either alone or after addition to a biological matrix. A single LC−MS (or LC−MS/MS) analysis of the sample is performed to determine the signal intensity of the peptides generated by tryptic digestion, which are then compared to the corresponding isotopically labeled counterparts added to the sample both prior to its digestion and prior to its LC−MS analysis. The resulting data allowed the determination of (i) the efficiency of the tryptic digestion, (ii) the recovery of the biochemical sample preparation step, and more importantly, (iii) the overall reproducibility of the experiment.

The main chromatographic and MS attributes of the reference peptides are routinely monitored, including the retention time and chromatographic peak shape, the mass accuracy, the signal intensity, and the signal-to-noise ratio. Based on replication of the analysis on an extended period of time, several metrics are determined, which allow the assessment of the quality of the separation, the sensitivity, and the accuracy of the mass measurements. In turn, this allows the detection of performance deterioration of the analytical platform over time.

Similar approaches to monitor the performance of LC−MS platforms, using well-defined peptide mixtures, were reported.10−12 However, the quality control protocol proposed here is, to our knowledge, the first one to systematically assess a full analytical process on a day-to-day basis (or within a day), while requiring simple means. It is particularly useful when integrated in longitudinal studies performed over extended periods of time, to ensure consistency of the results.



MATERIAL AND METHODS

Chemicals and Reagents

Dithiothreitol, formic acid, iodoacetamide, Trizma hydrochloride (Tris-HCl), and urea were obtained from Sigma-Aldrich (St. Louis, MO, USA). Sequencing grade modified trypsin was purchased from Promega (Madison, WI, USA). All solvents used (i.e., acetonitrile, water) were HPLC grade and purchased from Sigma-Aldrich. Sodium 3-[(2-methyl-2-undecyl-1,3-dioxolan-4-yl)methoxyl]-1-propanesulfonate (trade name RapiGest SF) was obtained from Waters (Milford, MA, USA).

Reference Materials: Mixture of 3 Proteins; Mixture of Isotopically Labeled Peptides. The mixture of three commercially available yeast proteins, i.e., enolase (P00924), alcohol dehydrogenase (P00330), and glucose-6-phosphate dehydrogenase (P11412), was prepared from proteins purchased from Sigma-Aldrich and resuspended individually in urea 6 M/Tris-HCl 0.1 M at a concentration of 2 mg/mL. They were mixed to obtain a final concentration of 0.5 mg/mL of each protein. The isotopically labeled peptides were purchased from Thermo Fisher Scientific (Ulm, Germany) and correspond to the endogenous peptides used as surrogates for the proteins of interest (Table 1).

Sample Preparation. The denatured proteins (15 μL, volume corresponding to 7.5 μg of each protein) were reduced with 5 μL of 20 mM dithiothreitol (5 mM final concentration) for 30 min at 37 °C and alkylated with 5 μL of 75 mM iodoacetamide (15 mM final concentration) for 30 min at 25 °C in the dark. The protein mixture was supplemented with a first mixture of isotopically labeled peptides (HBx peptides, see Figure 1 and Table 1) at a final nominal concentration of 50 fmol/μL in the sample analyzed. Urea concentration was reduced to 1 M by dilution with 25 mM Tris-HCl, and 5.6 μL of sequencing grade modified trypsin was added to a final enzyme:substrate ratio of 1:20. The sample was incubated for 4 h at 37 °C. Prior to LC−MS analyses, the mixture was supplemented with a second mixture of isotopically labeled peptides (HAx peptides, see Figure 1 and Table 1) at a final nominal concentration of 50 fmol/μL.

Figure 1. Diagram of the analytical procedure. A mixture of three commercially available yeast proteins was prepared. Prior to digestion, isotopically labeled peptides (HB) representing a subset of tryptic peptides of the three proteins were added to the mixture in amounts equimolar to the endogenous peptides. A second set of isotopically labeled peptides (HA, same amino acid sequences but a different isotope incorporation pattern) was added in the same amounts after digestion and prior to LC−MS(/MS) analyses. The analyses were performed using either full scan (FS) or targeted (SRM, SIM, PRM) acquisition modes on a triple quadrupole or a quadrupole-orbitrap instrument.

Human Plasma Digest. Plasma pooled from deidentified human specimens was provided by Integrated Biobank of Luxembourg (IBBL) and treated as “not human subjects research” material. The plasma sample was depleted of the two most abundant proteins. Remaining proteins were denatured by heating at 99 °C for 10 min and then by addition of 0.2% RapiGest for a final concentration of 0.1% (w/v). The proteins were reduced with 20 μL of 100 mM dithiothreitol (10 mM final concentration) for 50 min at 50 °C and alkylated with 55 μL of 100 mM iodoacetamide (25 mM final concentration) for 30 min at room temperature in the dark. The alkylation reagent was quenched by adding 8.3 μL of 100 mM dithiothreitol (3 mM final concentration) for 30 min at 25 °C. Prior to the digestion, RapiGest was added to a final concentration of 0.1%. The digestion was performed by adding 130 μL of sequencing grade modified trypsin. The sample was incubated overnight at 37 °C. A second step of trypsinization was carried out by adding 2 μL of trypsin for 2 h at 37 °C. The resulting peptides were cleaned on Sep-Pak tC18 cartridges (Waters, Milford, MA, USA) and eluted with 50% acetonitrile. The sample was then lyophilized on a vacuum centrifuge and resolubilized in 0.1% formic acid.

Test Mixture Spiked in Human Plasma Digest. The test mixture prepared above was spiked at different concentrations (0.009, 0.03, 0.08, 0.25, 0.7, 2, 7, 20 fmol/μL) into 500 ng/μL of human plasma digest.

Liquid Chromatography and Mass Spectrometry

LC Separation. The peptides were separated using an Ultimate 3000 RSLC nano system (Thermo Fisher Scientific). The samples were loaded onto a trap column (Acclaim PepMap, 2 cm × 75 μm i.d., C18, 3 μm, 100 Å; Thermo Fisher Scientific) at 5 μL/min with an aqueous solution containing 1% (v/v) HPLC grade acetonitrile and 0.05% trifluoroacetic acid. After 3 min, the trap column was set online with an analytical column (Acclaim PepMap RSLC, 15 cm × 75 μm i.d., C18, 2 μm, 100 Å; Thermo Fisher Scientific). The elution was carried out by applying a mixture of solvents A and B. Solvent A was HPLC grade water with 0.1% (v/v) formic acid, and solvent B was HPLC grade acetonitrile with 0.1% (v/v) formic acid. Peptides were separated by applying a linear gradient of 2−35% solvent B at 300 nL/min over 33 min. One microliter of each sample was injected.

Analyses on Triple Quadrupole Instrument. Analyses of the samples were performed using a TSQ Vantage extended mass range triple quadrupole mass spectrometer (Thermo Scientific, San Jose, CA, USA) in selected reaction monitoring (SRM) mode. A dynamic nanoelectrospray source was used with uncoated SilicaTips (12 cm length, 360 μm o.d., 20 μm i.d., 10 μm tip i.d.). For ionization, 1200 V liquid junction voltage and 250 °C capillary temperature were used. The selectivity for both Q1 and Q3 was set to 0.7 Da, and the collision gas pressure of Q2 was set at 1.5 mTorr argon. The time-scheduled SRM method targeted 6 triplets of isotopically labeled peptides/endogenous peptides in ±3 min retention time windows by monitoring five transitions for each peptide within a cycle time of 2.5 s. The list of transitions is given in Supplementary Data 1.

Analyses on Quadrupole Orbitrap Instrument. The analyses were performed using both single ion monitoring (SIM) and parallel reaction monitoring (PRM) modes on a QExactive mass spectrometer (Thermo Scientific, Bremen, Germany). The nanoelectrospray source and liquid chromatography settings were identical to those used for analyses performed on the triple quadrupole mass spectrometer. For ionization, 1500 V liquid junction voltage was used. For both types of analyses, the acquisition method included a full scan (FS) event, set with a resolution of 70000 (at m/z 200), a target automatic gain control (AGC) value of 1 × 10⁶, and a maximum filling time of 250 ms over the mass range 300−1500 m/z. The second event consisted of the isolation of the target ions of the inclusion list with a 2 m/z window, a resolution of 35000 (at m/z 200), a target AGC value of 1 × 10⁶, and a maximum filling time of 120 ms. In PRM, the normalized collision energy was set at 25. The time-scheduled method targeted 6 triplets of isotopically labeled peptides/endogenous peptides in ±2 min retention time windows. The list of target


precursor ions is given in Supplementary Data 1. For PRM analysis, the fragment ions selected for subsequent data processing are also indicated in Supplementary Data 1 and correspond to the transitions monitored in SRM analysis.

Data Processing

Data analysis was performed using Xcalibur (Vers. 2.2, Thermo Fisher Scientific) and Skyline (Vers. 1.4, University of Washington). The area under the curve (AUC) of each target peptide ion (SIM analysis), target transition (SRM analysis), and selected fragment ion (PRM analysis) was calculated for all of the isotopic variants of the peptides of the reference mixture. The AUCs of the individual SRM or PRM transitions were then summed to quantify each peptide.

RESULTS AND DISCUSSION

Description of the Quality Control Procedure

The easy-to-implement quality control procedure, illustrated in Figure 1, was designed with the purpose of being “simple” and broadly applicable. Three commercially available yeast proteins were chosen to prepare the reference material used in this study (enolase, alcohol dehydrogenase, and glucose-6-phosphate dehydrogenase). For each of the three proteins, two tryptic peptides and their corresponding synthetic isotopically labeled analogues were selected to evaluate the process. The selection of the peptides representing the proteins was carried out by avoiding amino acids with chemical and post-translational modifications or sequences encoded in the human genome. The addition of synthetic analogues to the sample was performed at two distinct stages of the process. First, the protein mixture was supplemented prior to the digestion step with calibrated amounts of isotopically labeled synthetic peptides (i.e., equimolar to the amounts of the endogenous peptides). Second, the same set of peptide sequences, i.e., the same amino acid sequences but a different isotope incorporation pattern (see Table 1), was added to the peptide mixture after the digestion in the same molar amounts, just prior to the LC−MS(/MS) analysis. An optional desalting step can be included in the process between the digestion and the LC−MS(/MS) analysis stages. As a consequence of this double addition of internal standards, the LC−MS analysis allowed, through the measurement of the signal triplets of the different isotopic variants, a straightforward read-out of the analytical process uniformity. The relative intensities of the signals within the triplets enabled the immediate assessment of the digestion efficiency, the overall recovery of the full sample preparation, and in turn the reproducibility of the experiment.

Figure 2. Analysis of the reference peptide mixture. The peptide mixture obtained through the analytical procedure (described in Figure 1) was analyzed by LC−MS(/MS) and resulted in the measurement of a triplet of signals of similar intensity for each set of isotopologous peptides. (A) The chromatograms extracted for the signals measured by LC−SRM analysis for the isotopic variants of the peptides GGYFDSIGIIR and NTVISVFGASGDLAK, derived from glucose-6-phosphate dehydrogenase, are shown (sum of the signals of five transitions) and indicate an abundance ratio close to 1:1:1 for the isotopologous peptides. (B) The mass spectra acquired by LC−SIM analysis for the peptides GGYFDSIGIIR and NTVISVFGASGDLAK at the apex of their chromatographic elution profile confirm an abundance ratio of 1:1:1 for the isotopologous peptides.

The protocol was designed to be carried out on most LC−MS platforms operated under commonly used conditions, owing to the relatively high concentrations of the analytes in the final peptide mixture (i.e., nominal concentration of 50 fmol/μL). The main chromatographic and mass spectrometric attributes of the reference peptides were monitored, and the data were used to assess the performance of the LC−MS platform. The replication of the analysis over time allowed monitoring the status of the platform, which can ultimately be used to define metrics and acceptance criteria for a ‘local’ QC protocol and to detect drifts from the predefined baseline. This procedure enabled the assessment of both sample preparation and LC−MS performance in one single analysis, as


the experiment was designed to circumvent the convolution of the two components of the process.

Protocol To Assess the Sample Preparation

The sample preparation protocol relies on experimental conditions commonly employed in proteomics, with emphasis on simplicity and robustness. Several conditions were evaluated for the enzymatic digestion of the proteins constituting the reference material, including various buffers (ammonium bicarbonate, Tris-HCl), chaotropic reagents (urea, guanidine hydrochloride), and different concentrations, temperatures, and reaction times. The ultimate conditions were chosen to ensure a nearly complete tryptic digestion (i.e., absence of obvious missed cleavages) while avoiding the degradation of the products (assessed by monitoring the relative intensities of the pairs of reference peptides added prior to and after digestion).13,14 The mass spectrometric data analysis and interpretation were performed based on the assumption that full digestion was achieved and thus equimolar amounts of endogenous peptides corresponding to each of the proteins should be observed. The previously determined relative response factors of the peptides were used to calibrate the amounts of isotopically labeled peptides that were added at different stages of the process. The reference solutions were adjusted such that the amount of each endogenous peptide and its labeled counterparts added during the process were identical. Consequently, a triplet of signals with similar intensity was expected for the three isotopologous variants in the LC−MS(/MS) analysis (1:1:1 pattern), as illustrated in Figure 2 for the peptides derived from glucose-6-phosphate dehydrogenase. In this example, the peptide mixture was analyzed on a triple quadrupole instrument in SRM mode and on a high resolution/accurate mass quadrupole-orbitrap instrument operated in SIM mode.

Inconsistencies in the sample preparation directly translate into nonequimolar amounts of isotopologous peptides in the peptide mixture and are reflected in the relative intensities observed for the triplets of isotopologous peptides, which provide a simple read-out that serves as an immediate and precise diagnostic tool. The evaluation strategy is as follows. First, the signals of each endogenous peptide are compared with those of the corresponding isotopically labeled counterpart added immediately prior to the analysis. Assuming no error in the pipetting operations during the entire process, the signals of the endogenous peptides are expected to be similar to the signals of their internal standard (Supplementary Figure 1A). The signal intensities of the endogenous peptides are indicative of the recovery, and a significant decrease reflects a loss or a poor tryptic digestion, suggesting that some experimental conditions in the sample preparation were not correct (Supplementary Figure 1B). Then, if the addition of the standard prior to the digestion is considered (Supplementary Figure 1C-E), changes that occurred during the first part of the sample preparation (including digestion) or following the digestion step (e.g., dilution or cleanup) can be distinguished. Endogenous peptide intensities lower than the intensities of the reference peptides added prior to the digestion reveal a deviation at one of the early stages of the process (e.g., the denaturation conditions) or during the digestion step, such as a suboptimal enzyme-to-substrate ratio (Supplementary Figure 1C). Intensities of the reference peptides added prior to the digestion that are lower than those of their variants added prior to the analysis (Supplementary Figure 1D) indicate losses occurring in the last part of the process (e.g., due to an extended reaction time, to the storage of the sample at room temperature, or to the desalting step if included). It is worth noting that both types of inconsistencies can occur concomitantly (Supplementary Figure 1E). The sample preparation method was extensively tested by having four operators with different skill levels replicate the protocol over an extended period of time (8 weeks) and by monitoring isotopologous variants of the different peptides analyzed in LC−SRM and LC−PRM modes.

In our hands, the sample preparation showed very consistent results over time, characterized by intensity ratios of isotopic variants between 0.85 and 1.15, distributed around the expected value of 1.0. This is in line with FDA and EMA guidelines requiring a maximum deviation of ±15% around nominal values for accurate quantitative measurements.15,16 These results were obtained after removing the data points generated by a nonexperienced operator (2), which showed a systematic bias toward higher values for endogenous peptides (most likely due

Figure 3. Reproducibility of the sample preparation protocol. The sample preparation was replicated by four operators over 8 weeks, and the resulting peptide mixture was analyzed by LC−SRM and LC−PRM on triple quadrupole and quadrupole-orbitrap instruments, respectively. The intensity ratios determined over the 8 weeks for the endogenous form (L) of the peptide GVIFYESHGK (derived from alcohol dehydrogenase) and its isotopically labeled counterpart (HA) added prior to LC−MS/MS analysis are displayed in a control chart. The mean value of the intensity ratios was determined without the outliers (data points generated by operator 2), removed by a Grubbs' outlier test (P ≤ 0.05), and is represented with a dashed gray line at 1.01.
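The evaluation strategy for the signal triplets can be sketched in a few lines of code. The example below is illustrative only: it sums transition AUCs per isotopic variant, as described under Data Processing, and then compares the endogenous form (L), the standard added before digestion (HB), and the standard added before analysis (HA). The 15% tolerance is an assumption borrowed from the 0.85 to 1.15 range quoted in the text, and the function names and message strings are hypothetical.

```python
from typing import Dict, List

# Assumed threshold: borrowed from the 0.85-1.15 ratio range quoted in the text.
TOLERANCE = 0.15


def peptide_area(transition_aucs: List[float]) -> float:
    """Sum the AUCs of the monitored transitions into one value per peptide."""
    return sum(transition_aucs)


def diagnose_triplet(light: List[float], hb: List[float], ha: List[float],
                     tol: float = TOLERANCE) -> Dict[str, object]:
    """Compare L, HB and HA signals and report which stage looks inconsistent."""
    l_area, hb_area, ha_area = (peptide_area(x) for x in (light, hb, ha))
    ratios = {
        "L/HA": l_area / ha_area,
        "L/HB": l_area / hb_area,
        "HB/HA": hb_area / ha_area,
    }
    findings = []
    # Endogenous signal below the pre-digestion standard: early-stage or
    # digestion problem (the situation of Supplementary Figure 1C).
    if ratios["L/HB"] < 1 - tol:
        findings.append("early-stage or digestion deviation")
    # Pre-digestion standard below the pre-analysis standard: losses after
    # digestion (the situation of Supplementary Figure 1D).
    if ratios["HB/HA"] < 1 - tol:
        findings.append("loss after digestion")
    if not findings:
        within = all(abs(r - 1) <= tol for r in ratios.values())
        findings.append("consistent" if within else "unclassified deviation")
    return {"ratios": ratios, "findings": findings}


if __name__ == "__main__":
    print(diagnose_triplet([50.0] * 5, [100.0] * 5, [100.0] * 5))
```

A ratio above the tolerance, such as the bias toward higher endogenous values reported for one operator, falls outside the two loss scenarios and is therefore reported here as an unclassified deviation.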


Figure 4. Monitoring of the performance of the LC−MS platform. The reference peptide mixture was analyzed by LC−SRM, and the performance metrics of the platform (retention time, peak width at half-maximum, and peak area) were captured for the isotopically labeled peptides added prior to analysis. (A) The values measured over 1 week for the most intense transition of the isotopically labeled form of the peptide GVLHAVK (derived from enolase) are plotted in control charts. Acceptance criteria were previously established from 36 analyses of the reference peptide mixture under well-defined operating conditions over a 3-month period. The tolerance values for peak width and peak area were set at 2 standard deviations above and below mean values, respectively (solid line). The acceptance range of the retention time was established at ±1 min around the mean retention time value. (B) The coefficients of variation (CV) determined for the metrics for each peptide of the mixture are shown.
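The acceptance criteria described in the caption can be expressed as a short sketch: an upper limit for peak width (mean + 2 SD), a lower limit for peak area (mean − 2 SD), and a ±1 min window for retention time, all derived from a set of baseline analyses. The use of the sample standard deviation, the dictionary layout, and the function names are assumptions made for illustration; the source does not specify them.

```python
from statistics import mean, stdev
from typing import Dict, Sequence


def build_criteria(baseline_rt: Sequence[float],
                   baseline_width: Sequence[float],
                   baseline_area: Sequence[float]) -> Dict[str, float]:
    """Derive acceptance limits from baseline runs (36 analyses in the source)."""
    return {
        "rt_mean": mean(baseline_rt),
        "rt_window": 1.0,  # +/- 1 min around the mean retention time
        # Peak width: flag values more than 2 SD above the mean.
        "width_max": mean(baseline_width) + 2 * stdev(baseline_width),
        # Peak area: flag values more than 2 SD below the mean.
        "area_min": mean(baseline_area) - 2 * stdev(baseline_area),
    }


def check_run(criteria: Dict[str, float], rt: float,
              width: float, area: float) -> Dict[str, bool]:
    """Return True for each metric that falls within its acceptance range."""
    return {
        "retention_time": abs(rt - criteria["rt_mean"]) <= criteria["rt_window"],
        "peak_width": width <= criteria["width_max"],
        "peak_area": area >= criteria["area_min"],
    }


def cv_percent(values: Sequence[float]) -> float:
    """Coefficient of variation in percent, as plotted per peptide in panel B."""
    return 100.0 * stdev(values) / mean(values)


if __name__ == "__main__":
    # Hypothetical baseline values, for illustration only.
    crit = build_criteria([10.0, 10.2, 9.8, 10.1, 9.9],
                          [0.20, 0.22, 0.18, 0.21, 0.19],
                          [1000.0, 1100.0, 900.0, 1050.0, 950.0])
    print(check_run(crit, rt=10.5, width=0.21, area=980.0))
```

Plotting each new run against these limits reproduces the control-chart view, with out-of-range points standing out as candidates for investigation.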

deviations above and below mean values, respectively. Regarding peptide retention times, a tolerance window of ±1 min around the mean retention time values was established, which is in line with the requirement of a targeted experiment. For subsequent analyses, the values measured (and extracted with widely used software) were typically plotted in a control chart to facilitate their visualization and the detection of outliers. This process is illustrated in Figure 4A with the display of the retention time, chromatographic peak width, and peak area (area under the curve) captured for the isotopically labeled form of the reference peptide GVLHAVK (derived from enolase) added prior to analysis. The measurements were performed by LC−SRM, and the results obtained over a limited period of time (1 week) are presented. For all peptides of the reference mixture (Figure 4B), the chromatographic attributes were very stable, as usually observed over such a period (CV of retention time