Article pubs.acs.org/ac
Improved Protein Hydrogen/Deuterium Exchange Mass Spectrometry Platform with Fully Automated Data Processing
Zhongqi Zhang,* Aming Zhang, and Gang Xiao
Process and Product Development, Amgen Inc., One Amgen Center Drive, Thousand Oaks, California 91320, United States
Supporting Information
ABSTRACT: Protein hydrogen/deuterium exchange (HDX) followed by protease digestion and mass spectrometric (MS) analysis is accepted as a standard method for studying protein conformation and conformational dynamics. In this article, an improved HDX MS platform with fully automated data processing is described. The platform significantly reduces systematic and random errors in the measurement by introducing two types of corrections in HDX data analysis. First, a mixture of short peptides with fast HDX rates is introduced as internal standards to correct for run-to-run variations in the extent of back exchange. Second, a specially designed peptide (PPPI) with a slow intrinsic HDX rate is employed as another internal standard to reflect possible differences in protein intrinsic HDX rates when protein conformations under different solution conditions are compared. HDX data processing is achieved with a comprehensive HDX model that simulates the deuterium labeling and back exchange process. The HDX model is implemented in the in-house developed software MassAnalyzer and enables fully unattended analysis of the entire protein HDX MS data set, from ion detection and peptide identification to final processed HDX output, typically within 1 day. The final output of the automated data processing is a set (or the average) of the most probable protection factors for each backbone amide hydrogen. The utility of the HDX MS platform is demonstrated by exploring the conformational transition of a monoclonal antibody induced by increasing concentrations of guanidine.
Protein backbone amide hydrogen exchange has been widely used to study protein conformation and conformational dynamics.1−3 In hydrogen/deuterium exchange (HDX), deuterium incorporation at individual backbone sites is usually determined by nuclear magnetic resonance (NMR) spectroscopy. However, resonance assignment is a time-consuming process for NMR, and its application to large proteins is limited by the requirement for isotope-labeled proteins. A method with intermediate structural resolution is to perform a proteolytic digestion of the deuterium-labeled protein under slow-exchange conditions,4,5 followed by mass spectrometric analysis.6 This method has gained enormous popularity since its development and is becoming the method of choice for studying proteins of intermediate to large size.7−12 Measurement of deuterium content in short peptides by mass spectrometry (MS) has the advantages of specificity, sensitivity, speed, and convenience of peptide identification. Recent advances in ultrahigh-performance liquid chromatography (LC) and ultrahigh-resolution mass spectrometry make the MS-based HDX method more facile by providing fast separation of the large number of peptides from protein digestion as well as unambiguous resolution of coeluting peptides, thereby facilitating fast analysis of large proteins.13,14

Although the HDX MS technique has recently gained popularity owing to the efforts of many groups to develop more robust systems for collecting and analyzing protein HDX MS data, it faces two major problems that hinder its wide application, particularly in the field of biopharmaceutical development.15,16 First, poor control of back exchange causes run-to-run irreproducibility. To reduce the loss of deuterium labeling during analysis, the proteolytic digestion and LC/MS analysis must be performed quickly at ∼0 °C. Even under well-controlled pH and temperature, a large amount of labeled deuterium is lost during analysis.6,17 When protein conformations under different conditions are compared, this deuterium back exchange is not detrimental if it is controlled reproducibly. However, the difficulty of controlling the quench and digestion conditions often causes poor reproducibility between runs. Second, HDX data analysis is a very time-consuming and labor-intensive step, particularly for large proteins, which usually generate hundreds of peptides for deuterium quantitation18 as well as plentiful information on HDX kinetics. This problem has been only partially solved by the availability of several semiautomated computer programs developed by different groups.19−24 These programs greatly reduce the data processing time and therefore improve the throughput of HDX analysis. They usually generate a series of deuterium incorporation time courses for all the analyzed peptides and present the final processed HDX data in either a “heat map”25 or a “butterfly chart”.16 In either case, a great deal of information residing in the HDX kinetics, as well as the labeling information from the large number of overlapping peptides, is not utilized.

© 2012 American Chemical Society
Received: February 27, 2012 Accepted: May 4, 2012 Published: May 4, 2012
dx.doi.org/10.1021/ac300535r | Anal. Chem. 2012, 84, 4942−4949
Several approaches have been attempted to convert deuterium incorporation data to more meaningful exchange rate information,26,27 but none has reached a completely unattended stage. Notably, Althaus et al.27 developed an algorithm based on combinatorial optimization of three classes of backbone protons, with fast, intermediate, and slow exchange rates, that fits the deuterium labeling of all overlapping peptides in an HDX analysis to predict exchange rates for single residues in a protein. However, the method does not completely model the HDX process for single residues, and it does not consider the effects of deuterium back exchange during digestion and LC/MS analysis.

The work described in this article outlines a comprehensive HDX model at the single-residue level and incorporates several strategies to address the problems mentioned above. First, the back exchange variation that causes the majority of HDX measurement deviations is corrected by introducing a set of fast-exchanging internal peptide standards. In addition, a specially designed peptide (PPPI) with a slow intrinsic HDX rate is introduced as a second internal standard to account for possible differences in intrinsic rates between the solution conditions under which protein conformations are compared. Second, a comprehensive HDX model is built to simulate the whole deuterium labeling and back exchange process during digestion and analysis. The HDX model uses the maximal information in the entire HDX MS data set (both the HDX kinetics and the labeling information from all overlapping peptides) to derive the most likely protection factors for each backbone amide hydrogen. The output of a protection factor for each single residue provides clear and far more detailed protein conformational information than other available HDX programs.

■ EXPERIMENTAL SECTION
The following is a brief description of the experimental procedures. Refer to the Supporting Information for details.

Deuterium Labeling and Measurement. Deuterium labeling was initiated by diluting an antistreptavidin IgG1 monoclonal antibody (mAb) solution (Amgen, Thousand Oaks, CA) 10-fold into a D2O buffer containing various concentrations of deuterated guanidine. After a set length of labeling time (6 labeling times from 30 s to 8 h, each in duplicate), the labeling was quenched by diluting 4-fold into a quench/denaturation buffer (pH 2.7) containing 0.625 M tris(2-carboxyethyl)phosphine (TCEP) and an appropriate concentration of urea at 1 °C. The concentration of urea was adjusted according to the guanidine concentration so that the denaturation capacity remained similar. An aliquot of the quenched solution was then transferred to a pepsin solution, followed by injection into the sample loop at 1 °C and a 6 min digestion delay before the loop was switched inline for gradient elution. The entire procedure was performed automatically on a Leap PAL HD-X system controlled by Leap Shells (Leap Technologies, Carrboro, NC). LC/MS analyses were performed on an Agilent (Santa Clara, CA) 1290 Infinity system coupled to a Thermo Scientific (San Jose, CA) LTQ-Orbitrap high-resolution mass spectrometer with an electrospray ionization interface. The proteolytic peptides were separated on a Waters (Milford, MA) BEH C18 column (2.1 mm × 50 mm) at 1 °C, with an acetonitrile gradient at a flow rate of 0.36 mL/min and with 0.02% trifluoroacetic acid (TFA) and 0.1% formic acid in each mobile phase. Mass spectrometric data were acquired in the Orbitrap with a resolution of 60 000 (at m/z 400) in centroid mode. Prior to the HDX experiment, an unlabeled protein sample was analyzed three times by data-dependent LC/MS/MS for peptide identification. Two data-dependent MS/MS scans were performed in the linear ion trap after each full MS scan.

External Back Exchange Controls. To determine the levels of deuterium loss during analysis, a control with zero labeling (nondeuterated control or 0% D control) and a control with full labeling (fully deuterated control or 100% D control) were performed.6 The 0% D control was obtained by direct quenching of an unlabeled protein solution, and the 100% D control was obtained by performing deuterium labeling in the presence of 6.9 M guanidine for at least 8 h.

Internal Standards. An equal concentration of a peptide mixture (bradykinin, angiotensin I, and leucine enkephalin) was added to the unlabeled protein solution as well as to the D2O buffer as internal back exchange standards. A tetrapeptide PPPI (synthesized by AnaSpec, Fremont, CA) was mixed into the protein solution at a molar concentration similar to or lower than that of the protein as an internal standard for determining the intrinsic HDX rates.

Data Processing. All data were processed on MassAnalyzer.28 MassAnalyzer was developed in Microsoft Visual C++ and runs under Microsoft Windows. It has been extensively used for full sequence characterization of purified proteins, including post-translational and process-induced modifications. New HDX data extraction and modeling functions, as described below, were implemented in MassAnalyzer for fully automated analysis of HDX MS data collected on Thermo Scientific high-resolution mass spectrometers. To test MassAnalyzer for noncommercial research purposes, please contact the corresponding author directly.
■ COMPUTATIONAL METHOD
Automated data processing on MassAnalyzer includes ion feature detection and alignment, peptide identification, deuterium level calculation, deuterium variance calculation, back exchange modeling, intrinsic exchange rate calculation, and HDX modeling. MassAnalyzer reads Thermo Scientific Xcalibur raw data directly and outputs the possible protection factors of each backbone amide position in a fully unattended fashion. The steps involved in the automated data processing are described below and are also summarized in the Supporting Information, Table S-8.

Ion Detection and Alignment. The process of ion feature detection has been described previously.28 After ion feature detection in each run, retention times of detected features in all runs are aligned using a recursive divide-and-conquer algorithm.29 Because of the high mass accuracy of the instrument used in this work, the isotope envelope of a deuterated ion can be unambiguously aligned to its undeuterated counterpart by applying the appropriate mass shift caused by HDX. The mass spectrum of a feature is obtained by combining spectra across the chromatographic peak using a matched filter,30 with background subtraction, to optimize the S/N of the combined spectrum. From the combined spectrum, the average mass of the ion is determined by calculating the centroid of its isotope peaks. This centroid can be accurately determined because most overlapping isotope envelopes do not interfere with each other, given the high-resolution mass spectrometer used in this study. This average mass is later used to determine the deuterium content of each peptide.
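The centroid calculation described above can be illustrated with a short Python sketch. This is not the MassAnalyzer code: the function names, the conversion from centroid m/z to a neutral average mass by subtracting one proton mass per charge, and the example peak list are all assumptions made for illustration.

```python
PROTON_MASS = 1.007276  # Da; assumed charge carrier for positive-mode electrospray


def centroid_mz(mz, intensity):
    """Intensity-weighted mean m/z (centroid) of an isotope envelope."""
    total = sum(intensity)
    if total <= 0:
        raise ValueError("isotope envelope has no signal")
    return sum(m * i for m, i in zip(mz, intensity)) / total


def average_mass(mz, intensity, charge):
    """Neutral average mass from the envelope centroid (assumes protonated ions)."""
    return charge * (centroid_mz(mz, intensity) - PROTON_MASS)


# Hypothetical isotope envelope of a doubly charged peptide ion
peaks_mz = [500.0, 500.5, 501.0]
peaks_int = [1.0, 2.0, 1.0]
print(centroid_mz(peaks_mz, peaks_int))
print(average_mass(peaks_mz, peaks_int, charge=2))
```

Because the centroid is a weighted mean over the whole envelope, a shift of the envelope toward higher m/z on deuteration translates directly into a higher average mass.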
Peptide Identification. Peptides are identified from their determined masses and MS/MS fragmentation patterns by comparing the experimental MS/MS to the accurately predicted theoretical MS/MS.31−34 When the sequence search space is small, peptide identification using this approach is more reliable than conventional approaches.28 N-glycosylated peptides with different glycoforms are identified automatically.33 All other post-translational or process-induced modifications are manually defined by the user before data processing.

Deuterium Level Calculation. In our HDX experimental setup, there is a small amount of deuterium present in the digestion solution (∼4.5% in this study, due to a 20-fold dilution of the 90% D2O labeling buffer; see the Supporting Information). Therefore, a 0% deuteration control (0% D control)6 is designed and performed to determine the amount of deuterium labeling caused by this low percentage of deuterium during digestion. The deuterium level in a peptide from a labeled sample at the time of MS measurement ($D_{\mathrm{exp}}$) is calculated by taking the difference between the average mass of the labeled peptide ($M$) and that of the 0% D control ($M_{0\%}$), as shown in eq 1, with the rationale described in the Supporting Information.

$$D_{\mathrm{exp}} = M - M_{0\%} \tag{1}$$

The calculated deuterium levels are fed into the HDX model described below. Several strategies for quality control of these determined deuterium levels are described in the Supporting Information.

Deuterium Variance Calculation. To build a model to describe the deuterium labeling and back exchange process, the variance ($\sigma^2$) of the measured deuterium content must be determined for each ion in each run. In our platform, a mixture of short peptides (bradykinin, angiotensin I, and leucine enkephalin) is used as internal standards to correct run-to-run variations of deuterium back exchange. To a first approximation, the deuterium levels in these internal standards are used to determine a correction factor for each run, and the determined deuterium level ($D_{\mathrm{exp}}$ from eq 1) in each labeled peptide is corrected by multiplying by this factor to normalize all $D_{\mathrm{exp}}$ to a representative 100% D control run. The variance is then calculated from these corrected deuterium levels. For the 100% D controls, the deuterium variance for each peptide ion is calculated directly. For partially labeled samples, the deuterium incorporation time course is first fitted with the following 5-parameter equation6

$$D_{\mathrm{calc}} = N_1[1 - \exp(-k_1 t)] + N_2[1 - \exp(-k_2 t)] + k_3 t \tag{2}$$

where $N_1$, $N_2$, $k_1$, $k_2$, and $k_3$ are the five parameters representing the number of amide hydrogens and their exchange rate constants, and $t$ is the exchange duration. The variance is then calculated from the residuals between the experimental data points and the fitted curve. Because eq 2 has 5 parameters, a minimum of 6 time points, each in duplicate, is recommended for accurate calculation of the variance, giving at least 7 degrees of freedom. Although duplicate analysis is performed for each time point in this work, the system allows any number of replicates. Refer to the Supporting Information for details on deuterium variance calculation.

Back Exchange Modeling. Deuterium back exchange rates in unprotected peptides can be theoretically calculated based on the sequence of the peptide, pH, and temperature.35,36 The calculated degree of back exchange, however, does not accurately reflect the level of deuterium loss because of the poorly defined pH, temperature, LC stationary and mobile phase conditions, and ionization conditions. A general approach to correcting the level of back exchange is to use the 100% D control in the deuterium calculation.6 With this approach, however, accurate back exchange correction is only possible when either all backbone amide positions in a peptide have the same deuteration level or all amide deuteriums in a peptide have the same back exchange rate.6 A better approach, as described below, is to use the theoretically calculated back exchange rate to model the back exchange process for each amide hydrogen and to introduce an effective back exchange time for each peptide as a parameter that reflects the uncertainty in back exchange conditions. Because the back exchange rate of an amide deuterium in the undigested protein can differ from that in a digested peptide due to terminal effects,35,36 we assume that each amide deuterium $i$ in a peptide $j$ has two back exchange rate constants: $k_i^{\mathrm{before}}$ with an effective time $t_j^{\mathrm{before}}$ and $k_i^{\mathrm{after}}$ with an effective time $t_j^{\mathrm{after}}$. The deuterium level of a fully deuterated peptide $j$ after back exchange can be calculated by

$$D_j^{\mathrm{calc}} = \sum_i \left[\exp(-k_i^{\mathrm{before}} t_j^{\mathrm{before}}) \exp(-k_i^{\mathrm{after}} t_j^{\mathrm{after}})\right] \tag{3}$$

Note that $k_i^{\mathrm{before}}$ and $k_i^{\mathrm{after}}$ are exchange rate constants calculated theoretically,36 with $k_i^{\mathrm{before}}$ calculated with the assumption that $i$ is present in the undigested protein and $k_i^{\mathrm{after}}$ calculated with the assumption that $i$ is present in the digested peptide. $t_j^{\mathrm{before}}$ and $t_j^{\mathrm{after}}$ are two parameters in the model, representing the two effective back exchange times for peptide $j$. In practice, a slight difference in experimental conditions from run to run often changes the amount of back exchange, causing variations in the determined level of deuterium. In this platform, we use several short peptides as internal standards to adjust for these run-to-run variations. These variations are corrected by introducing small changes ($\Delta t_r^{\mathrm{before}}$ and $\Delta t_r^{\mathrm{after}}$) to the effective back exchange times for each run $r$. Equation 3 then becomes

$$D_{jr}^{\mathrm{calc}} = \sum_i \left\{\exp[-k_i^{\mathrm{before}}(t_j^{\mathrm{before}} + \Delta t_r^{\mathrm{before}})] \exp[-k_i^{\mathrm{after}}(t_j^{\mathrm{after}} + \Delta t_r^{\mathrm{after}})]\right\} \tag{4}$$

The values of the two effective back exchange times for each peptide ($t_j^{\mathrm{before}}$ and $t_j^{\mathrm{after}}$) and the two adjustments for each run ($\Delta t_r^{\mathrm{before}}$ and $\Delta t_r^{\mathrm{after}}$) are determined by minimizing the difference between the calculated deuterium levels $D_{jr}^{\mathrm{calc}}$ and the determined levels in all fully deuterated control peptides, including proteolytic peptides in the 100% D control samples as well as internal back exchange standards in all samples. The difference between the predicted and experimental deuterium levels is represented by the reduced $\chi^2$ as shown below.

$$\chi_{\mathrm{red}}^2 = \frac{1}{n - n_j - n_r} \sum_j \sum_r \frac{(D_{jr}^{\mathrm{calc}} - D_{jr}^{100\%})^2}{(\sigma_{jr}^{100\%})^2} \tag{5}$$
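As a minimal sketch of eqs 4 and 5 (not the MassAnalyzer implementation), the following Python functions evaluate the deuterium retained by a fully deuterated peptide after back exchange and the reduced χ2 against measured 100% D levels. The function names, argument layout, and all numerical values are hypothetical; rate constants are taken to be in min−1 and times in min purely for illustration.

```python
import math


def d_calc_full(k_before, k_after, t_before, t_after, dt_before=0.0, dt_after=0.0):
    """Eq 4: deuterium left on a fully deuterated peptide after back exchange.

    k_before, k_after: per-amide back exchange rate constants (illustrative values).
    t_before, t_after: effective back exchange times for the peptide.
    dt_before, dt_after: per-run adjustments to the effective times.
    """
    tb = t_before + dt_before
    ta = t_after + dt_after
    return sum(math.exp(-kb * tb) * math.exp(-ka * ta)
               for kb, ka in zip(k_before, k_after))


def reduced_chi2(d_calc, d_obs, sigma, n_peptides, n_runs):
    """Eq 5: reduced chi-square with n - n_j - n_r in the denominator."""
    dof = len(d_obs) - n_peptides - n_runs
    if dof <= 0:
        raise ValueError("not enough measurements for the requested model size")
    return sum(((c - o) / s) ** 2 for c, o, s in zip(d_calc, d_obs, sigma)) / dof


# Hypothetical three-amide peptide with 6 min before and 25 min after digestion
retained = d_calc_full([0.010, 0.020, 0.005], [0.020, 0.030, 0.010], 6.0, 25.0)
print(round(retained, 3))
```

In a full fit, the effective times and per-run adjustments would be the free parameters varied to minimize the reduced χ2; here they are simply passed in.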
In eq 5, $j$ runs over all fully labeled peptide ions, $r$ runs over all runs, $n$ is the total number of deuterium measurements, $n_j$ is the number of fully labeled peptide ions, and $n_r$ is the total number of runs. This back exchange model is an underdetermined system due to the large number of parameters (two effective times for each
peptide and two effective time adjustments for each run) in the model. To determine these parameters, additional constraints are needed. In theory, all peptides experience a similar digestion time, and their total back exchange times correlate with their retention times. Therefore, the constraint shown in eq 6 is used to keep $t_j^{\mathrm{before}}$ as close to a constant value as possible, to make $t_j^{\mathrm{before}} + t_j^{\mathrm{after}}$ correlate with the retention time as much as possible, and to keep the two adjustments $\Delta t_r^{\mathrm{before}}$ and $\Delta t_r^{\mathrm{after}}$ as close to zero as possible.

$$C = \sum_j (t_j^{\mathrm{before}} - \bar{t}^{\mathrm{before}})^2 + \sum_j (t_j^{\mathrm{before}} + t_j^{\mathrm{after}} - t_j^{\mathrm{predicted}})^2 + \sum_r (\Delta t_r^{\mathrm{before}})^2 + \sum_r (\Delta t_r^{\mathrm{after}})^2 \tag{6}$$

where $\bar{t}^{\mathrm{before}}$ represents the average value of $t_j^{\mathrm{before}}$ across all peptides, and $t_j^{\mathrm{predicted}}$ is the predicted total effective time ($t_j^{\mathrm{before}} + t_j^{\mathrm{after}}$) for peptide $j$ on the linear regression line when $t_j^{\mathrm{before}} + t_j^{\mathrm{after}}$ is plotted against the peptide retention time. To optimize the back exchange model, the following function is minimized

$$Q = \chi_{\mathrm{red}}^2 + \lambda C \tag{7}$$

The value of the Lagrange multiplier $\lambda$ is determined by trying different values of $\lambda$ and minimizing $Q$ until the value of $\chi_{\mathrm{red}}^2$ is close to a user-defined value, which is typically set 1% higher than the minimized $\chi_{\mathrm{red}}^2$ value when $\lambda = 0$. By minimizing $Q$, the effective back exchange times for each peptide $j$ in each run $r$ ($t_{jr}^{\mathrm{before}}$ and $t_{jr}^{\mathrm{after}}$) can be determined.

$$t_{jr}^{\mathrm{before}} = t_j^{\mathrm{before}} + \Delta t_r^{\mathrm{before}} \tag{8}$$

$$t_{jr}^{\mathrm{after}} = t_j^{\mathrm{after}} + \Delta t_r^{\mathrm{after}} \tag{9}$$

The back exchange modeling using peptide internal standards is only valid when the backbone amide hydrogens in these peptides are fully exchanged during the shortest labeling time (30 s in this work), which may not necessarily be true. This issue is conveniently resolved by adding the peptide mixture at equal concentration to both the unlabeled protein solution (in H2O) and the D2O labeling solution before the experiment. Because the internal peptide standards in the D2O solution are exposed to D2O for an extended period of time, all amide hydrogens on these peptides are fully deuterated. When the protein solution and the D2O solution are mixed to initiate HDX (for example, with a 1:9 mixing ratio, leading to 90% deuterium in the labeling solution), both the fully labeled and the unlabeled peptide standards undergo HDX toward the 90% D equilibrium state. Their average deuterium level will always be 90% at any given time (demonstrated in the Supporting Information). Further, because HDX in these short peptides is very fast, a bimodal isotope envelope is usually not observed. The only difference in the case of incomplete exchange is a slightly broader isotope envelope. It should be noted that these internal standards may not be suitable when the deuterium labeling is performed at low pH, where the HDX of these peptides is too far from equilibrium. When internal back exchange standards are not used, all $\Delta t_r^{\mathrm{before}}$ and $\Delta t_r^{\mathrm{after}}$ values are zero.

Intrinsic Exchange Rate Calculation. The ultimate goal for most HDX applications is to determine the protection factor of each backbone amide hydrogen. The model described here attempts to fulfill this goal. To obtain protection factors, the intrinsic HDX rate constant of each amide hydrogen must be calculated. These rate constants are usually calculated theoretically using the empirically derived formula developed in Englander’s group.35,36 In most applications, however, researchers are more interested in comparing protein conformational differences under multiple conditions of interest. In many cases, Englander’s model does not reflect the possible differences in intrinsic exchange rates between different conditions, for example, the increasing concentrations of guanidine in this study. To overcome this problem, we introduced a second internal peptide standard (different from the internal peptide standards for back exchange correction) with the sequence PPPI to adjust for the intrinsic exchange rate differences between conditions. This designed short peptide is assumed to have no conformational protection; moreover, it has only one backbone amide hydrogen, with an exchange half-life of about 2 min at neutral pH and room temperature, which allows experimental determination of the real intrinsic exchange rate on our labeling time scale. Using the HDX model described in the next section, the “protection factor” of peptide PPPI under each HDX condition is first calculated. The intrinsic exchange rate of PPPI used in the model is calculated theoretically based on Englander’s empirical formula, assuming the proline residue next to isoleucine is in the trans form. Because peptide PPPI should not have any conformational protection, an observed “protection factor” under a given condition reflects the deviation of the real solution condition from that used in the intrinsic exchange rate calculation. The actual intrinsic HDX rate constant for each amide hydrogen $i$ on the studied protein is then calculated using the “protection factor” of PPPI as a correction factor, as shown in the following equation.

$$k_i^{\mathrm{intrinsic}} = \frac{k_i^{\mathrm{calc}}}{P(\mathrm{PPPI})} \tag{10}$$
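The correction in eq 10 amounts to one division per amide. The short Python sketch below illustrates it with a hypothetical function name and invented numbers; neither the rate constants nor the PPPI “protection factor” are values from this study.

```python
def corrected_intrinsic_rates(k_calc, p_pppi):
    """Eq 10: scale theoretically calculated intrinsic rates by P(PPPI).

    k_calc: theoretically calculated intrinsic rate constants, one per amide.
    p_pppi: "protection factor" observed for PPPI under the same condition.
    """
    if p_pppi <= 0:
        raise ValueError("P(PPPI) must be positive")
    return [k / p_pppi for k in k_calc]


# Hypothetical values: a PPPI "protection factor" of 2 halves every intrinsic rate
print(corrected_intrinsic_rates([10.0, 4.0], 2.0))
```

A P(PPPI) above 1 therefore means that exchange under that solution condition is slower than the theoretical calculation predicts, and every intrinsic rate for that condition is lowered by the same factor.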
In eq 10, $k_i^{\mathrm{calc}}$ is the theoretically calculated intrinsic rate constant for amide hydrogen $i$, and $P(\mathrm{PPPI})$ is the protection factor determined for peptide PPPI. Therefore, in our platform, the intrinsic exchange rate of each amide hydrogen is a combination of experimental determination and theoretical calculation.

HDX Modeling. The HDX model contains the protection factors of all backbone amide hydrogens (referred to as amino acid residues) as parameters. Optimization of the model against experimental data, as described below, generates protection factors for all residues. With intrinsic HDX rate constants, protection factors, back exchange rates, and effective back exchange times available, the deuterium level in each peptide $j$ for each run $r$ at the time of MS measurement can be calculated by the following equation

$$D_{jr}^{\mathrm{calc}} = \sum_i \left\{\left[1 - \exp\left(-\frac{k_i^{\mathrm{intrinsic}}}{P_i^c} t_r\right)\right] \exp(-k_i^{\mathrm{before}} t_{jr}^{\mathrm{before}}) \exp(-k_i^{\mathrm{after}} t_{jr}^{\mathrm{after}})\right\} \tag{11}$$
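Equation 11 can be read as “uptake during labeling times retention during back exchange,” summed over the amides of a peptide. The Python sketch below is an illustrative reading of that equation, not the MassAnalyzer implementation; the function name, the argument layout, and the numbers in the usage lines are assumptions.

```python
import math


def d_calc_labeled(k_intrinsic, protection, t_label,
                   k_before, k_after, t_before, t_after):
    """Eq 11: predicted deuterium level of one peptide in one run.

    k_intrinsic: corrected intrinsic rate constants per amide (eq 10).
    protection: protection factors P_i^c per amide for the condition studied.
    t_label: labeling (exchange) time of the run.
    k_before, k_after, t_before, t_after: back exchange rates and effective times.
    """
    total = 0.0
    for ki, p, kb, ka in zip(k_intrinsic, protection, k_before, k_after):
        uptake = 1.0 - math.exp(-(ki / p) * t_label)              # labeling step
        retained = math.exp(-kb * t_before) * math.exp(-ka * t_after)  # back exchange
        total += uptake * retained
    return total


# Hypothetical two-amide peptide: one unprotected amide, one protected 1000-fold
print(d_calc_labeled([1.0, 1.0], [1.0, 1000.0], 10.0,
                     [0.01, 0.01], [0.02, 0.02], 6.0, 25.0))
```

In this example the unprotected amide is almost fully labeled after 10 time units while the protected one contributes about 1% of a deuterium before back exchange, which is how protection factors leave their mark on peptide-level data.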
In eq 11, $i$ indexes the amide positions in peptide $j$, $P_i^c$ represents the protection factor for amide position $i$ at condition $c$, $t_r$ is the exchange time for run $r$, and $t_{jr}^{\mathrm{before}}$ and $t_{jr}^{\mathrm{after}}$ are the two effective back exchange times calculated from eqs 8 and 9. The model is
optimized by minimizing the reduced $\chi^2$ between predicted and experimental data, as shown in the following equation

$$\chi_{\mathrm{red}}^2 = \frac{1}{n - n_{\mathrm{residues}} n_{\mathrm{conditions}}} \sum_j \sum_r \frac{(D_{jr}^{\mathrm{calc}} - D_{jr}^{\mathrm{exp}})^2}{\sigma_{jr}^2} \tag{12}$$

where $n$ is the total number of deuterium measurements, $n_{\mathrm{residues}}$ is the total number of residues in the protein, $n_{\mathrm{conditions}}$ is the number of HDX conditions to be compared in the experiment ($n_{\mathrm{residues}} n_{\mathrm{conditions}}$ represents the total number of protection factors to be determined), $D_{jr}^{\mathrm{exp}}$ is the experimentally determined deuterium content for peptide $j$ in run $r$, and $\sigma_{jr}^2$ is the variance of the measurement of $D_{jr}^{\mathrm{exp}}$. To avoid overfitting the data, additional constraints are applied during model optimization. First, the differences between the protection factors of nearby residues are minimized to ensure a smooth protection profile across the sequence (eq 13); i.e., the protection factor of a residue takes the values of its nearby residues if there is not enough data to determine its value.

$$C_{\mathrm{smooth}} = \sum_c \sum_i (\log P_i^c - \log P_{i+1}^c)^2 \tag{13}$$

Equation 13 implies that for regions with missing sequence coverage, the protection factors of the residues will have values similar to those of their nearby residues. Additionally, when multiple HDX conditions are compared, the differences in protection factors between these conditions are also minimized (eq 14); i.e., the protection factors of a given residue under different conditions are assumed to be the same unless HDX data are available to support their difference.

$$C_{\mathrm{diff}} = \sum_c \sum_i (\log P_i^c - \overline{\log P_i})^2 + \sum_c \sum_i (\log P_i^c - \overline{\log P_i} - \log P_{i+1}^c + \overline{\log P_{i+1}})^2 \tag{14}$$

In eqs 13 and 14, $c$ represents HDX conditions, $P_i^c$ represents the protection factor of residue $i$ at condition $c$, and $\overline{\log P_i}$ represents the average value of log(protection factor) for residue $i$ across all HDX conditions. The first term in eq 14 ensures that the protection factors under different conditions are as close as possible, and the second term ensures that if there are differences in protection factors between conditions, the differences are as consistent as possible for nearby residues. To optimize the HDX model, the following $Q$ function is minimized.

$$Q = \chi_{\mathrm{red}}^2 + \lambda_{\mathrm{smooth}} C_{\mathrm{smooth}} + \lambda_{\mathrm{diff}} C_{\mathrm{diff}} \tag{15}$$

where $\chi_{\mathrm{red}}^2$ is the reduced $\chi^2$ value calculated from eq 12. To determine the values of the two Lagrange multipliers ($\lambda$) in eq 15, both $\lambda$ values are first set to zero, followed by $Q$ minimization to get the $\chi_{\mathrm{red}}^2$ value. Then different $\lambda_{\mathrm{smooth}}$ values are tested to make $\chi_{\mathrm{red}}^2$ slightly higher (user-defined parameter, usually increased by 1%). After an appropriate value of $\lambda_{\mathrm{smooth}}$ is determined, different values of $\lambda_{\mathrm{diff}}$ are tested to make $\chi_{\mathrm{red}}^2$ slightly higher again (user defined, usually increased by 1%). All these processes are performed automatically. The $Q$ function in eq 15 is minimized by a brief simulated annealing routine, followed by function minimization using the Davidon−Fletcher−Powell (DFP) algorithm.37 The result of function minimization is a set of protection factors of all individual amide hydrogens under each HDX condition.

Because the HDX MS data usually do not have single-residue resolution, the HDX model described above is an underdetermined problem, meaning that no single definitive protection factor can be obtained for each residue. For this underdetermined system, an ideal approach is to perform exhaustive simulation to find all the possible solutions. In practice, we optimize the model multiple times (usually at least 4000 times) with random starting points, and the best 20 solutions (smallest $Q$ values) among them are recorded.

■ RESULTS AND DISCUSSION
Proteolytic Peptides. Before the HDX measurement, unlabeled antistreptavidin IgG1 was digested under the same conditions as the HDX samples and analyzed by LC/MS/MS for peptide identification. The representative base-peak chromatogram of the LC/MS analysis is shown in the Supporting Information, Figure S-2. MassAnalyzer identified a large number of peptide ions, among which 394 peptide ions (305 distinct peptides) with relatively strong signals and less interference were automatically selected for further HDX data processing. These peptides gave 100% sequence coverage for both the heavy and light chains of the mAb molecule. The coverage map of these peptides is shown in the Supporting Information.

Back Exchange Correction with Internal Standards for Variance Calculation. Before variance calculation, the difference in back exchange between runs is corrected using the internal back exchange standards. Figure 1 shows the deuterium incorporation of a representative peptide ion with different labeling times before and after the first-approximation back exchange correction. It can be seen that the data for some time points become much tighter (i.e., show less variation) after the correction. The corrected deuterium uptakes are fitted with a five-parameter equation (eq 2) as shown in Figure 1. The residuals between the corrected experimental data points and the fitted curve are used to calculate the variance.

Figure 1. Deuterium incorporation levels of heavy chain peptide 1-22 at 0 and 4 M guanidine, before and after the first-approximation back exchange correction. Duplicate analyses are performed for each time point. The curves are the result of fitting eq 2 to the corrected deuterium incorporation levels. Variances are determined from the residuals of the corrected levels to the fitted curves.

One of the biggest advantages of the platform is that the imprecision of the system is corrected by using the internal
standards. Because unexpected changes in the system over time may introduce large variations, comparisons of HDX data collected weeks or months apart were generally not performed in the past. With these internal standards, however, these longer-term comparisons become more practical.

Back Exchange Modeling. In back exchange modeling, the effective back exchange times for each peptide are determined from the fully exchanged proteolytic peptides in the 100% D control runs and the internal peptide standards in all runs. When pH values of 2.7 (as read from a pH meter) and 2.4 (as calculated from the mobile phase composition) are used for the digestion and LC steps, respectively, both at 1 °C, the determined effective back exchange times average ∼6 min before digestion and ∼25 min after digestion. The total effective back exchange time is generally longer than the actual back exchange time, suggesting either that the true temperature each peptide experiences is higher than the set temperature during a certain stage of the analysis or that there are some unknown mechanisms of back exchange in the system. In this work, the levels of back exchange of all peptides vary in the range of 25−50%, showing some room for improvement. However, because the back exchange process is corrected by using the internal back exchange standards, it has not been a critical issue in our HDX analysis.

Intrinsic HDX Rates. In this study, analysis of the IgG conformational change with increasing guanidine concentration is used to demonstrate our HDX platform. The effect of solution composition (0−4 M guanidine) on intrinsic HDX rates is measured by monitoring the deuterium incorporation curve of the internal peptide standard PPPI. Figure 2 shows the difference in deuterium incorporation into the peptide at different concentrations of guanidine. The obvious difference caused by guanidine clearly suggests the importance of intrinsic exchange rate correction using the internal standard. The deuterium incorporation data are fitted with the HDX model to get the “protection factor” under each HDX condition (shown in Figure 2). These “protection factors” are used to correct the theoretically calculated intrinsic HDX rates of all other amide hydrogens in the protein (eq 10). Note that the peptide PPPI used as the control for the intrinsic exchange rate is assumed to have no conformational protection. Therefore, evidence indicating the absence of protection caused by interaction with the protein under study should ideally be established before use (data not shown).

Figure 2. Deuterium incorporation time courses of peptide PPPI under different concentrations of guanidine at pD 7.1 (uncorrected pH meter reading without guanidine) and 25 °C. Shown in the table are the “protection factors” for the peptide at different guanidine concentrations.

HDX Modeling. As described in the method, the HDX model uses the protection factor of each residue as a variable (eq 11). Optimization of the HDX model in eqs 11−15 using the entire HDX data set (deuterium levels of all the overlapping peptides at multiple labeling time points) leads to the determination of single-residue protection factors for the studied protein. This is the most prominent feature distinguishing this HDX model from other currently available HDX MS data analysis programs. Figure 3 (top) shows the protection factor profile of the antistreptavidin IgG1 heavy chain (with no guanidine in the solution) at the single-residue level based on one of the best solutions of the 4000 optimizations. The fit of the predicted deuterium labeling to the experimental determination is shown in Figure 4 for a typical peptide.

Figure 3. Solutions of the model expressed as a plot of protection factors (logarithmic scale) for each residue in the heavy chain of antistreptavidin: (top) a single solution and (bottom) 20 superimposed solutions.

This HDX model is an underdetermined system because the obtained HDX MS data are at the peptide level rather than the single-residue level. The protection profile shown in Figure 3 (top) is only one of the solutions that fit the HDX data. To get a more accurate representation of the protein conformation, a total of 20 best solutions are derived and overlaid in Figure 3 (bottom). The $\chi_{\mathrm{red}}^2$ values of the 20 solutions are in the range of 3.2−3.4, corresponding to predicted deuterium levels within 2 standard deviations of the experimentally determined levels. If the protection factors could be determined unambiguously, all 20 solutions would give the same answer. However, if the protection factors cannot be determined to single-residue resolution from the experimental data, different values will be obtained among the 20 possible solutions. Figure 3 (bottom) shows that the
dx.doi.org/10.1021/ac300535r | Anal. Chem. 2012, 84, 4942−4949
Analytical Chemistry
Article
Figure 4. Comparison of experimentally determined deuterium levels and levels predicted by the model for the heavy-chain peptide 1-22 at 0 and 4 M of guanidine.
determined protection factors vary dramatically among the 20 solutions in most of the regions, indicating the lack of single residue HDX information in the experimental data. However, it is interesting to note that in a few particular regions, well-defined protection factors do exist among the 20 solutions (also see the Supporting Information, Figure S-3), indicating the availability of near single residue level information in the HDX MS data, most likely due to the presence of many overlapping peptides from the same protein regions. This observation also suggests the promising application of the HDX platform in processing future HDX MS data acquired with electron-transfer dissociation (ETD) technology, a direction of development in this field to obtain real single residue HDX information.38,39 To make practical use of the HDX model, for example, in comparing protein conformations at multiple conditions, the protection profiles of the studied protein from the 20 best solutions are averaged (in logarithm scale) to get a clearer view, as shown in Figure 5 (top) for the antibody heavy chain (0 M guanidine). Now, the level of conformational protection in each region of the protein is clearly presented. It can be seen that the HDX protection profile varies dramatically from residue to residue, indicating a unique folding structure of the antibody heavy chain with distinct solvent accessibilities in different regions. Not surprisingly, the high-protection peaks in the profile correlate with the expected β-sheet structure buried in the core of the general antibody fold (marked by the central disulfide bonds). Such a correlation is more clearly presented by mapping the protection profile onto the common 3-D structure of an IgG1 Fc region, which shows the increased protection near the center of each β sheet (Supporting Information, Figure S-6).

Guanidine Unfolding of IgG1. To further demonstrate the HDX platform, antibody denaturation by increasing guanidine concentration was analyzed by the platform. Antistreptavidin IgG1 conformations at four guanidine concentrations (0, 2, 3, and 4 M) were compared. The entire HDX data set includes 0% and 100% D controls and deuterium labeling at four conditions, each with six labeling time points measured in duplicate. As in Figure 5 (top), the averaged protection profiles of antistreptavidin IgG1 heavy chain determined at 0, 2, 3, and 4 M guanidine are shown in Figure 5 (bottom). Also shown in the Supporting Information are the
Figure 5. The average value of all 20 possible protection factors for each residue in the heavy chain of antistreptavidin IgG1: (top) 0 M guanidine and (bottom) comparison of 0, 2, 3, and 4 M guanidine. The locations of disulfide bonds are shown.
differential plots of protection factors for different residues, showing the difference in protection among different conditions (Figures S-4 and S-5 in the Supporting Information). An overall trend shown in the HDX profiles is the progressive decrease in protection factors with increasing guanidine, consistent with progressive global unfolding by guanidine. More importantly, comparison of the HDX protection profiles in Figure 5 (bottom) also reveals a more detailed picture of how antistreptavidin IgG1 responds locally to the increasing guanidine concentration, down to the domain or even single residue level. It can be clearly seen that the heavy chain CH2 domain starts to lose its native structure first at 2 M guanidine, while all the other domains remain intact. At 3 M guanidine, the CH2 domain almost completely unfolds, while partial denaturation is suggested in the Fab domain, mainly in the VH domain. The CH3 domain, which is generally believed to be the most stable domain in the IgG molecule, only starts to unfold at 4 M guanidine. Therefore, the denaturation process of antistreptavidin IgG1 characterized by our HDX analysis is in excellent agreement with our knowledge of antibody domain stability.40,41 It is also interesting to note that the effect of guanidine on the hinge region is minimal, probably because it is highly exposed even in its native structure. Supporting Information Figure S-6 shows the effect of guanidine mapped to the 3-D structure of an IgG1 Fc. A similar conclusion is made for the light chain (Supporting Information, Figure S-5), in which the variable domain is more readily denatured by guanidine than the constant domain.
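The log-scale averaging of the best solutions and the differential comparison between conditions described in this section can be illustrated with a short sketch. This is only a minimal illustration under stated assumptions, not the MassAnalyzer implementation: the function names, the use of the standard deviation as a measure of how well defined a residue is, and all numerical values are hypothetical.

```python
import math
from statistics import mean, pstdev


def log_average_profile(solutions):
    """Average protection factors across solutions on a log10 scale.

    solutions[s][r] is the protection factor (> 0) of residue r in
    solution s. Returns (mean_log10, spread_log10), one value per residue.
    A small spread means the solutions agree for that residue.
    """
    n_residues = len(solutions[0])
    if any(len(s) != n_residues for s in solutions):
        raise ValueError("all solutions must cover the same residues")
    mean_log, spread_log = [], []
    for r in range(n_residues):
        logs = [math.log10(s[r]) for s in solutions]
        mean_log.append(mean(logs))
        spread_log.append(pstdev(logs))
    return mean_log, spread_log


def differential_profile(reference_log, condition_log):
    """Per-residue change in log10 protection factor versus a reference."""
    return [c - r for r, c in zip(reference_log, condition_log)]


# Hypothetical protection factors: three residues, three solutions each.
native = [[1e4, 1e2, 10.0], [1e4, 1e4, 10.0], [1e4, 1e3, 10.0]]
denatured = [[1e2, 1e1, 10.0], [1e2, 1e3, 10.0], [1e2, 1e2, 10.0]]

native_mean, native_spread = log_average_profile(native)
denatured_mean, _ = log_average_profile(denatured)
delta = differential_profile(native_mean, denatured_mean)
```

In this toy data set, the first residue has identical values in every solution (zero spread), mimicking a well-defined region, whereas the second residue varies over two orders of magnitude. Negative entries in `delta` correspond to loss of protection under the denaturing condition.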
■
CONCLUSIONS
A new platform is developed to monitor structurally resolved protein HDX. The platform employs two types of internal peptide standards to correct both back exchange variation from run to run and intrinsic exchange rates for multiple matrix conditions, with the aim of minimizing random and systematic errors simultaneously. Therefore, the HDX platform is more robust and provides more reliable results. Direct comparison of results obtained on different days or under different experimental conditions is possible with these internal standards. Furthermore, the HDX platform implements a comprehensive HDX model that simulates the H/D exchange and back exchange process at the single residue level. Possible protection factors for each backbone amide hydrogen are automatically derived from the entire HDX MS data set. The entire data analysis process, including ion feature detection, peptide identification, and HDX modeling, is fully implemented in one in-house developed software package, MassAnalyzer, which enables fully unattended and efficient processing (within 1 day). With this fully automated approach, user effort and human error in comparing protein conformation between multiple states are minimized.
■
ASSOCIATED CONTENT
Supporting Information
Additional information as noted in text. This material is available free of charge via the Internet at http://pubs.acs.org.
■
AUTHOR INFORMATION
Corresponding Author
*E-mail: [email protected]. Fax: (805)376-2354.
Notes
The authors declare no competing financial interest.
■
ACKNOWLEDGMENTS
The authors would like to thank Peter Smith of Leap Technologies and Jason Richardson for their help in setting up and customizing the Leap PAL HDX system and Pavel Bondarenko for helpful discussions during the development of the methodology.
■
REFERENCES
(1) Englander, S. W.; Kallenbach, N. R. Q. Rev. Biophys. 1983, 16, 521−655.
(2) Englander, S. W.; Mayne, L.; Bai, Y.; Sosnick, T. R. Protein Sci. 1997, 6, 1101−1109.
(3) Englander, S. W. Annu. Rev. Biophys. Biomol. Struct. 2000, 29, 213−238.
(4) Rosa, J. J.; Richards, F. M. J. Mol. Biol. 1979, 133, 399−416.
(5) Englander, J. J.; Rogero, J. R.; Englander, S. W. Anal. Biochem. 1985, 147, 234−244.
(6) Zhang, Z.; Smith, D. L. Protein Sci. 1993, 2, 522−531.
(7) Smith, D. L.; Deng, Y.; Zhang, Z. J. Mass Spectrom. 1997, 32, 135−146.
(8) Hoofnagle, A. N.; Resing, K. A.; Ahn, N. G. Annu. Rev. Biophys. Biomol. Struct. 2003, 32, 1−25.
(9) Englander, J. J.; Del, M. C.; Li, W.; Englander, S. W.; Kim, J. S.; Stranz, D. D.; Hamuro, Y.; Woods, V. L., Jr. Proc. Natl. Acad. Sci. U.S.A. 2003, 100, 7057−7062.
(10) Englander, S. W. J. Am. Soc. Mass Spectrom. 2006, 17, 1481−1489.
(11) Engen, J. R. Anal. Chem. 2009, 81, 7870−7875.
(12) Konermann, L.; Pan, J.; Liu, Y.-H. Chem. Soc. Rev. 2011, 40, 1224−1234.
(13) Wales, T. E.; Fadgen, K. E.; Gerhardt, G. C.; Engen, J. R. Anal. Chem. 2008, 80, 6815−6820.
(14) Zhang, H.-M.; Bou-Assaf, G. M.; Emmett, M. R.; Marshall, A. G. J. Am. Soc. Mass Spectrom. 2009, 20, 520−524.
(15) Zhang, A.; Qi, W.; Singh, S. K.; Fernandez, E. J. Pharm. Res. 2011, 28, 1179−1193.
(16) Houde, D.; Berkowitz, S. A.; Engen, J. R. J. Pharm. Sci. 2011, 100, 2071−2086.
(17) Kipping, M.; Schierhorn, A. J. Mass Spectrom. 2003, 38, 271−276.
(18) Houde, D.; Arndt, J.; Domeier, W.; Berkowitz, S.; Engen, J. R. Anal. Chem. 2009, 81, 2644−2651.
(19) Weis, D. D.; Engen, J. R.; Kass, I. J. J. Am. Soc. Mass Spectrom. 2006, 17, 1700−1703.
(20) Pascal, B. D.; Chalmers, M. J.; Busby, S. A.; Mader, C. C.; Southern, M. R.; Tsinoremas, N. F.; Griffin, P. R. BMC Bioinf. 2007, 8, 156.
(21) Pascal, B. D.; Chalmers, M. J.; Busby, S. A.; Griffin, P. R. J. Am. Soc. Mass Spectrom. 2009, 20, 601−610.
(22) Slysz, G. W.; Baker, C. A. H.; Bozsa, B. M.; Dang, A.; Percy, A. J.; Bennett, M.; Schriemer, D. C. BMC Bioinf. 2009, 10, 162.
(23) Kan, Z. Y.; Mayne, L.; Sevugan Chetty, P.; Englander, S. W. J. Am. Soc. Mass Spectrom. 2011, 22, 1906−1915.
(24) Liu, S.; Liu, L.; Uzuner, U.; Zhou, X.; Gu, M.; Shi, W.; Zhang, Y.; Dai, S. Y.; Yuan, J. S. BMC Bioinf. 2011, 12 (Suppl 1), S43.
(25) Hamuro, Y.; Anand, G. S.; Kim, J. S.; Juliano, C.; Stranz, D. D.; Taylor, S. S.; Woods, V. L., Jr. J. Mol. Biol. 2004, 340, 1185−1196.
(26) Zhang, Z.; Li, W.; Logan, I. M.; Li, M.; Marshall, A. G. Protein Sci. 1997, 6, 2203−2217.
(27) Althaus, E.; Canzar, S.; Ehrler, C.; Emmett, M. R.; Karrenbauer, A.; Marshall, A. G.; Meyer-Baese, A.; Tipton, J.; Zhang, H.-M. BMC Bioinf. 2010, 11, 424.
(28) Zhang, Z. Anal. Chem. 2009, 81, 8354−8364.
(29) Zhang, Z. J. Am. Soc. Mass Spectrom. 2012, 23, 764−772.
(30) Zhang, Z.; McElvain, J. S. Anal. Chem. 1999, 71, 39−45.
(31) Zhang, Z. Anal. Chem. 2004, 76, 3908−3922.
(32) Zhang, Z. Anal. Chem. 2005, 77, 6364−6373.
(33) Zhang, Z.; Shah, B. Anal. Chem. 2010, 82, 10194−10202.
(34) Zhang, Z. Anal. Chem. 2011, 83, 8642−8651.
(35) Molday, R.; Englander, S.; Kallen, R. Biochemistry 1972, 11, 150−158.
(36) Bai, Y.; Milne, J. S.; Mayne, L.; Englander, S. W. Proteins: Struct., Funct., Bioinf. 1993, 17, 75−86.
(37) Press, W. H.; Flannery, B. P.; Teukolsky, S. A.; Vetterling, W. T. Numerical Recipes; Cambridge Univ Press: Cambridge, U.K., 1986.
(38) Rand, K. D.; Zehl, M.; Jensen, O. N.; Jorgensen, T. J. D. Anal. Chem. 2009, 81, 5577−5584.
(39) Rand, K. D.; Pringle, S. D.; Morris, M.; Engen, J. R.; Brown, J. M. J. Am. Soc. Mass Spectrom. 2011, 22, 1784−1793.
(40) Vermeer, A. W. P.; Norde, W. Biophys. J. 2000, 78, 394−404.
(41) Ionescu, R. M.; Vlasak, J.; Price, C.; Kirchmeier, M. J. Pharm. Sci. 2008, 97, 1414−1426.