
Anal. Chem. 2006, 78, 207-214

Quantitating the Statistical Distribution of Deuterium Incorporation To Extend the Utility of H/D Exchange MS Data

John K. Chik, Jaclyn L. Vande Graaf, and David C. Schriemer*

Department of Biochemistry and Molecular Biology, University of Calgary, Calgary, Alberta, Canada

Measuring the statistical distribution of deuterium incorporated into enzymatically derived peptide fragments provides a valuable dimension to hydrogen/deuterium exchange mass spectrometry data. In this paper, we describe an improvement to the linear least-squares method for determining this distribution: “zeroes” are added to the end of the deuterated isotopic envelope to partially compensate for data truncation due to finite instrumental signal-to-noise ratios. The value of the distribution is demonstrated in a simple experimental example, where the linearity between average deuteration and percent D2O used to label test peptides hides a more complex relationship between the site-labeling probability and the total number of sites. This method offers the opportunity to resolve cases where a single peptide experiences distinct, independent biochemical states, each bearing a unique average deuteration; this can occur when a protein is modified to substoichiometric levels. From the experimentally determined distribution of a heterogeneously deuterated peptide, it was possible to extract the average deuteration of each component of the mixture.

The ability of mass spectrometry (MS) to follow changes in protein and peptide mass as amide hydrogens are exchanged for deuteriums opens new vistas for understanding biological macromolecules through hydrogen/deuterium exchange (HDX) methods (1-3). Unlike HDX NMR, which can in principle track exchange at individual amides, hydrogen/deuterium exchange mass spectrometry (HDX-MS) localizes deuteration sites by monitoring mass changes of enzymatically derived peptide fragments. This enzymatic cleavage is carried out under “quench” conditions (pH ∼2-3 and ∼0 °C) that dramatically reduce HDX rates and thus preserve protein deuteration levels and localization prior to digestion. The most important experimental parameter in HDX-MS is the average deuteration of each enzymatically derived peptide fragment.
Changes in average deuteration as a function of labeling time, biochemical context, or both reveal much about the function of that peptide fragment within the entire protein or complex.

* To whom correspondence should be addressed. Phone: (403) 210-3811. Fax: (403) 270-0834. E-mail: [email protected].
(1) Busenlehner, L. S.; Armstrong, R. N. Arch. Biochem. Biophys. 2005, 433, 34-46.
(2) Smith, D. L. Biochemistry (Moscow) 1998, 63, 285-293.
(3) Smith, D. L.; Deng, Y.; Zhang, Z. J. Mass Spectrom. 1997, 32, 135-146.
10.1021/ac050988l CCC: $33.50. Published on Web 11/16/2005. © 2006 American Chemical Society

The standard method for calculating the average deuteration is to compute the centroid mass of the deuterated peptide’s isotopic envelope and subtract the nondeuterated envelope centroid (4). Alternatively, we and others have used the nondeuterated isotopic envelope to deconvolute the statistical distribution of deuterium incorporation from the deuterated envelope (5-8). From this distribution, it is easy to calculate the average deuteration, which we refer to as the distribution average. These two methods should generate the same value.

In this paper, we describe an improved linear least-squares method for determining the distribution of deuteration. The conventional least-squares methods have been previously described by Zhang and Smith (6) and Zhang et al. (7). Our implementation consists of a suite of scripts (9), which takes the output peak list from the deuterated sample data along with nondeuterated control data to calculate the deuterium distribution. We have incorporated into our linear least-squares method the ability to pad the deuterated data with “zeros” in order to partially compensate for missing data resulting from a finite instrumental signal-to-noise ratio (S/N).

To experimentally validate our method, we compare the measured distribution of deuterium incorporation of “statically” labeled peptides against the theoretically expected binomial behavior (10). In this binomial limit, peptides are described as having N exchangeable hydrogens, each with identical labeling probability p. The average number of incorporated deuteriums, µ, is

$$\mu = Np \tag{1}$$

The standard deviation of the distribution is $\sigma = (Npq)^{1/2}$, where $q = 1 - p$, which when combined with eq 1 can be expressed as

$$\sigma = \sqrt{Npq} = \sqrt{Np(1 - p)} = \sqrt{\mu(1 - \mu/N)} \tag{2}$$
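As a quick numerical illustration of eqs 1 and 2, the following minimal Python sketch evaluates the binomial standard deviation for a given average deuteration and number of sites. The function names `binomial_sigma` and `estimate_sites` are hypothetical and are not part of the CalcDeut suite; `estimate_sites` is simply an algebraic rearrangement of eq 2, $N = \mu^2/(\mu - \sigma^2)$, and is only meaningful when $\mu > \sigma^2$.

```python
import math


def binomial_sigma(mu, n_sites):
    """Standard deviation predicted by eq 2 for average deuteration mu and N sites."""
    if not 0.0 <= mu <= n_sites:
        raise ValueError("mu must lie between 0 and n_sites")
    return math.sqrt(mu * (1.0 - mu / n_sites))


def estimate_sites(mu, sigma):
    """Rearranged eq 2: N = mu^2 / (mu - sigma^2). Illustrative helper only."""
    if mu <= sigma ** 2:
        raise ValueError("no binomial N reproduces this (mu, sigma) pair")
    return mu ** 2 / (mu - sigma ** 2)


# N = 8 sites labeled at p = 0.5 and p = 0.2 (mu = Np from eq 1)
sigma_half = binomial_sigma(8 * 0.5, 8)
sigma_fifth = binomial_sigma(8 * 0.2, 8)
```

For N = 8 these values round to 1.41 and 1.13, the theoretical standard deviations listed for p = 0.5 and p = 0.2 in Table 2.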

The final expression in eq 2 represents a family of curves, parametrized by N, relating the standard deviation of deuterium incorporation to the average. This simple binomial limit is realized through “static” labeling, where peptides are equilibrated against a known percentage of D2O. Mass spectra of these labeled peptides are measured quickly under quenched conditions in order to minimize back-exchange (the reexchange of deuteriums for hydrogens in the quenching solution). In this limit, the labeling probability is the same for all the amides and equal to the equilibrating percentage of D2O used. For a single peptide at different levels of static labeling, the standard deviation versus the average deuteration should fall along one of the curves described in eq 2. The N of the best-match curve would be a good estimate of the number of exchange sites (amide hydrogens) in the peptide.

In this paper, we first test the performance of our method against theoretical data to examine the effects of data thresholding and data padding. In the process of experimentally validating the method against statically labeled peptides, the deuterium distribution reveals a surprisingly complex relationship between the average deuteration and the percentage of labeling D2O used. This is obscured beneath the observed linearity between average deuteration and percentage D2O used. Finally, we use our method to extract the underlying average deuteration from the deuterium distribution of a mixture composed of identical peptides with different levels of deuteration.

(4) Mandell, J. G.; Falick, A. M.; Komives, E. A. Anal. Chem. 1998, 70, 3987-3995.
(5) Chik, J. K.; Schriemer, D. C. J. Mol. Biol. 2003, 334, 373-385.
(6) Zhang, Z.; Smith, D. L. Protein Sci. 1993, 2, 522-531.
(7) Zhang, Z.; Guan, S.; Marshall, A. G. J. Am. Soc. Mass Spectrom. 1997, 8, 659-670.
(8) Jorgensen, T. J.; Gardsvoll, H.; Ploug, M.; Roepstorff, P. J. Am. Chem. Soc. 2005, 127, 2785-2793.
(9) Lutz, M.; Ascher, D. Learning Python, 2nd ed.; O’Reilly: Sebastopol, 2004.
Analytical Chemistry, Vol. 78, No. 1, January 1, 2006

EXPERIMENTAL METHODS

Software: An Overview. The CalcDeut suite consists of three Python (9) scripts that are run in series. The first script (ms2xml.py) translates the peak list file into an eXtensible Markup Language (XML) formatted file. This XML peak list is processed by a script (CollectPeaks.py) that groups individual peaks into isotopic envelopes using various user-supplied parameters such as charge, threshold, and mass error.
Envelopes consist of a series of peaks at the correct m/z separation (within the user-selected mass error) for the given charge, which are greater than the supplied threshold. The list of envelopes produced by this script, along with the list from the nondeuterated control data, is then processed by a third script (CalcDist.py), which produces the final output containing the distribution of deuterium incorporation and average deuteration for each peptide. In addition to these three main scripts, the CalcDeut suite also includes a number of scripts primarily to convert the final output back into flat text files and provide a basic statistical analysis of the calculated deuterium distribution. The files can be downloaded for use from http://www.sams.ucalgary.ca/research/downloads/.

Calculating Peptide Fragment Distribution of Deuteration. Our approach to calculating the distribution of deuteration is graphically represented in Figure 1 and explained below. Let the nondeuterated peptide isotopic envelope (the control envelope) be represented as

$$\mathbf{a}_0 = a_0\mathbf{e}_0 + a_1\mathbf{e}_1 + a_2\mathbf{e}_2 + \dots = \sum_{i=0}^{M-1} a_i\,\mathbf{e}_i$$

where $\mathbf{e}_i$ is a set of orthonormal unit vectors (i.e., $\mathbf{e}_i \cdot \mathbf{e}_j = \delta_{ij}$) representing an individual isotopic mass peak within the envelope, $a_i$ the intensity or area of the ith peak, and M the number of peaks in the nondeuterated isotopic envelope. In this notation, the monoisotopic peak is represented by $a_0\mathbf{e}_0$. The mass increment of adding a deuterium to a peptide is essentially equal to the difference in mass between adjacent isotopic peaks within an envelope at a fixed charge (i.e., the mass difference 13C - 12C ≈ 2H - 1H). Therefore adding one deuterium to the control envelope (i.e., the first degree of deuteration, $\mathbf{a}_1$) can be represented by

$$\mathbf{a}_1 = a_0\mathbf{e}_1 + a_1\mathbf{e}_2 + a_2\mathbf{e}_3 + \dots = \sum_{i=0}^{M-1} a_i\,\mathbf{e}_{i+1}$$

In general, the jth degree of deuteration can be represented as

$$\mathbf{a}_j = \sum_{i=0}^{M-1} a_i\,\mathbf{e}_{i+j}$$

Figure 1. Graphical representation of determining the deuterium distribution using our linear least-squares method (see text for explanation of symbols). Here, the deuterated envelope is the sum of three separate degrees of deuteration, a4, a5, and a6, determined by simply shifting the nondeuterated envelope, a0. The five peaks in the deuterated envelope constrain the scaling factors x4, x5, and x6. Adding a “zero peak” at Δm/z = 9 (i.e., relative to the monoisotopic peak) enables the use of an additional padding envelope, a7.

(10) Snedecor, G. W.; Cochran, W. G. Statistical Methods, 6th ed.; The Iowa State University Press: Ames, IA, 1972.


Let $\mathbf{y}$ represent the deuterated isotopic envelope for the same peptide represented by $\mathbf{a}_0$. The deuterated isotopic envelope can be expressed as

$$\mathbf{y} = \sum_{k=0}^{N-1} y_{k+S}\,\mathbf{e}_{k+S}$$

where S is the first observed degree of deuteration in the deuterated envelope and N the number of peaks. It is also possible to express $\mathbf{y}$ as a linear combination of $\mathbf{a}_j$, where the scalar variables $x_j$ represent the population at the jth degree of deuteration.

$$\mathbf{y} = \sum_{k=0}^{N-1} y_{k+S}\,\mathbf{e}_{k+S} = \sum_{i=0}^{N-M} x_{i+S}\,\mathbf{a}_{i+S} = \sum_{i=0}^{N-M} x_{i+S} \sum_{j=0}^{M-1} a_j\,\mathbf{e}_{j+i+S} \tag{3}$$

The number of unknowns $x_i$ to be fitted, N - M + 1, is smaller than the number of inputs or constraints N, which suggests that the linear least-squares method could be used to arrive at best-fit values for $x_i$ and, hence, the deuteration distribution. It is important to note that the S, N, and M variables implicitly carry the notion of limited instrument sensitivity. Unlimited sensitivity leads to both N and M being essentially as large as one wishes and S = 0. Experimentally and practically, S ≥ 0 and N and M are finite. A consequence of eq 3 is that the “last” peak in the deuterated envelope, $y_{N-1+S}\,\mathbf{e}_{N-1+S}$, which is the weakest, is solely determined by $x_{N-M+S}\,a_{M-1}\,\mathbf{e}_{N-1+S}$, which is also the weakest peak in the $\mathbf{a}_{N-M+S}$ envelope. As a result, the last deuterated peak disproportionately weights $x_{N-M+S}$ for the entire $\mathbf{a}_{N-M+S}$ envelope, thus perturbing the deuterium distribution. Adding additional unpopulated unit vectors (“padding”) to the high m/z end of the deuterated envelope should greatly improve the deconvolution of the deuteration distribution by compensating for data truncation. Adding Z degrees of padding to eq 3 leads to

$$\mathbf{y} = \sum_{k=0}^{N-1} y_{k+S}\,\mathbf{e}_{k+S} + 0\sum_{k=N}^{N+Z} \mathbf{e}_{k+S} = \sum_{i=0}^{N-M} x_{i+S} \sum_{j=0}^{M-1} a_j\,\mathbf{e}_{j+i+S} + \sum_{i=N-M+1}^{N-M+Z} x_{i+S} \sum_{j=0}^{M-1} a_j\,\mathbf{e}_{j+i+S} \tag{4}$$

The extra terms resulting from the added zeroes are the second sum on each side of eq 4. Padding the deuterated data increases the number of fitting parameters, $x_i$, and should improve the fit between the data and the deconvoluted control isotopic envelopes, $\mathbf{a}_i$, as it acknowledges truncation of the higher degrees of deuteration resulting from finite S/N and the application of thresholds. Since the intensity of these added peaks is set to zero, this constraint should prevent the extra $x_i$ from undergoing large oscillations while still allowing higher degrees of deuteration for fitting. Finally, as an additional constraint, normalizing the deuterated envelope and the nondeuterated control envelope prior to least-squares fitting should result in

$$\sum_i x_i = 1$$

Peptide Model Systems. Peptides used in this study were NSSFGVVIR, NSSFGPVVIR, and NSSFGPPPVVIR (peptides 1, 2, and 3, respectively); these were synthesized by BioSyn Inc. to ∼70% purity and used without further purification. The peptides differ only by the number of prolines incorporated. Varying the number of prolines in the test peptides changes the peptide mass without changing the number of exchangeable hydrogens.

Static Labeling of Peptides. Stock peptide solutions were made by dissolving each peptide in deionized water (18 MΩ) at a concentration of 10 mg/mL. The peptides were then mixed in an approximate 1:1:1 ratio. This mixture was statically labeled by mixing 1 µL of stock peptide into 20 µL of D2O/H2O and equilibrated for a minimum of 24 h. Inclusion of 10 mM sodium acetate buffer (pH 5.5) had no effect on the outcome and was omitted from later trials. For MS, the deuterated peptides were “quenched” by mixing 1 µL with 200 µL of 50% acetonitrile with 0.05% trifluoroacetic acid (TFA) (pH ∼2) on ice. To minimize back-exchange, ∼0.3 µL of the quenched sample was quickly spotted onto a chilled MALDI plate (-15 °C) where an α-cyano-4-hydroxycinnamic acid (CHCA) layer (20 mg/mL in 80% acetone/20% methanol) had previously been spotted and dried. Without drying, the plate was loaded into the MALDI mass spectrometer (Voyager DE STR, Applied Biosystems), where the instrument vacuum-dried the spot. Mass spectra represent the accumulation of 410 shots (18-kV accelerating voltage with a 65% grid voltage and 200-ns extraction delay). The time between quench and initiation of data collection was ∼3 min, with data collection lasting an additional 2-3 min.

Heterogeneously Labeled Peptides. Heterogeneously deuterated samples were formed by rapidly quenching and mixing different labeled solutions of peptide 2. Statically labeled peptide 2 was made by diluting 1 µL of a stock solution (10 mg/mL in H2O) into 20 µL of 0, 25, and 50% D2O to give a final percentage D2O of 0, 23.8, and 47.6%, respectively. After a minimum 24 h of equilibration, 1 µL of 0 and 23.8% labeled solutions was mixed with 400 µL of 50% acetonitrile/0.05% TFA (pH ∼2). A similar sample was also made using 23.8 and 47.6% D2O. As in the previous static labeling experiments, 0.3 µL of the mixture was rapidly spotted on top of a cold, dry layer of CHCA and then dried within the MALDI spectrometer. Data collection proceeded as described above. The resulting deuterated isotopic envelope was deconvoluted using the CalcDeut suite, and the deuterium distribution was then fitted as two binomial distributions.

Theoretical Studies. To test the robustness of the linear least-squares method of deconvoluting, the consequences of thresholding data and control sets, and the beneficial effects of “padding”, theoretically deuterated isotopic envelopes were constructed using the nondeuterated isotopic envelope of the nonapeptide 1 as the control (Table 1). Three theoretical deuterated isotopic envelopes were constructed by convoluting the nondeuterated envelope with an N = 8 binomial distribution for probabilities of 0.2, 0.5, and 0.8. The envelopes were scaled such that the base peak in the envelope was normalized to 1.0 and all other peaks scaled against it. To simulate the effect of limited S/N, peaks from the calculated deuterated and control envelopes were subjected to different cutoffs (i.e., 0.001, 0.01, and 0.1), which reduced the number of peaks used in fitting (Table 1).

RESULTS AND DISCUSSION

Theoretical Studies: Effect of Padding and Thresholding on Least-Squares Fitting. Standard applications of peak selection algorithms in mass spectrometry incorporate a user-defined threshold. When determining the distribution of deuteration using the formalism described above, and statistical figures from this


Table 1. Theoretical Peak List for Nondeuterated and Deuterated Isotopic Envelopes for Peptide 1

| Δm/z (b) | nondeuterated envelope | deuterated, p = 0.2 (a) | deuterated, p = 0.5 (a) | deuterated, p = 0.8 (a) |
|---|---|---|---|---|
| 0 | 1.000000 | 0.330000 | 0.009504 | 0.000005 |
| 1 | 0.538953 | 0.847319 | 0.081156 | 0.000160 |
| 2 | 0.168547 | 1.000000 | 0.308697 | 0.002290 |
| 3 | 0.038665 | 0.732175 | 0.688839 | 0.018846 |
| 4 | 0.007152 | 0.375264 | 1.000000 | 0.098040 |
| 5 | 0.001121 | 0.144379 | 0.991343 | 0.332683 |
| 6 | 0.000153 | 0.043765 | 0.687666 | 0.731823 |
| 7 | 0.000019 | 0.010849 | 0.339004 | 1.000000 |
| 8 | 0.000002 | 0.002266 | 0.121311 | 0.776599 |
| 9 | | 0.000408 | 0.032866 | 0.306383 |
| 10 | | 0.000065 | 0.007154 | 0.083639 |
| 11 | | 0.000009 | 0.001305 | 0.017755 |
| 12 | | 0.000001 | 0.000205 | 0.003121 |
| 13 | | 1.10 × 10^-7 | 0.000028 | 0.000471 |
| 14 | | 8.44 × 10^-9 | 0.000003 | 0.000063 |
| 15 | | 4.23 × 10^-10 | 3.33 × 10^-7 | 0.000007 |
| 16 | | 1.02 × 10^-11 | 1.90 × 10^-8 | 6.45 × 10^-7 |

(a) Theoretically deuterated isotopic envelopes at labeling probabilities of p = 0.2, 0.5, and 0.8 and N = 8. (b) m/z relative to monoisotopic peak (Z = 1).
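The construction of the theoretical envelopes in Table 1 can be sketched in a few lines of Python. This is a minimal illustration that assumes NumPy is available; the helper names (`binomial_pmf`, `theoretical_envelope`, `centroid`) are hypothetical and are not taken from the CalcDeut scripts. It convolutes the nondeuterated envelope of peptide 1 with an N = 8 binomial distribution and rescales so that the base peak equals 1.0, as described in the Theoretical Studies section.

```python
from math import comb

import numpy as np

# Nondeuterated envelope of peptide 1, copied from Table 1
CONTROL = np.array([1.000000, 0.538953, 0.168547, 0.038665, 0.007152,
                    0.001121, 0.000153, 0.000019, 0.000002])


def binomial_pmf(n_sites, p):
    """Population of each degree of deuteration, d = 0..n_sites."""
    return np.array([comb(n_sites, d) * p**d * (1.0 - p)**(n_sites - d)
                     for d in range(n_sites + 1)])


def theoretical_envelope(control, n_sites, p):
    """Convolute the control envelope with a binomial; base peak scaled to 1."""
    envelope = np.convolve(control, binomial_pmf(n_sites, p))
    return envelope / envelope.max()


def centroid(envelope):
    """Intensity-weighted mean peak index (in units of delta m/z)."""
    index = np.arange(len(envelope))
    return float(np.sum(index * envelope) / np.sum(envelope))


env_half = theoretical_envelope(CONTROL, 8, 0.5)
centroid_shift = centroid(env_half) - centroid(CONTROL)
```

Without any thresholding, `centroid_shift` equals Np = 4.0, which is the centroid route to the average deuteration; the individual peaks of `env_half` can be compared against the p = 0.5 column of Table 1.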

distribution, thresholds can have a strongly negative impact on the quality of the measurements. Applying a least-squares fitting routine using the theoretical nondeuterated envelope thresholded to 0.001 (6 isotopic peaks, Table 1), and fitting it to the deuterated data set thresholded to 0.1 (7 isotopic peaks, Table 1) results in only two degrees of freedom, a poor fit of the data (Figure 2a), and an inaccurate measure of average peptide deuteration. When padding is incorporated, there is a significant improvement in the distribution-calculated average deuteration and standard deviation. This is shown in Table 2 for a range of thresholds; the scenario described above is represented by entry B. In this underdetermined situation, both the distribution-calculated average and the deuterium distribution as measured by its standard deviation differ significantly from both the theoretical and centroid average (theoretical average deuteration 4.00, centroid average 4.03, and distribution average 2.83). With three degrees of padding, the distribution average is 4.00, in agreement with the theoretical average deuteration. The conventional centroid average is robust in the face of deuterated and nondeuterated data thresholding (Table 2). At the other extreme of thresholding where the problem is overdetermined, the padding has little effect. Reversing the thresholds of the previous example results in a situation where the deuterated envelope is represented by 12 data peaks (p ) 0.5) and the nondeuterated envelope by only 3 peaks (Table 2, entry A). The distribution average of deuteration is relatively insensitive to the degree of padding in this situation, and the calculated degree of deuteration offers a good fit to the theoretical distribution (Figure 2b). The fitting produced a finite population for the ninth degree of deuteration even at zero padding, even though the deuterated data were generated using only eight exchangeable sites. 
This should exclude the existence of that degree of deuteration. Fortunately, this extra deuteration state has a very low population and does not affect the calculated averages or standard deviation (Table 2, entry A). Additional deuteration states resulting from padding the data were negligibly populated and well behaved. Only with heavy padding did a small, negative occupancy result (Figure 2b, inset). Table 2 presents similar conclusions using two different probabilities of deuterium incorporation (p = 0.2, 0.8). On the basis of this theoretical study, we selected two degrees of padding to fit the experimental deuterated peptide data sets described below. We believe this represents a good compromise between improving the calculated distribution and minimizing the introduction of spurious, nonphysical degrees of deuteration. In the following sections we show the utility of calculating the distribution of deuterium incorporation.

Figure 2. Effect of padding on measuring the degrees of deuteration. Hashed bars represent the theoretical population of each degree of deuteration assuming p = 0.5 and N = 8. Symbols represent degrees of padding (▲, 0; △, 1; ■, 2; □, 3). (a) The beneficial effects of padding the data set, where the degrees of freedom are low due to a high data set threshold. The envelope of the theoretical data set was thresholded to 0.1 and the reference data thresholded to 0.001. (b) The benign effects of padding the data set, where the degrees of freedom are high. The envelope of the theoretical data set was thresholded to 0.001 and the reference data thresholded to 0.1.

Static Deuteration of Peptides. The expected binomial behavior of statically labeled peptides provides an opportunity to experimentally validate our deconvolution method using peptides 1-3. As these peptides present identical numbers of exchangeable hydrogens, we could average the results from all peptides at a given percent deuteration level and show that centroid and distribution average deuterations were the same, within error (Table 3). Linearity was observed between the average deuteration (distribution-calculated or centroid) and the fractional amount of D2O used to label the peptides (Figure 3 inset). The best linear fit yielded a slope of 8.213 and intercept of -0.016. This implies that, at 100% deuteration, the average deuteration should be 8.2, which is essentially the number of backbone amides. Naively, this suggests that there is no significant

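Table 2. Effect of Data Truncation and Padding

| p | µ(σ) (a) | data threshold (b) | control threshold (b) | centroid average | distribution average, pad = 0 (c) | pad = 1 | pad = 2 | pad = 3 | entry (d) |
|---|---|---|---|---|---|---|---|---|---|
| 0.2 | 1.60(1.13) | 0.001(9) | 0.001(6) | 1.60 | 1.48(0.98) | 1.57(1.09) | 1.60(1.12) | 1.60(1.13) | |
| 0.2 | 1.60(1.13) | 0.001(9) | 0.01(4) | 1.62 | 1.61(1.14) | 1.62(1.15) | 1.62(1.16) | 1.62(1.16) | |
| 0.2 | 1.60(1.13) | 0.001(9) | 0.1(3) | 1.67 | 1.67(1.20) | 1.67(1.21) | 1.67(1.21) | 1.67(1.21) | |
| 0.2 | 1.60(1.13) | 0.01(8) | 0.001(6) | 1.60 | 1.24(0.76) | 1.48(0.98) | 1.57(1.09) | 1.60(1.12) | |
| 0.2 | 1.60(1.13) | 0.01(8) | 0.01(4) | 1.61 | 1.58(1.09) | 1.61(1.14) | 1.61(1.15) | 1.61(1.15) | |
| 0.2 | 1.60(1.13) | 0.01(8) | 0.1(3) | 1.66 | 1.66(1.18) | 1.67(1.20) | 1.67(1.20) | 1.67(1.19) | |
| 0.2 | 1.60(1.13) | 0.1(6) | 0.001(6) | 1.53 | N/A | 0.80(0.40) | 1.24(0.76) | 1.48(0.98) | |
| 0.2 | 1.60(1.13) | 0.1(6) | 0.01(4) | 1.55 | 1.24(0.76) | 1.48(0.98) | 1.57(1.09) | 1.58(1.09) | |
| 0.2 | 1.60(1.13) | 0.1(6) | 0.1(3) | 1.60 | 1.50(0.99) | 1.61(1.11) | 1.63(1.14) | 1.59(1.05) | |
| 0.5 | 4.00(1.41) | 0.001(12) | 0.001(6) | 4.00 | 3.92(1.32) | 3.99(1.40) | 4.00(1.41) | 4.00(1.41) | |
| 0.5 | 4.00(1.41) | 0.001(12) | 0.01(4) | 4.02 | 4.02(1.43) | 4.02(1.43) | 4.02(1.44) | 4.02(1.43) | |
| 0.5 | 4.00(1.41) | 0.001(12) | 0.1(3) | 4.07 | 4.07(1.47) | 4.07(1.48) | 4.07(1.48) | 4.07(1.48) | A |
| 0.5 | 4.00(1.41) | 0.01(9) | 0.001(6) | 4.00 | 3.31(0.89) | 3.71(1.14) | 3.93(1.31) | 4.00(1.38) | |
| 0.5 | 4.00(1.41) | 0.01(9) | 0.01(4) | 4.02 | 3.93(1.31) | 4.0(1.39) | 4.02(1.41) | 4.02(1.41) | |
| 0.5 | 4.00(1.41) | 0.01(9) | 0.1(3) | 4.07 | 4.05(1.41) | 4.07(1.48) | 4.08(1.45) | 4.07(1.44) | |
| 0.5 | 4.00(1.41) | 0.1(7) | 0.001(6) | 4.03 | 2.83(0.38) | 3.39(0.77) | 3.79(1.05) | 4.00(1.23) | B |
| 0.5 | 4.00(1.41) | 0.1(7) | 0.01(4) | 4.05 | 3.79(1.05) | 4.00(1.23) | 4.07(1.31) | 4.07(1.31) | |
| 0.5 | 4.00(1.41) | 0.1(7) | 0.1(3) | 4.11 | 4.03(1.24) | 4.11(1.33) | 4.12(1.35) | 4.09(1.29) | |
| 0.8 | 6.40(1.13) | 0.001(11) | 0.001(6) | 6.40 | 6.17(0.96) | 6.40(1.13) | 6.40(1.13) | 6.40(1.13) | |
| 0.8 | 6.40(1.13) | 0.001(11) | 0.01(4) | 6.42 | 6.41(1.14) | 6.41(1.15) | 6.42(1.16) | 6.42(1.16) | |
| 0.8 | 6.40(1.13) | 0.001(11) | 0.1(3) | 6.47 | 6.47(1.20) | 6.47(1.21) | 6.47(1.20) | 6.47(1.21) | |
| 0.8 | 6.40(1.13) | 0.01(9) | 0.001(6) | 6.40 | 5.65(0.68) | 6.17(0.95) | 6.40(1.12) | 6.40(1.12) | |
| 0.8 | 6.40(1.13) | 0.01(9) | 0.01(4) | 6.42 | 6.41(1.12) | 6.41(1.13) | 6.41(1.14) | 6.42(1.14) | |
| 0.8 | 6.40(1.13) | 0.01(9) | 0.1(3) | 6.47 | 6.45(1.16) | 6.47(1.20) | 6.48(1.20) | 6.47(1.18) | |
| 0.8 | 6.40(1.13) | 0.1(5) | 0.001(6) | 6.41 | N/A | 5.00(NA) | 5.80(0.40) | 6.29(0.78) | |
| 0.8 | 6.40(1.13) | 0.1(5) | 0.01(4) | 6.43 | 5.80(0.40) | 6.29(0.78) | 6.51(0.97) | 6.47(0.93) | |
| 0.8 | 6.40(1.13) | 0.1(5) | 0.1(3) | 6.48 | 6.30(0.78) | 6.53(0.98) | 6.51(0.96) | 6.4(0.84) | |

(a) Probability, and “theoretical” average(standard deviation). (b) Threshold(number of peaks). (c) Average(standard deviation). (d) Entries A and B refer to text.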
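The effect of padding reported for entry B of Table 2 can be explored with the following minimal sketch, which sets up eqs 3 and 4 as an ordinary unconstrained linear least-squares problem. It assumes NumPy, and the function names `deconvolve` and `distribution_average` are hypothetical; the authors' CalcDist.py may differ in normalization and other details. Each column of the design matrix is the thresholded control envelope shifted by one more degree of deuteration, and padding simply appends zero-intensity peaks to the deuterated data.

```python
import numpy as np


def deconvolve(deuterated, control, pad=0):
    """Least-squares populations x_S, x_(S+1), ... following eqs 3 and 4.

    deuterated: observed peaks, starting at the first observed degree S
    control:    thresholded nondeuterated envelope (M peaks)
    pad:        number of zero-intensity peaks appended (Z in eq 4)
    """
    y = np.concatenate([np.asarray(deuterated, float), np.zeros(pad)])
    control = np.asarray(control, float)
    m = len(control)
    n_unknowns = len(y) - m + 1          # N - M + 1 (+ Z with padding)
    if n_unknowns < 1:
        raise ValueError("control envelope is longer than the data")
    design = np.zeros((len(y), n_unknowns))
    for i in range(n_unknowns):
        design[i:i + m, i] = control     # envelope a_(i+S): control shifted by i
    x, *_ = np.linalg.lstsq(design, y, rcond=None)
    return x


def distribution_average(x, first_degree):
    """Population-weighted mean degree of deuteration."""
    degrees = first_degree + np.arange(len(x))
    return float(np.sum(degrees * x) / np.sum(x))


# Entry B: p = 0.5 envelope thresholded at 0.1 (7 peaks, S = 2) and
# control thresholded at 0.001 (6 peaks); values copied from Table 1.
data_b = [0.308697, 0.688839, 1.000000, 0.991343, 0.687666, 0.339004, 0.121311]
control_b = [1.000000, 0.538953, 0.168547, 0.038665, 0.007152, 0.001121]

avg_pad0 = distribution_average(deconvolve(data_b, control_b, pad=0), 2)
avg_pad3 = distribution_average(deconvolve(data_b, control_b, pad=3), 2)
```

With no padding only two populations can be fitted, and the resulting average is far below the theoretical 4.00; with three degrees of padding five populations are available and the average should move toward the theoretical value, in line with the trend in Table 2.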

Table 3. Deuterium Incorporation and Deuterium Labeling Probabilities from Static D2O Labeling of Peptides 1-3

| deuteration (c) | centroid average (a) | distribution average (a) | p, N = 8 (b) | p, N = 9 (b) | p, N = 10 (b) |
|---|---|---|---|---|---|
| 0.9524 | 7.79 ± 0.16(16) | 7.74 ± 0.16 | 0.8808(7.579) | 0.8248(0.841) | 0.7532(0.613) |
| 0.7143 | 6.06 ± 0.09(10) | 6.03 ± 0.10 | 0.7099(2.296) | 0.6453(1.120) | 0.5842(1.065) |
| 0.4762 | 3.90 ± 0.24(18) | 3.87 ± 0.22 | 0.4733(0.465) | 0.4241(0.230) | 0.3839(0.235) |
| 0.2381 | 1.92 ± 0.10(15) | 1.90 ± 0.10 | 0.2390(0.053) | 0.2135(0.077) | 0.1929(0.120) |
| 0.0952 | 0.78 ± 0.05(14) | 0.80 ± 0.06 | 0.1001(0.0476) | 0.0894(0.0432) | 0.0807(0.042) |
| 0.0476 | 0.35 ± 0.06(17) | 0.36 ± 0.05 | 0.0476(0.080) | 0.0424(0.080) | 0.0382(0.081) |

(a) Average deuterium incorporation ± standard deviation(number of observations). (b) Labeling probabilities(reduced χ2, i.e., sum of squares/number of degrees of freedom) for different values of N, using eq 5. (c) Represents the fractional amount of D2O in the equilibrating solution.
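The kind of single-binomial fit summarized in Table 3 can be illustrated with a small sketch. The paper does not describe the fitting routine, so the following assumes a plain grid search over p that minimizes the sum of squared differences between a normalized distribution and the binomial of eq 5 at fixed N. The function names and the synthetic input (an ideal N = 10, p = 0.75 distribution) are hypothetical and are not experimental data.

```python
from math import comb

import numpy as np


def binomial_pmf(n_sites, p):
    """Eq 5 evaluated for d = 0..n_sites."""
    d = np.arange(n_sites + 1)
    coeff = np.array([comb(n_sites, int(k)) for k in d], dtype=float)
    return coeff * p**d * (1.0 - p)**(n_sites - d)


def fit_p(distribution, n_sites, grid=None):
    """Return (best p, sum of squares) for a fixed number of sites."""
    dist = np.asarray(distribution, float)
    dist = dist / dist.sum()
    if grid is None:
        grid = np.linspace(0.001, 0.999, 999)
    size = max(len(dist), n_sites + 1)
    observed = np.zeros(size)
    observed[:len(dist)] = dist
    best_p, best_sse = None, np.inf
    for p in grid:
        model = np.zeros(size)
        model[:n_sites + 1] = binomial_pmf(n_sites, p)
        sse = float(np.sum((observed - model) ** 2))
        if sse < best_sse:
            best_p, best_sse = float(p), sse
    return best_p, best_sse


# Synthetic distribution with ten equivalent sites (illustrative only)
synthetic = binomial_pmf(10, 0.75)
p10, sse10 = fit_p(synthetic, 10)
p8, sse8 = fit_p(synthetic, 8)
```

Fitting the synthetic distribution with too few sites (N = 8) leaves a larger residual than fitting with the correct N = 10, which mirrors the way the reduced χ2 values in Table 3 are used to compare candidate values of N.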

back-exchange in the ∼3-4 min between quench and data collection. The standard deviation of the distribution reveals a more complex situation underneath this simple linear relationship and highlights the usefulness of our deconvolution approach. Plotted along with the experimental data in Figure 3 is the “ideal” (equilibrium) binomial relationship between the average and standard deviation of a binomial distribution for N sites of deuteration (eq 2). Up to an average of four deuteriums, the data from the peptides scatter around the N = 8 curve, as expected. However, at higher levels of deuteration (i.e., >5), the data deviate from this curve, suggesting that additional exchangeable states are populated. Since the average deuteration is the labeling probability, p, multiplied by the number of sites N (eq 1), the observed linearity between the average deuteration and the labeling level (Figure 3 inset) suggests a coincidence, where the lower labeling probability may be compensated by a larger number of sites. In addition to the eight amide backbone sites, the peptides possess fast-exchanging hydrogens from the arginine side chain and the N- and C-termini (11). It is possible that, at high labeling levels, a sufficient number of these fast-exchanging sites present detectable deuteration levels during the time between quenching and data collection. This phenomenon, of reduced labeling probability despite an average deuteration that agrees with the composition of D2O in the equilibrating solution, was explored in greater detail. The experimentally determined deuterium distributions in Figure 4 can be fit to single-binomial distributions with different values of N, according to eq 5, where I(d) is the “intensity” of the dth degree of deuteration

$$I(d) = \left[\frac{N!}{(N - d)!\,d!}\right] p^{d}(1 - p)^{N-d} \tag{5}$$

and p the labeling probability. Figure 4a shows the deuterium distribution at 95.2% D2O labeling averaged over all three peptides. Clearly, the N = 8 binomial fits this distribution poorly, whereas increasing N improves the fit to the distribution. Figure 4b shows the deuterium distribution at the 47.6% deuterium labeling level fitted to N = 8 and N = 9 binomials. The deuterium labeling probabilities resulting from these binomial fits to the experimental data are shown in Table 3. At low average deuteration, the curves as given by eq 2 merge into one another. This implies that improvements in fitting the distribution as a function of N would be less dramatic than that seen for high levels of deuteration. The consistently best-fit values for the calculated labeling probability arise from N = 10; these are plotted in Figure 4b (inset) against percent deuterium labeling, as in Figure 3 (inset). The dotted line represents the maximum deuterium incorporation possible; therefore, the deviation between the dotted and the best-fit labeling probabilities represents the actual back-exchange in these experiments. This simple example highlights the potential importance of calculating the distribution of deuterium incorporation, as the appearance of back-exchange can readily go unnoticed when measuring only the average deuteration.

(11) Bai, Y.; Milne, J. S.; Mayne, L.; Englander, S. W. Proteins 1993, 17, 75-86.

Figure 3. Standard deviation (σ) of the distribution of deuterium incorporation as a function of the average deuteration (µ) for the three peptides (△, 1; ●, 2; □, 3). The smooth curves show the “ideal” relationship between µ and σ (eq 2) as a function of the number of identical exchangeable sites, N. Inset: the average deuteration as a function of fractional D2O labeling levels. The straight line represents the best linear fit, yielding a slope of 8.213 and intercept of -0.016.

Figure 4. Fitting the experimentally determined deuteration distribution for peptides 1, 2, and 3 to a binomial distribution, at different percent D2O incorporation. (a) 95.2% D2O and (b) 47.6% D2O. Binomial fitting with N = 8 (dashed), N = 9 (solid), and N = 10 (dotted). For clarity, only the results of binomial fitting for N = 8 (dashed) and N = 9 (solid line) are shown in (b). The inset plots the fitted p versus the degree of D2O labeling for N = 10. Distribution data shown as hashed bars ± standard deviation.

Figure 5. Standard deviation (σ) as a function of the average deuteration (µ) for heterogeneously deuterated peptide 2 compared with homogeneously labeled peptide 2. The mixture consisting of 0 and 23.8% deuteration is shown as ■ in box a. Box b highlights the 23.8 and 47.6% mixture. The homogeneous static results of peptide 2 from Figure 3 are plotted as □. The smooth lines represent eq 2 with indicated N’s.

Heterogeneous Deuterated Peptides. Plotting the experimental standard deviation of deuteration σ versus the average deuteration µ, for a peptide in a mixture representing more than one labeling state, should highlight heterogeneity in labeling through a greater than expected standard deviation. Figure 5 shows a plot of σ versus µ for peptide 2 labeled at different percentages of D2O and mixed in equal amounts, relative to the control where the same peptide was labeled at one percentage of D2O. In this example, a standard deviation greater than the control value is sufficient reason to suspect heterogeneous labeling, as is clearly demonstrated. Here, the distribution of deuterium incorporation calculated from the experimental data using our padding approach was fit to a double-binomial distribution, eq 6,

$$I(d) = A_1\left[\frac{N!}{(N - d)!\,d!}\right] p_1^{d}(1 - p_1)^{N-d} + A_2\left[\frac{N!}{(N - d)!\,d!}\right] p_2^{d}(1 - p_2)^{N-d} \tag{6}$$

where I(d) is the intensity of the dth degree of deuteration, $p_i$ the labeling probability of the

Table 4. Summary of Double-Binomial Distribution Fit to Deuterium Distributions in Figure 5

| sample | centroid µ (a) | distribution µ (b) | distribution σ (b) | fit A1 | fit p1 | fit A2 | fit p2 |
|---|---|---|---|---|---|---|---|
| 0 + 23.8% | 1.06 ± 0.17 | 1.06 ± 0.15 | 1.36 ± 0.08 | 0.53 | 0.0076 | 0.48 | 0.28 |
| 23.8 + 47.6% | 2.93 ± 0.04 | 2.93 ± 0.06 | 1.68 ± 0.03 | 0.53 | 0.24 | 0.48 | 0.50 |

(a) Centroid average. (b) Distribution average and standard deviation of the distribution. Columns A1, p1, A2, and p2 are the double-binomial fit results.
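As a consistency check on the fitted parameters in Table 4, the double-binomial model of eq 6 can be evaluated directly and its mean and standard deviation compared with the single-binomial prediction of eq 2. The sketch below assumes NumPy, uses hypothetical function names, and takes N = 8 together with the fit results for the 23.8 + 47.6% mixture; it is an illustration rather than the authors' fitting code.

```python
from math import comb

import numpy as np


def double_binomial(n_sites, a1, p1, a2, p2):
    """Eq 6 evaluated for d = 0..n_sites."""
    d = np.arange(n_sites + 1)
    coeff = np.array([comb(n_sites, int(k)) for k in d], dtype=float)
    first = a1 * coeff * p1**d * (1.0 - p1)**(n_sites - d)
    second = a2 * coeff * p2**d * (1.0 - p2)**(n_sites - d)
    return first + second


def mean_and_sigma(intensity):
    """Average and standard deviation of a deuterium distribution."""
    d = np.arange(len(intensity))
    weights = intensity / intensity.sum()
    mu = float(np.sum(d * weights))
    sigma = float(np.sqrt(np.sum(weights * (d - mu) ** 2)))
    return mu, sigma


# Fit results for the 23.8 + 47.6% mixture (Table 4), with N = 8
mixture = double_binomial(8, 0.53, 0.24, 0.48, 0.50)
mu_mix, sigma_mix = mean_and_sigma(mixture)

# Standard deviation a single N = 8 binomial would give at the same average (eq 2)
sigma_single = float(np.sqrt(mu_mix * (1.0 - mu_mix / 8)))
```

The mixture average and standard deviation computed this way (about 2.91 and 1.67) lie close to the measured distribution values in Table 4, while the single-binomial standard deviation at the same average is noticeably smaller, which is the signature of heterogeneous labeling described in the text.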

Figure 6. Deconvoluting mixed deuteration of peptide 2. (a) Mixture representing 0 and 23.8% deuteration; a sample experimental MS is shown in the inset. The distribution was fitted with two N = 8 binomial distributions (solid lines and ●). (b) Mixture representing 23.8 and 47.6% deuteration; a sample experimental MS is shown in the inset. The distribution was fitted with two N = 8 binomial distributions (solid lines and ●). The dashed lines represent the two components of the double-binomial fit. The dotted line represents an attempted fitting using a single-binomial distribution. Distribution data shown as hashed bars ± standard deviation.

ith component of the binary sample, and Ai the relative amount of the ith component of the binary mixture. The accurate selection of N at levels of deuteration