Simulation and Analysis of Differential Scanning Calorimetry Output

In the Laboratory

Simulation and Analysis of Differential Scanning Calorimetry Output: Protein Unfolding Studies 1

Babur Chowdhry
School of Biological and Chemical Sciences, University of Greenwich, Wellington Street, Woolwich, London, SE18 6PF, UK

Stephen Leharne*
School of Environmental Sciences, University of Greenwich, Creek Road, Deptford, London, SE8 3BW, UK

The use of high-sensitivity differential scanning calorimetric instrumentation to study biomolecular energetics is increasingly dealt with in the syllabuses of chemistry and biochemistry undergraduate and graduate degree programs. This coupled with expanding availability of instrumentation means that much of this taught material can be readily illustrated in the laboratory class. Certain aspects of DSC signal interpretation can, however, be problematic. For instance, questions of how the signal arises and selection of an appropriate baseline are not readily resolved in a practical fashion by considerations of experimental data. In this context simulations of output can be useful. In fact, simulations of instrumental output can serve a number of useful pedagogical functions. These include:

• providing an appreciation of how chemical and physical processes give rise to instrumental signals;
• providing an appreciation of how “real” processes may be related to theory learned in class. This may be especially pertinent for aspects of physical chemistry such as thermodynamics;
• permitting increasingly complex signals to be produced while allowing the student to maintain an overall grasp of the underlying phenomena. This is not easily accomplished with practical work, where the student is often given a sample to analyze and is then normally told what various aspects of the signal may indicate. Simulation and analysis may even allow the student to build up expertise in data interpretation;
• finally, acquainting students with the underlying mathematical description of the chemistry, which provides them with the elements of model building and allows a discussion of the principles of model fitting.

For some time our research group has focused its activities on the application of high-sensitivity scanning calorimetry to investigate thermal phenomena in a variety of aqueous macromolecular systems, including protein and peptide solutions. From this has developed an interest in teaching data interpretation to advanced undergraduate and graduate students. Furthermore, since calorimetric data do not usually yield any molecular information, it becomes crucial for developing a deeper understanding to show how molecular processes may produce the signals observed. It has become customary for us to teach these aspects of the subject using protein unfolding as our exemplar system. High-sensitivity differential scanning calorimetry is used to investigate the energetics of thermal unfolding of dilute aqueous protein solutions, in order, it is to be hoped, to provide fundamental insights into the important processes responsible for protein folding. The question of how each protein acquires and maintains its folded conformation remains an important physicochemical problem. The purpose of this article is to outline the major features of our treatment and give some indication of the information to be obtained from DSC signals. For further useful and informative material the reader is referred to works by Chowdhry (1), Sturtevant (2), and Privalov (3).

The Thermodynamics of Protein Unfolding

From the outset it is necessary to state clearly that the series of classes is solely concerned with protein unfolding or denaturation that is under strict thermodynamic control. That is, there are no kinetic limitations. In these circumstances protein unfolding is readily represented by the following equilibrium expression:

$$K = \frac{[\mathrm{U}]}{[\mathrm{N}]} \tag{1}$$

where K is the equilibrium constant for unfolding and [N] represents the concentration of native form and [U] the concentration of unfolded form of the protein. The purpose of scanning calorimetry is to investigate enthalpy changes in systems as a function of changing temperature. This is normally carried out by measuring the power required to keep a sample at the same temperature as a reference material as the temperature of both is increased in a linear fashion. For protein systems, the sample would normally contain protein in buffer and the reference material is normally the buffer solution. As the temperature alters, the protein unfolds in a cooperative fashion. Unfolding arises from the destruction of the numerous small forces that maintain the native protein structure. Clearly the disruption of these forces alters the enthalpy of the system, giving rise to a drop in temperature because the unfolding process is usually endothermic. The scanning calorimeter will provide energy to the sample to maintain its temperature at the same value as that of the reference as the overall temperature rises. This energy input is measured as a power input. Thus the raw output of a scanning calorimeter is a data set of power vs. temperature. Power is readily converted to the apparent molar excess heat capacity using the following equation:

*Corresponding author. FAX number +44 181 316 8205.


Journal of Chemical Education • Vol. 74 No. 2 February 1997

$$\phi C_{p,xs} = \frac{P}{\sigma m} \tag{2}$$

where φCp,xs is the apparent excess heat capacity in J mol⁻¹ K⁻¹ (φ referring to the apparent quantity, subscript p to constant pressure, and subscript xs to the excess quantity), P is power in J s⁻¹, σ is scan rate in K s⁻¹, and m is the amount of protein in the sample cell in mol. In aqueous protein systems the apparent excess heat capacity arises from the change in the enthalpy of the system with temperature. Additional contributions to the heat capacity are made by the native and unfolded protein forms. The latter contributions are significant only if the two heat capacities are different. Initially we shall ignore these latter contributions. Thus the contribution made by protein unfolding to the apparent excess heat capacity is given by:

$$\phi C_{p,xs} = \frac{d(\alpha\,\Delta H)}{dT} \tag{3}$$

where ∆H is the enthalpy change of protein unfolding and α is the extent of conversion, which is expressed as the fraction of total protein in the unfolded form. Accordingly, changes in the apparent excess heat capacity arise from the fractional enthalpy changes that characterize the unfolding process as the temperature is raised. If ∆H is independent of temperature, then eq 3 may be written as:

$$\phi C_{p,xs} = \Delta H\,\frac{d\alpha}{dT} \tag{4}$$

Evaluation of the derivative dα/dT is crucial to the DSC simulation and is done in the following way. The equilibrium expression in eq 1 is readily expressed in terms of α:

$$K = \frac{\alpha}{1-\alpha} \tag{5}$$

Rearrangement produces the following expression for α:

$$\alpha = \frac{K}{1+K} \tag{6}$$

Evidently any change in the equilibrium constant will produce a change in the extent of reaction. Temperature changes provide the necessary driving force for variations in the equilibrium constant, the relationship being given by the van’t Hoff isochore, which is conveniently expressed in its integral form as:

$$K(T_2) = K(T_1)\,\exp\!\left[\frac{\Delta H}{R}\left(\frac{1}{T_1} - \frac{1}{T_2}\right)\right] \tag{7}$$
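As a minimal sketch of how eqs 6 and 7 combine, the following Python fragment tabulates K and α at a few temperatures. Python is used here only for illustration (the classes described in this article use Mathcad or a spreadsheet). The function names, the value of ΔH (200 kJ mol⁻¹), and the reference temperature of 300 K at which K is taken to be 1 are assumptions chosen for the example.

```python
import math

R = 8.314e-3  # gas constant in kJ mol^-1 K^-1 (dH is given in kJ mol^-1)

def equilibrium_constant(T, dH, T_ref):
    """Eq 7 with T1 = T_ref (the temperature at which K = 1) and T2 = T."""
    return math.exp((dH / R) * (1.0 / T_ref - 1.0 / T))

def fraction_unfolded(K):
    """Eq 6: extent of conversion from the equilibrium constant."""
    return K / (1.0 + K)

dH, T_ref = 200.0, 300.0  # illustrative values, not taken from real data
for T in (290.0, 295.0, 300.0, 305.0, 310.0):
    K = equilibrium_constant(T, dH, T_ref)
    print(f"T = {T:5.1f} K   K = {K:10.4g}   alpha = {fraction_unfolded(K):.4f}")
```

The table produced this way shows α moving from near 0 to near 1 over a span of roughly 20 K, which is the cooperative behavior the equilibrium model is meant to capture.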

This equation may be used to calculate the equilibrium constant at any temperature, T, by setting T1 to T1/2 [the temperature at which α = 0.5 and thus K(T1/2) = 1] and T2 to T. Normally, for protein solutions that exhibit two-state thermodynamically reversible behavior, T1/2 is the same as Tm, the temperature at which the apparent excess heat capacity is a maximum. The derivative could therefore be obtained analytically using a combination of eqs 6 and 7. It is, however, advantageous at this stage to introduce a numerical method for obtaining the derivative, which proves to be extremely useful for more complex systems. Essentially, a very small increment in temperature, δT, is specified. We use a value of 5 × 10⁻⁶ K. The derivative at any particular temperature, T, is then obtained by the expression:

$$\frac{d\alpha}{dT} \approx \frac{\alpha(T+\delta T) - \alpha(T-\delta T)}{(T+\delta T) - (T-\delta T)} = \frac{\alpha(T+\delta T) - \alpha(T-\delta T)}{2\,\delta T} \tag{8}$$

where α(T + δT) is the extent of unfolding at temperature T + δT. This function is readily evaluated from a combination of eqs 6 and 7. The apparent excess heat capacity at any temperature, T, is then given by:

$$\phi C_{p,xs} \approx \Delta H\,\frac{\alpha(T+\delta T) - \alpha(T-\delta T)}{2\,\delta T} \tag{9}$$
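Equation 9 can be evaluated directly in a few lines of code. The sketch below is one possible Python rendering, not the Mathcad or spreadsheet implementation used in class. It assumes K(Tm) = 1, a temperature grid from 273 to 373 K in 0.1 K steps, and illustrative ΔH values of 200 and 400 kJ mol⁻¹; all function names are hypothetical.

```python
import math

R = 8.314e-3     # gas constant, kJ mol^-1 K^-1
DELTA_T = 5e-6   # temperature increment quoted in the text, K

def alpha(T, dH, Tm):
    """Fraction unfolded from eqs 6 and 7, taking K(Tm) = 1."""
    K = math.exp((dH / R) * (1.0 / Tm - 1.0 / T))
    return K / (1.0 + K)

def cp_excess(T, dH, Tm, dT=DELTA_T):
    """Eq 9: dH times the central-difference estimate of d(alpha)/dT."""
    return dH * (alpha(T + dT, dH, Tm) - alpha(T - dT, dH, Tm)) / (2.0 * dT)

def simulate(dH, Tm, T_start=273.0, T_end=373.0, step=0.1):
    """Return temperatures (K) and apparent excess heat capacities."""
    n = int(round((T_end - T_start) / step)) + 1
    temps = [T_start + i * step for i in range(n)]
    return temps, [cp_excess(T, dH, Tm) for T in temps]

def trapezoid(x, y):
    """Trapezoidal-rule integral of y over x."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i])
               for i in range(len(x) - 1))

for dH in (200.0, 400.0):   # kJ mol^-1, illustrative values
    temps, cps = simulate(dH, Tm=300.0)
    i_max = max(range(len(cps)), key=cps.__getitem__)
    print(f"dH = {dH:.0f}: peak {cps[i_max]:.2f} kJ mol^-1 K^-1 "
          f"near {temps[i_max]:.1f} K, area {trapezoid(temps, cps):.1f} kJ mol^-1")
```

Two checks are built into the printout: the area under each simulated peak should come out close to the ΔH used to generate it, and the larger ΔH should give the taller, narrower peak. The central difference can also be compared with the analytical result obtained from eqs 6 and 7, dα/dT = α(1 – α)ΔH/(RT²).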

Classroom simulations of the output are normally carried out using the mathematics software Mathcad. For individual student exercises, however, a spreadsheet such as Excel or Quattro Pro is used. Figure 1 shows the kind of output obtained using eq 9. An example Quattro Pro worksheet is also shown in worksheet 1 (Figure 1a). The simulation of φCp,xs is accomplished by filling column A with temperature values (T) between 273 and 373 K. In the next two columns the equilibrium constant is calculated using eq 7: T1 is set equal to T1/2, and T2 is set equal to T + δT in column B and T – δT in column C. In columns D and E, the corresponding values of α are calculated using eq 6. Finally, φCp,xs is calculated in column F using eq 9. The spreadsheet formulas used are shown in worksheet 1. The worksheet reveals the way in which the equilibrium constant changes with temperature and how α alters correspondingly.

Examination of Figure 1 reveals a number of interesting points. First, the simulations produce symmetrical curves that clearly bear out the earlier statement that T1/2 may be identified with Tm, the temperature at which φCp,xs is a maximum. Second, increasing the enthalpy change for the process produces a narrower peak. This is not surprising, since the van’t Hoff isochore indicates that the rate of change of the logarithm of the equilibrium constant with temperature is directly proportional to the enthalpy change observed for the system. Thus the larger the value of ΔH, the greater the rate of change in the equilibrium constant with temperature and, consequently, the narrower the temperature range over which the process starts and finishes.

Output of the type shown in Figure 1 is very rarely, if ever, encountered in practical DSC. Indeed, one would anticipate that a system that is essentially measuring the temperature dependence of heat capacity would contain other heat-capacity contributions to the overall signal. For proteins, such contributions should also come from the heat capacities of the native and unfolded forms. If the heat capacities of the initial and final states are different, we may conclude that the enthalpy of the unfolding process is temperature-dependent. Such temperature dependence is expressed by the Kirchhoff equation:

$$\Delta H(T) = \Delta H(T_m) + \Delta C_p\,(T - T_m) \tag{10}$$

which may be combined with the van’t Hoff isochore and then integrated to produce the following expression for the equilibrium constant K(T):


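$$K(T) = \exp\!\left[\frac{\Delta H(T_m)}{R}\left(\frac{1}{T_m} - \frac{1}{T}\right) + \frac{\Delta C_p(T)}{R}\left(\ln\frac{T}{T_m} + \frac{T_m}{T} - 1\right)\right] \tag{11}$$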
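A quick numerical check can make the link between eqs 10 and 11 concrete. The Python sketch below assumes a temperature-independent ΔCp (the case for which the integration is exact) and uses illustrative values of ΔH(Tm) = 200 kJ mol⁻¹, ΔCp = 2.2 kJ mol⁻¹ K⁻¹, and Tm = 300 K; the function names are hypothetical. It compares the numerical slope of ln K from eq 11 with the van’t Hoff value ΔH(T)/(RT²), using ΔH(T) from eq 10.

```python
import math

R = 8.314e-3  # gas constant, kJ mol^-1 K^-1

def ln_K(T, dH_m, dCp, Tm):
    """Logarithm of eq 11, with a temperature-independent dCp (assumption)."""
    return ((dH_m / R) * (1.0 / Tm - 1.0 / T)
            + (dCp / R) * (math.log(T / Tm) + Tm / T - 1.0))

def dH_of_T(T, dH_m, dCp, Tm):
    """Eq 10 (Kirchhoff): enthalpy of unfolding at temperature T."""
    return dH_m + dCp * (T - Tm)

dH_m, dCp, Tm = 200.0, 2.2, 300.0   # illustrative values
h = 1e-3                            # step for the numerical slope, K
for T in (280.0, 300.0, 320.0):
    slope = (ln_K(T + h, dH_m, dCp, Tm) - ln_K(T - h, dH_m, dCp, Tm)) / (2.0 * h)
    vant_hoff = dH_of_T(T, dH_m, dCp, Tm) / (R * T ** 2)
    print(f"T = {T:.0f} K   d(ln K)/dT = {slope:.6f}   dH(T)/(R T^2) = {vant_hoff:.6f}")
```

The two columns should agree to within the accuracy of the finite difference, and setting ΔCp to zero reduces ln K to the exponent of eq 7.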

It should be noted that the heat capacity change itself may also be temperature-dependent. Equation 6 is used to compute α from the pertinent value for K(T). Once α is obtained, φCp,xs is readily calculated. Since the various heat capacity contributions are additive the apparent excess heat capacity now becomes:


$$\phi C_{p,xs}(T) = \left[\Delta H(T_m) + \Delta C_p(T)\,(T - T_m)\right]\frac{\delta\alpha}{2\,\delta T} + \alpha\,C_{pU}(T) + (1-\alpha)\,C_{pN}(T) \tag{12}$$

where δα = α(T + δT) – α(T – δT), as in eq 8, CpU(T) is the heat capacity of the unfolded protein form at temperature T, and CpN(T) is the heat capacity of the native form at temperature T. As the extent of conversion changes, the contributions of the two protein forms to the overall heat capacity term also change. The change in heat capacity ΔCp used in the Kirchhoff equation (see eq 10) is of course equal to CpU(T) – CpN(T).

Simulations obtained using eq 12 are shown in Figure 2. The most important aspect of the simulated output is the existence of a Cp step. This step, normally referred to as ΔDCp, represents the difference between the heat capacities of the native and unfolded forms of the protein and is visible as a difference in height between the pretransitional and posttransitional portions of the curve. This behavior is typical of proteins that undergo thermodynamically reversible unfolding. Importantly, ΔDCp is always encountered as a positive increment. The unfolding of the peptide ubiquitin, shown in Figure 3, illustrates how closely the DSC examination of protein unfolding resembles the simulated output, indicating that the mathematical treatment contains the essential elements of the processes under investigation.

The usual explanation given for the positive value of ΔDCp is that protein unfolding exposes hydrophobic groups to the aqueous solvent. This exposure causes an increase in water structure, which manifests itself as an increase in hydrogen bonding. The enthalpy release associated with hydrogen bonding thus increases the heat capacity of the system. The existence of this increment has a number of important consequences. First, any process that causes the transition temperature Tm to alter will automatically bring about a change in the enthalpy of unfolding. If Tm increases then so will ΔH.
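A simulation along the lines of eq 12 can be sketched in Python as follows. The heat capacity expressions are those quoted in the Figure 2 caption; their units are not stated there, so kJ mol⁻¹ K⁻¹ is an assumption, as are the function names. ΔCp(T) is evaluated at each temperature inside eq 11, following the notation of eqs 11 and 12, although eq 11 is exact only for a constant ΔCp.

```python
import math

R = 8.314e-3  # gas constant, kJ mol^-1 K^-1

def cp_native(T):
    """CpN from the Figure 2 caption (units assumed: kJ mol^-1 K^-1)."""
    return 0.5 + 1.0e-3 * T

def cp_unfolded(T):
    """CpU from the Figure 2 caption (units assumed: kJ mol^-1 K^-1)."""
    return 1.0e-2 * T

def alpha(T, dH_m, Tm):
    """Fraction unfolded from eqs 6 and 11."""
    dCp = cp_unfolded(T) - cp_native(T)
    ln_k = ((dH_m / R) * (1.0 / Tm - 1.0 / T)
            + (dCp / R) * (math.log(T / Tm) + Tm / T - 1.0))
    K = math.exp(ln_k)
    return K / (1.0 + K)

def cp_apparent(T, dH_m, Tm, dT=5e-6):
    """Eq 12: transition term plus the two-state baseline."""
    a = alpha(T, dH_m, Tm)
    d_alpha = alpha(T + dT, dH_m, Tm) - alpha(T - dT, dH_m, Tm)
    dCp = cp_unfolded(T) - cp_native(T)
    transition = (dH_m + dCp * (T - Tm)) * d_alpha / (2.0 * dT)
    baseline = a * cp_unfolded(T) + (1.0 - a) * cp_native(T)
    return transition + baseline

for T in (270.0, 290.0, 300.0, 310.0, 330.0):
    print(f"T = {T:.0f} K   Cp = {cp_apparent(T, 200.0, 300.0):7.3f}   "
          f"CpN = {cp_native(T):.3f}   CpU = {cp_unfolded(T):.3f}")
```

Well below the transition the simulated signal follows CpN, and well above it the signal follows CpU, so the height difference between the two limbs is the heat capacity step discussed above.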
The second important consequence of the existence of ΔDCp is that it predicts the existence of a second, lower-temperature transition, characterized by the “cold denaturation” temperature (4). This phenomenon is so named because thermodynamic analysis indicates that the process must actually involve unfolding or denaturation on cooling. Figure 4 shows simulated output in which cold denaturation is encountered.

Data Analysis

Figure 1. Differential Scanning Calorimetry Output. (a) Quattro Pro worksheet. (b) Plot of apparent excess heat capacity as a function of temperature for two simulated DSC processes using eq 9. The Tm for both processes was set at 300 K. The values for ∆H are indicated on the plot.


Simulation of DSC output also enables students to appreciate important aspects of data analysis. If the student group can grasp how the key thermodynamic parameters of the processes give rise to the signal, then it becomes relatively straightforward to understand how real data ought to be analyzed in order to derive these parameters. The DSC trace in Figure 3 comprises the unfolding transition obtained for ubiquitin, which represents the contribution of the first term in eq 12. In addition, there is an underlying change that represents the contributions of the second and third terms. These latter changes provide a changing baseline.

Figure 2. Plot of apparent excess heat capacity as a function of temperature for a simulated DSC process using equation 12. The peak was created by setting ΔH to 200 kJ mol⁻¹ and Tm to 300 K. The baseline was created using the following expressions for the heat capacities of the native and unfolded forms of the protein: CpN = 0.5 + T × 10⁻³ and CpU = 10⁻² × T.

Figure 3. DSC trace obtained for the peptide ubiquitin under the following conditions: concentration 5 g dm⁻³ in glycine–HCl buffer at pH 2.45.

The object of data analysis of real DSC traces is to provide numerical values for the parameters outlined in eq 12. This necessitates finding values for Tm, ΔH, and ΔDCp. Tm and ΔDCp are readily obtained from the simulated or real traces. ΔDCp is the heat capacity increment obtained by extrapolating the pretransitional and posttransitional portions of the trace to Tm, the temperature at which φCp,xs is a maximum; it is the difference between the two extrapolated values. To obtain a value for ΔH it is necessary to subtract the second and third terms in eq 12. This means establishing an appropriate baseline for the trace and subtracting it from the data. In Figure 2, the underlying baseline is shown. Clearly it would be extremely useful to be able to produce a similar baseline for a DSC trace armed only with clearly delineated pretransitional and posttransitional portions of the trace.

From a pedagogical point of view a good deal of time is spent investigating this aspect of data analysis. Usually this involves producing simulated data for student use. The students are then expected to construct a baseline mathematically between what is judged to be the beginning and the end of the transition. The baseline data set that is prepared consists of the following data subsets:

(i) data points comprising the pretransitional portion of the simulated trace;
(ii) data points computed from the mathematically constructed baseline (the temperature values of the data points in this region are used in an appropriate equation, which describes the baseline, to generate underlying φCp,xs values); and
(iii) data points comprising the posttransitional portion of the simulated trace.
This baseline data set is subtracted from the simulated data set and the resultant data set is then numerically integrated using the trapezoidal method. The total area measured under the peak should be very close to the value of ∆H used to simulate the data.
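The baseline construction and integration just described can be sketched in Python. The example assumes the simplest baseline choice, a straight line between two judged points A and B (taken here as 270 and 330 K), and simulated data generated from eq 12 with the Figure 2 parameters. The units of the heat capacity expressions (kJ mol⁻¹ K⁻¹), the choice of A and B, and all function names are assumptions.

```python
import math

R = 8.314e-3  # gas constant, kJ mol^-1 K^-1

def cp_native(T):
    return 0.5 + 1.0e-3 * T      # Figure 2 caption; units assumed

def cp_unfolded(T):
    return 1.0e-2 * T            # Figure 2 caption; units assumed

def alpha(T, dH_m, Tm):
    dCp = cp_unfolded(T) - cp_native(T)
    ln_k = ((dH_m / R) * (1.0 / Tm - 1.0 / T)
            + (dCp / R) * (math.log(T / Tm) + Tm / T - 1.0))   # eq 11
    K = math.exp(ln_k)
    return K / (1.0 + K)                                       # eq 6

def cp_apparent(T, dH_m, Tm, dT=5e-6):
    """Eq 12: simulated trace including the changing baseline."""
    a = alpha(T, dH_m, Tm)
    d_alpha = alpha(T + dT, dH_m, Tm) - alpha(T - dT, dH_m, Tm)
    dCp = cp_unfolded(T) - cp_native(T)
    return ((dH_m + dCp * (T - Tm)) * d_alpha / (2.0 * dT)
            + a * cp_unfolded(T) + (1.0 - a) * cp_native(T))

def straight_baseline(temps, cps, T_a, T_b):
    """Baseline data set: the trace itself outside [A, B], a straight line inside."""
    i_a = min(range(len(temps)), key=lambda i: abs(temps[i] - T_a))
    i_b = min(range(len(temps)), key=lambda i: abs(temps[i] - T_b))
    slope = (cps[i_b] - cps[i_a]) / (temps[i_b] - temps[i_a])
    base = list(cps)
    for i in range(i_a, i_b + 1):
        base[i] = cps[i_a] + slope * (temps[i] - temps[i_a])
    return base

def trapezoid(x, y):
    """Trapezoidal-rule integral of y over x."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i])
               for i in range(len(x) - 1))

temps = [260.0 + 0.1 * i for i in range(801)]            # 260 to 340 K
cps = [cp_apparent(T, 200.0, 300.0) for T in temps]      # dH = 200 kJ mol^-1
base = straight_baseline(temps, cps, 270.0, 330.0)
peak = [c - b for c, b in zip(cps, base)]
print(f"area under peak = {trapezoid(temps, peak):.1f} kJ mol^-1 (simulated with 200)")
```

With these settings the recovered area should lie within a few percent of the ΔH used to generate the trace. Moving A and B, or swapping the straight line for another baseline form, gives a direct feel for how sensitive the result is to the baseline choice.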

It normally comes as a surprise when students are encouraged to investigate the assumption that the baseline is merely a straight line drawn between what is judged to be the beginning and the end of the transition. It is often found that very little error is introduced in this way. The students are encouraged to investigate the magnitude of the errors introduced using this and other baseline options, such as cubic splines and polynomial fitting. Figure 5 shows two baselines fitted to the simulated data. One is a straight line; the other is a cubic polynomial that has been computed using the pretransitional and posttransitional portions of the trace. This aspect of the analysis is readily accomplished on any spreadsheet that allows multiple regression. Once the computed baseline is subtracted, the resultant peak may be integrated, the product of the integration exercise being the transition enthalpy, ΔH. Integration of the area under the curve for both plots gives values close to 200 kJ mol⁻¹, the value used for ΔH in the simulations in the first place.

Information about ΔH is also contained within the dimensional aspects of the trace itself. This provides an alternative method for the calculation of ΔH. Consider:

$$\ln K = \ln\left(\frac{\alpha}{1-\alpha}\right) \tag{13}$$

Differentiation of ln K with respect to α produces

$$\frac{\partial \ln K}{\partial \alpha} = \frac{1}{\alpha} + \frac{1}{1-\alpha} \tag{14}$$

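Combination of eqs 4 and 14 with the van’t Hoff isochore produces:

$$\left(\frac{\partial \ln K}{\partial \alpha}\right)_T \frac{d\alpha}{dT} = \frac{\phi C_{p,xs}}{\Delta H}\cdot\frac{1}{\alpha(1-\alpha)} = \frac{\Delta H}{RT^2} \tag{15}$$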

Figure 4. Simulated plot of cold denaturation obtained by setting ΔH to 200 kJ mol⁻¹, Tm to 350 K, and ΔDCp to 10 kJ mol⁻¹ K⁻¹. The underlying baseline changes are also shown. These were computed as αΔDCp.

When the process is half complete, α = 0.5, T = Tm, and φCp,xs = Cp,m, the maximum value of φCp,xs. Inserting this condition into eq 15 produces the following expression:

$$\Delta H = \sqrt{4RT_m^2\,C_{p,m}} \tag{16}$$
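The peak-height route to ΔH can be checked against a simulated trace. The Python sketch below reads eq 16 as ΔH = √(4RTm²Cp,m), which is what eq 15 gives at α = 0.5. The peak height is generated from eq 9 with assumed values ΔH = 200 kJ mol⁻¹ and Tm = 300 K, and the function names are hypothetical.

```python
import math

R = 8.314e-3  # gas constant, kJ mol^-1 K^-1

def alpha(T, dH, Tm):
    """Fraction unfolded from eqs 6 and 7, taking K(Tm) = 1."""
    K = math.exp((dH / R) * (1.0 / Tm - 1.0 / T))
    return K / (1.0 + K)

def cp_excess(T, dH, Tm, dT=5e-6):
    """Eq 9: apparent excess heat capacity by central difference."""
    return dH * (alpha(T + dT, dH, Tm) - alpha(T - dT, dH, Tm)) / (2.0 * dT)

def enthalpy_from_peak(Tm, cp_max):
    """Eq 16: dH from the heat capacity at the half-conversion point."""
    return math.sqrt(4.0 * R * Tm ** 2 * cp_max)

cp_m = cp_excess(300.0, 200.0, 300.0)   # heat capacity at alpha = 0.5
print(f"Cp,m = {cp_m:.3f} kJ mol^-1 K^-1")
print(f"dH recovered from eq 16 = {enthalpy_from_peak(300.0, cp_m):.2f} kJ mol^-1")
```

For a baseline-subtracted two-state trace this estimate and the integrated area should agree, so comparing the two is a simple consistency check on an analysis.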

Another important aspect of data analysis involves fitting the trace to a mathematical model of the unfolding process. Equation 9 provides the mathematical model of the unfolding process followed by DSC, and it is this equation that is used for the model fitting. For class use a trace is generated using eq 9. Additionally, in order to produce a trace that appears more realistic, the random number function of the spreadsheet or Mathcad is used to produce suitably scaled minor fluctuations in the simulated signal. Students are presented with this data set of φCp,xs against temperature. On the spreadsheet they can then produce a simulated trace using initial estimates of ΔH and Tm, obtained by a preliminary analysis of the data, which are placed in an easily accessible part of the spreadsheet. A column containing the squares of the residuals between the φCp,xs data provided and the simulated data is produced, and the sum of the squares of the residuals is then entered into an easily accessible cell. Using the Optimizer of Quattro Pro or the Solver of Excel, the values of ΔH and Tm may be varied so that the sum of the squares of the residuals approaches zero as closely as possible. The model-fitting exercise should provide values for ΔH and Tm. Values obtained in this way afford some satisfaction, since the entire data set has been used to obtain these estimates.

Concluding Remarks

The object of this article has been to outline the thermodynamic basis for the DSC signals obtained for the equilibrium unfolding of a protein. Equations have been generated that enable DSC simulations to be produced. The fundamental aspects of data analysis have also been outlined. This material forms the basis of a series of classes for advanced undergraduate and graduate students. In a subsequent article it is our intention to outline the thermodynamic and mathematical descriptions of more complex protein unfolding processes and how these may also be analyzed.
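The model-fitting exercise described under Data Analysis can also be sketched outside a spreadsheet. The Python example below is only an illustration: a simple compass (pattern) search stands in for the Quattro Pro Optimizer or Excel Solver, whose internal algorithms are not described in this article. The noise level, random seed, temperature grid, and function names are all assumptions. Initial estimates come from the peak position and from the peak height read through eq 16 as ΔH = √(4RTm²Cp,m).

```python
import math
import random

R = 8.314e-3  # gas constant, kJ mol^-1 K^-1

def cp_model(T, dH, Tm, dT=5e-6):
    """Eq 9, with alpha from eqs 6 and 7 and K(Tm) = 1."""
    def alpha(t):
        K = math.exp((dH / R) * (1.0 / Tm - 1.0 / t))
        return K / (1.0 + K)
    return dH * (alpha(T + dT) - alpha(T - dT)) / (2.0 * dT)

def sum_sq_residuals(params, temps, data):
    """Sum of squared differences between data and the eq 9 model."""
    dH, Tm = params
    return sum((d - cp_model(T, dH, Tm)) ** 2 for T, d in zip(temps, data))

def compass_search(f, start, steps, tol=1e-4, max_iter=5000):
    """Minimize f by trying +/- steps in each parameter, halving steps on failure."""
    x, steps = list(start), list(steps)
    best = f(x)
    for _ in range(max_iter):
        if max(steps) < tol:
            break
        improved = False
        for i in range(len(x)):
            for sign in (1.0, -1.0):
                trial = list(x)
                trial[i] += sign * steps[i]
                value = f(trial)
                if value < best:
                    x, best, improved = trial, value, True
                    break
        if not improved:
            steps = [s / 2.0 for s in steps]
    return x, best

# Trace to be fitted: eq 9 plus small Gaussian noise (seeded for repeatability).
rng = random.Random(1)
temps = [280.0 + 0.5 * i for i in range(81)]             # 280 to 320 K
data = [cp_model(T, 200.0, 300.0) + rng.gauss(0.0, 0.1) for T in temps]

# Preliminary estimates: Tm from the peak position, dH from the peak height.
i_max = max(range(len(data)), key=data.__getitem__)
Tm0 = temps[i_max]
dH0 = math.sqrt(4.0 * R * Tm0 ** 2 * data[i_max])

(dH_fit, Tm_fit), ssr = compass_search(
    lambda p: sum_sq_residuals(p, temps, data), [dH0, Tm0], steps=[5.0, 0.5])
print(f"initial: dH = {dH0:.1f}, Tm = {Tm0:.1f}")
print(f"fitted:  dH = {dH_fit:.2f}, Tm = {Tm_fit:.3f}, SSR = {ssr:.3f}")
```

The fitted values should land close to the 200 kJ mol⁻¹ and 300 K used to generate the trace. A gradient-based or library optimizer would normally be preferred for real data; the compass search is used here only because it is short and needs nothing beyond the standard library.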


Figure 5. Baseline fitting for the plot shown in Figure 2. (a) The result of assuming that the baseline is a straight line. For this diagram a judgment was made about where the onset and finish of the transition occurred (shown as points A and B on the figure). These coordinates were used to generate a straight-line equation. Once the parameters of the line were calculated, the temperature values of the data points between A and B were used to calculate apparent excess heat capacity values, which constituted the baseline data set. These values were then subtracted from the simulated trace. The apparent excess heat capacity values of the data points before A and after B (which constitute the pretransitional and posttransitional portions of the simulation) were set to zero. (b) For this plot it was assumed that the pretransitional and posttransitional portions are capable of being fitted to a cubic polynomial expression. Using the regression option on a spreadsheet, the parameters of the cubic expression were obtained. These were then used in exactly the same way as in (a).

Literature Cited

1. Chowdhry, B. Z.; Cole, S. C. Trends Biotechnol. 1989, 7, 11.
2. Sturtevant, J. Annu. Rev. Phys. Chem. 1987, 38, 463.
3. Privalov, P. Annu. Rev. Biophys. Biophys. Chem. 1989, 18, 47.
4. Privalov, P. Crit. Rev. Biochem. Mol. Biol. 1990, 25, 281.
