Two-Component Polypeptides Modeled with Effective Pair Potentials


J. Phys. Chem. B 2006, 110, 24728-24733

P. Pliego-Pastrana†,‡ and M. D. Carbajal-Tinoco*,†
Departamento de Física, Centro de Investigación y de Estudios Avanzados del IPN, Apartado Postal 14-740, 07000 México D.F., Mexico, and Department of Chemistry, University of California, Berkeley, California 94720-1460
Received: June 19, 2006; In Final Form: August 23, 2006

We present Monte Carlo simulations performed within a model based on a set of distance-dependent effective potentials which are used to describe the interactions between a pair of distinct amino acids. These effective potentials are extracted from experimental correlation functions through the Ornstein-Zernike equations and adequate closure approximations. We focus our attention on the sequences of two specific residues, namely, alanine and glycine. The studied sequences are (a) (Ala)12-(Gly)4-(Ala)12 and (b) three interacting chains of alternating alanines and glycines (with five residues per chain). The resulting structures are combinations of known secondary structures. More importantly, we verify that our simulated structures are in thermodynamic equilibrium by means of an estimation of the density of states.

I. Introduction

The prediction of the three-dimensional structure of a protein based only on the sequence of amino acids and the properties of the surrounding medium has been a challenge for several decades [1]. Moreover, to satisfy the requirements of future industrial or pharmaceutical applications, this prediction must be both efficient and accurate. The literature contains a considerable number of rather different approaches intended to describe how a protein folds. For example, a simplified model consists of two types of monomers (hydrophobic and hydrophilic) on a square lattice. This computationally tractable model allows for the complete enumeration of all possible configurations of short chains. Although unrealistic, this model suggests the important concept of the energy landscape [2]. On the other hand, highly precise models include the interactions between all of the atoms of the protein and the effect of an explicit solvent. In a recent publication, Jayachandran et al. [3] reported an extensive study of the folding trajectories of a protein with 36 residues (villin). The molecular dynamics study of 500 µs, however, required a huge array of computers (a subset of the 200 000 CPUs of Folding@Home) [3]. It is thus desirable to combine the advantages of the two previous proposals. In this paper, we develop a scheme intended to minimize the number of variables while still keeping the most relevant physical information. First, limiting the number of parameters in a model is a straightforward way to reduce the roughness of the energy landscape. Moreover, the information incorporated into our model is a set of interaction energies that are derived from experimental correlation functions. Other successful approaches have included experimental information that can be either discrete (contact potentials) or continuous, as in this work (see, for example, ref 4 for a recent review).
In its present form, our model is the two-component generalization of a previous work in which we were able to reproduce two secondary motifs of a one-component system (polyalanines) [5]. In both cases, we extracted one or more effective pair potentials (EPPs) from a series of crystallographic structures of proteins. Our EPPs depend only on the separation between the centroids of the pairs of residues. They were then used as the main input parameters of a computer simulation. Since the EPPs are knowledge-based, they must contain the averaged information of the solvent, the dissolved ions, and the neighboring amino acids. It is also important to mention that models based on EPPs are expected to reproduce protein structures only for conditions that are identical to the original experimental ones, such as being in the bulk with a pH of 7 ± 1, in our case. For instance, the environment surrounding an alanine inside a protein can be considered to be hydrophobic, to some extent. In other words, our model is not suitable to describe the behavior of polyalanines in aqueous solution, which is still the subject of intense debate in the literature [6,7]. Our model, however, predicts the formation of α-helices for a single chain of bulk-like polyalanines, and we found, in actual proteins, some sequences of polyalanines (located far from the ends) with the α-helical conformation [5]. On the other hand, Soto et al. [8] recently found that the polyalanine monomer preferentially adopts a β-hairpin structure in a fully hydrophobic, nonpolar solvent (cyclohexane). A significant number of knowledge-based models use the contact values between residues to construct empirical potentials [4]. In contrast, the main assumption of our approach is that a protein behaves like a liquid of amino acids in thermodynamic equilibrium; this crucial approximation makes it possible to obtain EPPs.

* To whom correspondence should be addressed. † Centro de Investigación y de Estudios Avanzados del IPN. ‡ University of California.

Our EPP is defined as the distance-dependent interaction energy between two residues that reproduces their corresponding pair correlation function [9,10]. The effective interaction is obtained from experimental correlation functions by means of the Ornstein-Zernike (OZ) equations together with appropriate closure approximations.

J. Phys. Chem. B, Vol. 110, No. 48, 2006. 10.1021/jp0638179 CCC: $33.50 © 2006 American Chemical Society. Published on Web 11/11/2006.

II. Pair Correlation Functions

The aim of this work is to provide a method to estimate the effective interactions between two residues of an arbitrary kind. This general case is especially important since it is the basis on which our model is constructed. Here we focus our attention on the properties of two specific residues, although the method could be used to study other combinations of biologically relevant amino acids. We are interested in the structural properties of two amino acids that are frequently found in proteins, that is, alanine and glycine, which are denoted as 1 and 2, respectively. The three radial distribution functions (g11(r), g12(r), and g22(r)) describing this protein subsystem were obtained from a series of 196 nonhomologous proteins from the Protein Data Bank (PDB). Our selection includes hydrolases, oxidoreductases, ATPases, and GroELs, among others. More importantly, the chosen proteins are of high molecular weight, with each one containing at least 2000 amino acids. We selected such structures because only systems with a large number of elements show a significant reduction of fluctuations and can eventually attain thermodynamic equilibrium. The pair correlation functions are determined as explained below. For a given protein and using the coordinates of the atoms (excluding hydrogen atoms), we find the positions of the centers of mass of the alanines and glycines that belong to it. These positions permit the calculation of the corresponding radial distribution functions, according to the general expression [11],

$$
g_{\gamma\mu}(r) = \frac{1}{\rho N \chi_\gamma \chi_\mu} \left\langle \sum_{i=1}^{N_\gamma} \sum_{j=1}^{N_\mu} \delta\big(\mathbf{r} - (\mathbf{r}_i - \mathbf{r}_j)\big) \right\rangle \tag{1}
$$

where the indices $\gamma$ and $\mu$ refer to species 1 (Ala) or 2 (Gly). The number of particles of species $\gamma$ is denoted as $N_\gamma$, and $N = N_1 + N_2$. The angular brackets denote an ensemble average, while $\mathbf{r}_i$ is the position of the centroid of residue $i$, and $\delta(\mathbf{r})$ is Dirac's delta. The number density is $\rho = N/V$ with $V$ being the total volume, and $\chi_\gamma = N_\gamma/N$. Equation 1, however, is valid only for the case of infinite systems. In our case, it is necessary to add a correction term that takes into account the effect of the finite size of proteins (see the Appendix). We thus extracted the radial distribution functions $g_{\gamma\mu}(r)$ of each of the 196 proteins under analysis. Each protein can be viewed as a distinct configuration of a system. We also take advantage of the following fact. In a previous study [10], we verified, within a good approximation, that the correlation functions between pairs of alanines are independent of variations in the number density, and the same behavior is observed for other residues (including the pairs alanine-glycine and glycine-glycine). This important property allows us to average radial distribution functions characterized by rather different number densities. The averaged functions then show a significant decrease in statistical noise [10]. We should mention that these radial distribution functions do not contain any explicit angular information and that some extra information has to be provided, for example, to take into account chirality. In Figure 1, we present the averaged correlation functions $g_{11}(r)$, $g_{12}(r)$, and $g_{22}(r)$, having mean number densities $\bar{\rho}_1 = (4.3 \pm 1.4) \times 10^{-4}$ and $\bar{\rho}_2 = (3.6 \pm 1.2) \times 10^{-4}$ Å$^{-3}$. Notice that the error bars of these radial distribution functions are smaller than the size of the symbols used to plot $g_{12}(r)$. The curves of Figure 1 have the following relevant features.
Although the molecules of alanine and glycine are very similar in size and have a similar number of atoms (in place of one H atom, alanine has a CH$_3$ group), their corresponding radial distribution functions are clearly different. While the function $g_{11}(r)$ has three relatively sharp and high peaks, $g_{22}(r)$ shows only two wider and shorter peaks. The curve of $g_{12}(r)$ is closer in shape to $g_{22}(r)$, but it still has some resemblance to $g_{11}(r)$. Moreover, since the positions of the maxima are directly related to the geometric attributes of known structural motifs [5,10], it is not surprising that alanines and glycines are usually found in different types of secondary structures [12]. For example, for the case of glycines, the positions of the maxima of the two peaks are $m_a = 3.4 \pm 0.1$ and $m_b = 4.9 \pm 0.1$ Å. The first peak corresponds to the probability of finding two consecutive glycines in the chain, and the second one is related to the formation of secondary structures. Let us consider a triangle formed with these characteristic distances (two sides of length $m_a$ and the other one of length $m_b$). The two identical sides then make an angle of $\theta = 2\arcsin[m_b/(2m_a)] = (92 \pm 6)°$, which is smaller than the 100° required to form perfect α-helices. Indeed, glycines are commonly found in turns and β-sheets [12]. From these correlation functions, we can extract their corresponding EPPs.

Figure 1. Radial distribution functions $g_{\gamma\mu}(r)$ between the centers of mass of two alanines (dotted line), two glycines (continuous line), and the pair alanine-glycine (symbols). Each case is the result of an average of 196 different radial distribution functions, with each one being the characteristic correlation function of a specific protein.

III. Effective Pair Potentials

Let us suppose that the microstructure of the amino acids inside a protein is determined by the OZ equations for $M$ species [11],

$$
h_{\gamma\mu}(r) = c_{\gamma\mu}(r) + \sum_{\nu=1}^{M} \bar{\rho}_\nu \int c_{\gamma\nu}(r')\, h_{\nu\mu}(|\mathbf{r} - \mathbf{r}'|)\, d\mathbf{r}' \tag{2}
$$

with $h_{\gamma\mu}(r) = g_{\gamma\mu}(r) - 1$ and $c_{\gamma\mu}(r)$ being the total and direct correlation functions, respectively. The Fourier space version of eq 2 is an algebraic equation that allows the determination of $\tilde{c}_{\gamma\mu}(q)$ as a function of the experimental $\tilde{h}_{\gamma\mu}(q)$. The Fourier space functions $\tilde{c}_{\gamma\mu}(q)$ are then transformed back to get $c_{\gamma\mu}(r)$. It is necessary to provide additional conditions relating the effective interaction energies $\beta u^{\mathrm{eff}}_{\gamma\mu}(r)$ to the correlation functions, that is [11],

$$
\beta u^{\mathrm{eff}}_{\gamma\mu}(r) = h_{\gamma\mu}(r) - c_{\gamma\mu}(r) - \ln[h_{\gamma\mu}(r) + 1] + b_{\gamma\mu}(r) \tag{3}
$$

where $b_{\gamma\mu}(r)$ are the bridge functions, $\beta = (k_B T)^{-1}$, with $T$ being the absolute temperature, and $k_B$ is Boltzmann's constant. In general, bridge functions do not have analytical expressions and they have to be approximated with certain closure relations. For example, $b_{\gamma\mu}(r) = 0$ is known as the hypernetted-chain (HNC) closure and $b_{\gamma\mu}(r) = c_{\gamma\mu}(r) - h_{\gamma\mu}(r) + \ln[h_{\gamma\mu}(r) - c_{\gamma\mu}(r) + 1]$ leads to the Percus-Yevick (PY) approximation [11]. In this work, we find identical results with both closure relations, and the same agreement was also found in a previous work [10].
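A minimal Python sketch of eq 3 under the two closures named above may make the comparison concrete. The function names (`oz_direct_correlation`, `bridge_function`, `effective_potential`) are illustrative and are not taken from the original work. The relation used in `oz_direct_correlation` is the standard one-component (M = 1) Fourier-space form of eq 2, and the numerical inputs are arbitrary placeholders rather than data.

```python
import math


def oz_direct_correlation(h_q: float, rho: float) -> float:
    """One-component (M = 1) OZ relation in Fourier space at a single q."""
    return h_q / (1.0 + rho * h_q)


def bridge_function(h: float, c: float, closure: str) -> float:
    """Bridge function b(r) at one distance for the HNC or PY closure."""
    if closure == "HNC":
        return 0.0
    if closure == "PY":
        return c - h + math.log(h - c + 1.0)
    raise ValueError(f"unknown closure: {closure!r}")


def effective_potential(h: float, c: float, closure: str = "HNC") -> float:
    """Eq 3: beta*u_eff = h - c - ln(h + 1) + b, evaluated at one distance."""
    return h - c - math.log(h + 1.0) + bridge_function(h, c, closure)


if __name__ == "__main__":
    h, c = 0.5, 0.2  # illustrative values only, not data from the paper
    print(effective_potential(h, c, "HNC"))
    print(effective_potential(h, c, "PY"))
```

In the dilute limit, where c(r) approaches h(r), both closures reduce to -ln g(r). This is one way to see how the two closures can agree when the density dependence is weak.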


At this point, there are at least two possible routes to extract $\beta u^{\mathrm{eff}}_{\gamma\mu}(r)$, and both seem to be equivalent. First, let us recall that we obtained the effective potential between pairs of alanines from the one-component version ($M = 1$) of eqs 2 and 3 (see Figure 2) [10]. For the case of glycines, we follow the same procedure to determine $\beta u^{\mathrm{eff}}_{22}(r)$, which is also plotted in Figure 2. To get the crossed alanine-glycine potential, it is necessary to study the case $M = 2$ in eq 2. The first alternative to compute $\beta u^{\mathrm{eff}}_{12}(r)$ is based on the use of a known direct correlation function in the OZ equation that contains the crossed terms, that is,

$$
\tilde{c}_{12}(q) = \frac{\tilde{h}_{12}(q) - \bar{\rho}_1 \tilde{c}_{11}(q)\,\tilde{h}_{12}(q)}{1 + \bar{\rho}_2 \tilde{h}_{22}(q)} \tag{4}
$$

where $\tilde{c}_{11}(q)$ comes from the previous calculation of the one-component case (route a). On the other hand, the three coupled OZ equations can be solved from the very beginning (route b); thus,

$$
\tilde{c}_{12}(q) = \frac{\tilde{h}_{12}(q)}{1 + \bar{\rho}_1 \tilde{h}_{11}(q) + \bar{\rho}_2 \tilde{h}_{22}(q) + \bar{\rho}_1 \bar{\rho}_2 \left[\tilde{h}_{11}(q)\,\tilde{h}_{22}(q) - (\tilde{h}_{12}(q))^2\right]} \tag{5}
$$

In Figure 2, we also present the potentials $\beta u^{\mathrm{eff}}_{12}(r)$ calculated through eqs 4 and 5. As can be observed, the two EPPs (denoted as Ala-Gly (a) and Ala-Gly (b)) are very similar. This comparison indicates that the analyzed OZ equations are practically decoupled, despite the highly structured shape of the experimental functions $g_{\gamma\mu}(r)$. This result suggests again that the structure of the residues that belong to a protein depends only weakly on the number density. We would like to mention that van der Vaart et al. [13] first pointed out that the many-body effects are not relevant to the process of protein folding. Otherwise, the potentials $\beta u^{\mathrm{eff}}_{12}(r)$ and $\beta u^{\mathrm{eff}}_{22}(r)$ have qualitative similarities between them and they are clearly different from $\beta u^{\mathrm{eff}}_{11}(r)$ (see Figure 2), as in the case of their corresponding radial distribution functions. The continuous versions of these three EPPs, $\beta u^{\mathrm{eff}}_{\gamma\mu}(r)$, are employed to construct a model of polypeptide folding.

IV. Dressed Polymer Model

In this paper, we generalize a model of protein folding that was introduced elsewhere [5]. The case of different species of amino acids is the most general one. However, we included a certain number of restrictions intended to facilitate the calculations. In the following text, we describe the salient features of the dressed polymer model (DPM). An initial approximation is made at the level of the chain backbone. Since the first potential well (see Figure 2) is related to the confinement of two consecutive residues in the main chain, we replaced this well by a thin rigid segment of length $a_{\gamma\mu}$. The ends of the segment thus represent the positions of the centers of mass of two residues of species $\gamma$ and $\mu$. In our simulations, we take $a_{11} = a_{12} = a_{22} = 3.4$ Å.
The rodlike segments are freely jointed to other segments, and the interaction energy between the amino acids belonging to distinct segments is given by continuous curves that were fitted to the experimental EPPs [14]. For the EPP of alanines, here we use the same approximations that were explained in our previous paper (like the suppression of the third well for the case of single chains) [5].
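The geometry just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `toy_potential` is a hypothetical placeholder and not one of the fitted EPPs, the helper names are invented here, and applying the ≈4.2 Å barrier to every pair of residues that do not share a segment is our reading of the Figure 2 caption.

```python
import math

SEGMENT_LENGTH = 3.4  # Å, a_11 = a_12 = a_22 in the text
BARRIER = 4.2         # Å, approximate end of the infinite barrier (Figure 2)


def build_chain(directions, a=SEGMENT_LENGTH):
    """Place residue centroids by summing rigid segment vectors of length a."""
    positions = [(0.0, 0.0, 0.0)]
    for d in directions:
        norm = math.sqrt(sum(x * x for x in d))
        last = positions[-1]
        positions.append(tuple(last[k] + a * d[k] / norm for k in range(3)))
    return positions


def chain_energy(positions, species, pair_potential, barrier=BARRIER):
    """Sum beta*u over residue pairs that do not share a segment (|i - j| >= 2)."""
    total = 0.0
    n = len(positions)
    for i in range(n):
        for j in range(i + 2, n):
            r = math.dist(positions[i], positions[j])
            if r < barrier:
                return math.inf  # inside the barrier: configuration rejected
            total += pair_potential(species[i], species[j], r)
    return total


def toy_potential(kind_i, kind_j, r):
    """Hypothetical stand-in for a fitted EPP; NOT one of the paper's fits."""
    return -1.0 / r


if __name__ == "__main__":
    straight = build_chain([(1, 0, 0)] * 3)  # four residues on a line
    print(chain_energy(straight, "AGAG", toy_potential))
```

Representing the chain as a running sum of segment vectors keeps the bond length fixed by construction, so a trial move only has to change segment directions.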

Figure 2. Effective pair potential between two alanines (diamonds), two glycines (squares), and the pair alanine-glycine (circles, route a; triangles, route b; as explained in the text). To enhance clarity, $\beta u^{\mathrm{eff}}_{11}(r)$ and $\beta u^{\mathrm{eff}}_{22}(r)$ are presented with offsets of $6k_BT$ and $3k_BT$, respectively. The two continuous lines represent the DPM. The thick line is the separation $a_{\gamma\mu}$ between the centers of two amino acids, and the thin line is a Padé-like fit [14] of the experimental EPPs. Notice that, instead of the first potential well, we add an infinite barrier from 0 to ≈4.2 Å.
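The closeness of routes a and b can be examined algebraically at a single wave vector. The sketch below assumes that route a corresponds to the (γ, µ) = (1, 2) component of eq 2 written in Fourier space, solved for the crossed direct correlation function with c̃11 supplied from elsewhere, and that route b is the full 2 × 2 solution. The function names and numerical values are illustrative only.

```python
def coupled_direct_correlations(h11, h12, h22, rho1, rho2):
    """Route b: solve the M = 2 OZ equations at one q; returns (c11, c12, c22)."""
    det = (1 + rho1 * h11) * (1 + rho2 * h22) - rho1 * rho2 * h12 ** 2
    c11 = (h11 * (1 + rho2 * h22) - rho2 * h12 ** 2) / det
    c12 = h12 / det
    c22 = (h22 * (1 + rho1 * h11) - rho1 * h12 ** 2) / det
    return c11, c12, c22


def c11_one_component(h11, rho1):
    """c11 from the one-component (M = 1) OZ equation, as used in route a."""
    return h11 / (1 + rho1 * h11)


def c12_route_a(h12, h22, c11, rho1, rho2):
    """Route a: (1,2) OZ equation solved for c12 with c11 given externally."""
    return h12 * (1 - rho1 * c11) / (1 + rho2 * h22)


if __name__ == "__main__":
    # Densities of the order quoted in the text; h values are made up (Å^3).
    rho1, rho2 = 4.3e-4, 3.6e-4
    h11, h12, h22 = 200.0, 150.0, 100.0
    a = c12_route_a(h12, h22, c11_one_component(h11, rho1), rho1, rho2)
    b = coupled_direct_correlations(h11, h12, h22, rho1, rho2)[1]
    print(a, b, abs(a - b) / abs(b))
```

With the one-component c̃11, the two routes differ only by the term proportional to the product of the two densities and the square of h̃12 in the denominator. For densities of order 10^-4 Å^-3 that term is small, about 0.3% for the placeholder values above, which is consistent with the near decoupling described in the text.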

We have performed Monte Carlo (MC) simulations within the DPM. To reach the eventual configuration of minimum free energy, we implemented additional strategies to facilitate the convergence of the algorithm. For example, the polypeptide is grown progressively (as occurs in the ribosome), adding segments one by one and running a certain number of MC steps between the addition of new segments. Each segment is represented by a vector, and the chain is, thus, the sum of these vectors. Moreover, an MC step consists of randomly moving a random number of segments and then re-forming the chain in a new configuration whose energy is to be tested. This drastic MC move enhances sampling and increases the efficiency of walks in energy space [15]. We use the algorithm of Metropolis [16] with minimization [17]; in other words, the probability of accepting a trial move is given by

$$
P_{\mathrm{acc}} = \min[1, \exp(-\Delta\beta E_{jk})] \tag{6}
$$

where $\Delta\beta E_{jk} = \beta E_k - \beta E_j$ is the change in potential energy between a test conformation $k$ and the previously accepted one $j$, which is chosen to be a local minimum.

Our model requires another ingredient. Since EPPs depend only on the distance, it is necessary to recover chirality by other means. It is possible to discard undesirable configurations through the cross product of the vectors that describe adjacent segments. The vector associated with the pair of residues $(i-1, i)$ is $\mathbf{v}_i = \mathbf{r}_i - \mathbf{r}_{i-1}$, where $\mathbf{r}_i$ is the position of amino acid $i$ ($|\mathbf{v}_i| = a_{\gamma\mu}$). We define a vector $\mathbf{c}_{i+1} = \mathbf{v}_i \times \mathbf{v}_{i+1}$ and a scalar $s = \mathbf{c}_{i+1} \cdot \mathbf{c}_{i+2}$. Thus, configurations with the same chirality have a positive value of $s$. The implementation of this small algorithm, however, leads to either left- or right-handed proteins.

Figure 3. Some of the resulting structures from our simulations. They are identified as follows [20]: For the sequence (Ala)$_{12}$-(Gly)$_4$-(Ala)$_{12}$, I(a), two α-helices and one turn; I(b), one α-helix and two turns. For the case of three interacting chains, each sequence consists of alternating alanines and glycines; II(a), a β-strand; II(b), three turns. All structures were drawn in Rasmol.

We recall that the use of these EPPs is limited to a fixed ionic strength and a unique temperature. Nevertheless, it is possible to deduce some thermodynamic properties from the density of states, $\Omega(\beta E)$, and for this purpose we implemented the algorithm of Wang and Landau [18], which was first used by Rathore and de Pablo [15] for the case of proteins. This algorithm is based on a random walk in energy space. Starting from the structures obtained in the initial part of the MC simulation, new configurations are generated (with the procedure already described) and the transition probability from energy level $\beta E_j$ to $\beta E_k$ is given by

$$
P_{\mathrm{tran}} = \min[1, \Omega(\beta E_j)/\Omega(\beta E_k)] \tag{7}
$$
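The random walk governed by eq 7, together with the multiplicative update of Ω used in this section, can be illustrated on a toy system. The sketch below is not the polypeptide simulation: it assumes a hypothetical set of independent two-state units whose energy is the number of units in the "up" state, so that the exact density of states is a binomial coefficient and the estimate can be checked. The flatness threshold, batch size, and number of stages are arbitrary choices made for the example.

```python
import math
import random


def wang_landau(n_spins=6, n_stages=18, flatness=0.8, seed=1):
    """Estimate ln Omega(E) for E = number of 'up' units among n_spins."""
    rng = random.Random(seed)
    spins = [0] * n_spins
    energy = 0
    ln_omega = [0.0] * (n_spins + 1)  # Omega initialised to unity
    ln_f = 1.0                        # ln of the modification factor f > 1
    for _ in range(n_stages):
        hist = [0] * (n_spins + 1)
        for _batch in range(1000):    # safety cap on batches per stage
            for _step in range(2000):
                i = rng.randrange(n_spins)
                new_energy = energy + (1 - 2 * spins[i])
                # Eq 7: accept with probability min[1, Omega(E_j)/Omega(E_k)]
                log_ratio = ln_omega[energy] - ln_omega[new_energy]
                if math.log(1.0 - rng.random()) < log_ratio:
                    spins[i] = 1 - spins[i]
                    energy = new_energy
                # Accepted or not, the level now occupied is multiplied by f
                ln_omega[energy] += ln_f
                hist[energy] += 1
            if min(hist) >= flatness * sum(hist) / len(hist):
                break                 # histogram is flat enough
        ln_f /= 2.0                   # f_new = sqrt(f_old)
    return [x - ln_omega[0] for x in ln_omega]


if __name__ == "__main__":
    estimate = wang_landau()
    exact = [math.log(math.comb(6, e)) for e in range(7)]
    for e, (est, ex) in enumerate(zip(estimate, exact)):
        print(e, round(est, 2), round(ex, 2))
```

Working with ln Ω and ln f avoids overflow, since Ω itself grows very quickly as the walk proceeds.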

If the move is accepted, the density of states is updated by multiplying the current value by a modification factor $f > 1$, that is, $\Omega(\beta E_k) \to f\,\Omega(\beta E_k)$ (the initial values of $\Omega(\beta E)$ are set to unity). If the move is rejected, then $\Omega(\beta E_j)$ is multiplied by the factor $f$. A histogram of energies is also generated. The histogram is updated at each trial move until it becomes locally flat. At this point, the factor $f$ is reduced as $f_{\mathrm{new}} = (f_{\mathrm{old}})^{1/2}$, and a new cycle is started. The condition of detailed balance is satisfied when $f \approx 1$. The density of states is actually obtained from the average of various (and noninteracting) chain configurations.

V. Results and Discussion

From the density of states $\Omega(\beta E)$, and considering a fixed volume and a fixed number of particles, we can compute the average energy [19],

$$
\beta \bar{E}_{\mathrm{eff}} = \frac{\sum_{\beta E} \beta E\, \Omega(\beta E) \exp(-\beta E)}{\sum_{\beta E} \Omega(\beta E) \exp(-\beta E)} \tag{8}
$$

and Helmholtz's free energy,

$$
\beta A_{\mathrm{eff}} = -\ln\Big[\sum_{\beta E} \Omega(\beta E) \exp(-\beta E)\Big] \tag{9}
$$

and the entropy emerges from a combination of the two previous equations, that is, $S_{\mathrm{eff}}/k_B = \beta \bar{E}_{\mathrm{eff}} - \beta A_{\mathrm{eff}}$. We would like to emphasize that, in general, thermodynamic quantities derived from effective potentials differ from the experimental ones because, by definition, an effective interaction represents a partial description of the whole system. On the other hand, these quantities are useful to characterize the stability of the system.

Within the DPM, we simulated two small proteins which are combinations of alanines and glycines. The first polypeptide (denoted as I) has 28 residues, and it consists of the following linear sequence: (Ala)$_{12}$-(Gly)$_4$-(Ala)$_{12}$. The second polypeptide (identified as II) has 15 residues and is formed by three consecutive chains, each one made of five alternating residues, for example, Ala-Gly-Ala-Gly-Ala (this arrangement is typical of spider silk [12]). In Figure 3, we show some of the polypeptide shapes obtained as a result of our simulations. Although all of these structures can be found in regular proteins, only a comparison of their thermodynamic properties reveals additional details. Within our simulations, only the polypeptides denoted as I(b) and II(b) represent configurations of minimum free energy which, in principle, should be in their native state.

Figure 4. Density of states of two polypeptides with a different number of alanines and glycines and two distinct arrangements: (a) a single chain of sequence (Ala)$_{12}$-(Gly)$_4$-(Ala)$_{12}$ and (b) three interacting chains, each consisting of five alternating alanines and glycines. The inset of each figure shows the folded structures obtained in the corresponding sharp peaks.

In Figure 4a and b, we present the density of states of arrangements I and II, respectively. Both curves have clear similarities and some differences too. In both cases, the most remarkable feature is the sharp peak located at a low value of the energy. This peak is crucial to define the characteristics of the thermodynamic properties. After a gap of about $2k_BT$, a smooth and shorter peak develops with a distinct shape for each system. Let us mention that other folded structures have a density of states with the same general form [5]. For polypeptide I, using eq 9, we obtain $\beta A_{\mathrm{eff}} = -132.0$, which is the minimum free energy for this system. Moreover, by means of eq 8, the resulting average energy is $\beta \bar{E}_{\mathrm{eff}} = -91.2$ and the entropy is thus $S/k_B = 40.8$. It is interesting to notice that, in comparison with the entropy, the internal energy makes a more important contribution to the Helmholtz free energy. This result is in agreement with the general idea of a properly folded protein, in which the energetic contribution dominates over the entropic one [21]. We find, however, a non-negligible value of the entropy. We recall that Karplus et al. [22] first pointed out that the entropy of the native state cannot be neglected. Arrangement II is characterized by a rather compact structure (see Figure 3, polypeptide II(b)). It is identified as having three parallel turns. Its corresponding thermodynamic parameters are $\beta \bar{E}_{\mathrm{eff}} = -49.9$, $\beta A_{\mathrm{eff}} = -91.5$, and $S/k_B = 41.6$. It can be noticed that, despite the smaller number of residues, this system has a similar entropy to that of the first polypeptide in this study. This effect is related to the high number of possible configurations of this arrangement. Also important, the stability of these polypeptides is related to the average energy per residue, $\beta \bar{E}_{\mathrm{eff}}/N$. For configurations I(b) and II(b), we find an identical value of $\beta \bar{E}_{\mathrm{eff}}/N \approx -3.3$. Finally, in Figure 3, we show two computed configurations (I(a) and II(a)) that correspond to combinations of known secondary structures. For instance, polypeptide I(a) consists of two α-helices and one turn, while II(a) is a β-strand. Although these two configurations are taken into account in the global statistics, they are marginally stable structures.

VI. Conclusions

We presented a series of folded polypeptides resulting from an MC simulation. Our model has a reduced number of variables, and its main ingredient is the use of EPPs extracted from experimental correlation functions that were obtained from the crystallographic data of the PDB. In this process, we assumed that a protein behaves like a liquid of amino acids, thus neglecting the connectivity of the monomers. Despite the relative crudeness of this ansatz, the resulting potentials seem to capture the most relevant characteristics of the interaction between alanines and glycines. Then, we simulated two polypeptides formed with sequences of these two residues and we verified, using the density of states, that some of the studied structures are in thermodynamic equilibrium.
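The equilibrium check summarized here rests on eqs 8 and 9. As a minimal sketch of how those sums can be evaluated from a tabulated ln Ω(βE), the function below uses a log-sum-exp shift to avoid overflow. The function name and the binomial test table are assumptions made for illustration and are not data from the simulations.

```python
import math


def thermodynamics(beta_energies, ln_omega):
    """Return (beta*E_avg, beta*A, S/kB) following eqs 8 and 9."""
    # ln of each Boltzmann-weighted term, ln[Omega(bE) * exp(-bE)]
    log_weights = [lw - be for be, lw in zip(beta_energies, ln_omega)]
    shift = max(log_weights)  # log-sum-exp shift for numerical safety
    weights = [math.exp(w - shift) for w in log_weights]
    z = sum(weights)
    beta_e = sum(be * w for be, w in zip(beta_energies, weights)) / z  # eq 8
    beta_a = -(shift + math.log(z))                                    # eq 9
    return beta_e, beta_a, beta_e - beta_a


if __name__ == "__main__":
    # Hypothetical table: binomial density of states for four two-state units
    energies = [0.0, 1.0, 2.0, 3.0, 4.0]
    ln_omega = [math.log(math.comb(4, k)) for k in range(5)]
    print(thermodynamics(energies, ln_omega))
```

The values quoted in Section V satisfy the same identity: -91.2 - (-132.0) = 40.8 for polypeptide I and -49.9 - (-91.5) = 41.6 for polypeptide II.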
We also found a richer variety of structures in comparison with the one-component case [5]. Although the simulated structures presented here are not necessarily observed in nature, they show the capability of this method to reproduce realistic features of actual protein structures. A more general version of our model should include, for example, a variable separation between two consecutive residues. In other words, replacing the rigid segments with the original potential wells would allow us to study the flexibility of the polypeptides [23,24]. More importantly, we introduced the most general case of interaction between two types of amino acids. Here we studied the two smallest amino acids, with the previously discussed results. Of course, other combinations of residues have to be investigated to fully validate our model. Let us finally mention that, regardless of the relatively large number of EPPs required to describe an arbitrary protein, this approach could be an alternative way to take into account, indirectly, subtle effects such as dissolved ions or even water molecules. Despite recent advances, certain properties of water, such as the dielectric constant, still represent a challenge in numerical simulations [3,25].

VII. Appendix

Here, we describe a normalization procedure that provides bulk-like properties from systems of a finite size but large enough to extract a structural or thermodynamic property. For the case of residues belonging to a protein, let us first define a test sphere of volume $V = 4\pi r_{\max}^3/3$ that contains one or two types of amino acids of species $\gamma$ and $\mu$, with at least 50 residues of each type that are homogeneously distributed inside the test sphere. For a given distance $r$, the radial distribution function $g_{\gamma\mu}(r)$ is calculated through the following equation:

$$
g_{\gamma\mu}(r) = \frac{1}{\rho \chi_\gamma \chi_\mu} \left( \frac{h'(r)}{N 4\pi r^2\, dr - N' V_c(r)} \right)
$$

with $h'(r)$ being the total number of residues between two concentric spheres of radii $r$ and $r + dr$ about a central residue. If the central residue is found inside the sphere of radius $r_{\max} - r$, then $V_c(r) = 0$ (region I) [19]. For a central residue located outside region I but still inside the test sphere (region II),

$$
V_c(r) = \pi \left[ (r_{\max}^2 - r^2) \ln\left(1 - \frac{r}{r_{\max}}\right) + \frac{3}{2} r^2 + r\, r_{\max} \right] dr
$$

where $N'$ is the number of particles found in region II. The function $V_c(r)$ is thus a correction for the effect of finite size.

Acknowledgment. The authors want to thank P. González-Mozuelos and J. L. Arauz-Lara for helpful conversations. This work was supported by CONACyT under Grants SEP-2004-C01-47200 and SEP-2005-C01-49486.

References and Notes

(1) Protein Folding; Creighton, T. E., Ed.; Freeman: New York, 1992.
(2) Dinner, A. R.; Sali, A.; Smith, L. J.; Dobson, C. M.; Karplus, M. Trends Biochem. Sci. 2000, 25, 331.
(3) Jayachandran, G.; Vishal, V.; Pande, V. S. J. Chem. Phys. 2006, 124, 164902.
(4) Skolnick, J. Curr. Opin. Struct. Biol. 2006, 16, 166.
(5) Pliego-Pastrana, P.; Carbajal-Tinoco, M. D. J. Chem. Phys. 2005, 122, 244908.
(6) Peng, Y.; Hansmann, U. H. E. Biophys. J. 2002, 82, 3269.
(7) Sorin, E. J.; Rhee, Y. M.; Shirts, M. R.; Pande, V. S. J. Mol. Biol. 2006, 356, 248.
(8) Soto, P.; Baumketner, A.; Shea, J.-E. J. Chem. Phys. 2006, 124, 134908.
(9) González-Mozuelos, P.; Carbajal-Tinoco, M. D. J. Chem. Phys. 1998, 109, 11074.
(10) Pliego-Pastrana, P.; Carbajal-Tinoco, M. D. Phys. Rev. E 2003, 68, 011903.
(11) Hansen, J.-P.; McDonald, I. R. Theory of Simple Liquids; Academic Press: London, 1986.
(12) Voet, D.; Voet, J. G. Biochemistry, 2nd ed.; John Wiley & Sons: New York, 1995.
(13) van der Vaart, A.; Bursulaya, B. D.; Brooks, C. L., III; Merz, K. M., Jr. J. Phys. Chem. B 2000, 104, 9554.
(14) Here we present the Padé-like fits of the EPPs shown in Figure 2 (r in Å), i.e.,
βu^eff_11(r):
(71.8431603669 - 71.59981991319 r + 28.419434069516 r^2 - 5.61758994025896 r^3 + 0.5531567721102 r^4 - 2.1712961791641 × 10^-2 r^5)/(1 - 0.63234171043667 r + 0.13215223847368 r^2 - 9.144941349811213 × 10^-3 r^3) if 4.2 ≤ r < 5.8,
(-0.1914536502217 + 8.740623504811386 × 10^-2 r - 1.321669150589 × 10^-2 r^2 + 6.616714677139 × 10^-4 r^3)/(1 - 0.674629914068 r + 0.16866674915598 r^2 - 1.85470238494498 × 10^-2 r^3 + 7.574752411305 × 10^-4 r^4) if 5.8 ≤ r < 7.4,
(0.713186804978 - 0.16501840618396 r + 9.5215112347057 × 10^-3 r^2)/(1 - 0.24014633304067 r + 1.452057777843 × 10^-2 r^2) if r ≥ 7.4.
βu^eff_22(r):
(-6.06573746 × 10^-3 + 2.37301667 × 10^-3 r - 2.26749675 × 10^-4 r^2)/(1 - 0.977349758 r + 0.379464477 r^2 - 7.31724426 × 10^-2 r^3 + 7.00864196 × 10^-3 r^4 - 2.66836316 × 10^-4 r^5) if 4.2 ≤ r < 6.6,
(-2.25683823 × 10^-2 + 5.07180253 × 10^-3 r - 2.80858658 × 10^-4 r^2)/(1 - 0.396040678 r + 5.16635999 × 10^-2 r^2 - 2.22724862 × 10^-3 r^3) if r ≥ 6.6.
βu^eff_12(r):
(5.89917898 - 5.36887503 r + 1.94121683 r^2 - 0.348268420 r^3 + 3.09732873 × 10^-2 r^4 - 1.09130330 × 10^-3 r^5)/(1 - 0.947247624 r + 0.367051572 r^2 - 7.22406134 × 10^-2 r^3 + 7.15653598 × 10^-3 r^4 - 2.82341789 × 10^-4 r^5) if 4.2 ≤ r < 7.4,
(4.02329443 × 10^-3 - 8.16794927 × 10^-4 r + 4.10434659 × 10^-5 r^2)/(1 - 0.475204766 r + 8.42909217 × 10^-2 r^2 - 6.61069294 × 10^-3 r^3 + 1.93412285 × 10^-4 r^4) if r ≥ 7.4.
(15) Rathore, N.; de Pablo, J. J. J. Chem. Phys. 2002, 116, 7225.
(16) Metropolis, N.; Rosenbluth, A. W.; Rosenbluth, M. N.; Teller, A. H.; Teller, E. J. Chem. Phys. 1953, 21, 1087.
(17) Li, Z.; Scheraga, H. A. Proc. Natl. Acad. Sci. U.S.A. 1987, 84, 6611.
(18) Wang, F.; Landau, D. P. Phys. Rev. Lett. 2001, 86, 2050.
(19) McQuarrie, D. A. Statistical Mechanics; Harper and Row: New York, 1975.
(20) Kabsch, W.; Sander, C. Biopolymers 1983, 22, 2577.
(21) Onuchic, J. N.; Wolynes, P. G.; Luthey-Schulten, Z.; Socci, N. D. Proc. Natl. Acad. Sci. U.S.A. 1995, 92, 3626.
(22) Karplus, M.; Ichiye, T.; Pettitt, B. M. Biophys. J. 1987, 52, 1083.
(23) Rathore, N.; Yan, Q.; de Pablo, J. J. J. Chem. Phys. 2004, 120, 5781.
(24) Cieplak, M.; Hoang, T. X.; Robbins, M. O. Phys. Rev. E 2004, 69, 011912.
(25) Bagatella-Flores, N.; González-Mozuelos, P. J. Chem. Phys. 2003, 117, 6133.