
J. Phys. Chem. B 2009, 113, 14043–14046. Published on Web 10/02/2009. © 2009 American Chemical Society.

Tuning the Globular Assembly of Hydrophobic/Hydrophilic Heteropolymer Sequences
Henry S. Ashbaugh*
Department of Chemical and Biomolecular Engineering, Tulane University, New Orleans, Louisiana 70118
Received: July 31, 2009; Revised Manuscript Received: September 24, 2009

We propose a heteropolymer design scheme to tune monomer distributions that stabilize or destabilize the collapsed globular conformation relative to random sequencing. Polymer sequences trained via globular templating are mapped to a one-dimensional Ising-like model, and inverse Monte Carlo simulations are performed to determine an effective interaction between monomers that reproduces intrasequence correlations. Heteropolymer sequences generated using this effective interaction quantitatively reproduce the coil-to-globule transition with increasing polymer hydrophobicity observed for templated sequences. Through potential scaling, the range of transition hydrophobic fractions required to collapse the polymer opens up by a factor of 2, from a minimum fraction of 17% to a maximum of 32% for the longest polymers simulated. Collapsed conformations are favored by sequences in which there is intermediate segregation of hydrophobic and hydrophilic units along the backbone, while monomer integration favors coils.

Linear DNA sequences are continually transcribed into self-assembled three-dimensional protein structures in vivo. The reverse engineering of how sequence dictates structure is a preeminent research challenge.
Most successful theoretical efforts toward predicting folded structures from amino acid sequences have focused on knowledge-based interactions derived from native protein structure databases, rather than on physics-based force fields (ref 1). Molecularly detailed simulation models have more frequently been employed to tease apart water-mediated driving forces for assembly, such as hydrophobic interactions (ref 2), or to analyze the formation of characteristic protein structural elements, like alpha helices (ref 3) and beta hairpins (ref 4), though more ambitious, computationally intensive calculations have been advanced (refs 5, 6). While protein structure is determined by permutations of the 20 amino acids, the primary distinctions among amino acid side chains are thought to depend on their hydrophobic and hydrophilic characteristics (ref 7). This proposition is supported by de novo protein design experiments showing that the binary pattern of hydrophobic and hydrophilic amino acids plays a significant role in secondary (ref 8) and tertiary (ref 9) structure determination. Following this paradigm, binary and abbreviated monomer alphabets have been widely exploited in simulation studies of the thermodynamics and stability of folded lattice proteins (refs 10, 11) and random heteropolymer globules (refs 12, 13).

Khokhlov and Khalatur (KK) presented a physically motivated method to design hydrophobic/hydrophilic heteropolymers in which collapsed globular conformations are “programmed” into the monomer sequence, imparting protein-like stability to the globule (ref 14). The simulation steps of this design protocol, referred to here as KK templating, are as follows: (i) Perform simulations of a purely hydrophilic, swollen polymer coil. (ii) Sample snapshots from the coil simulations of step (i) and “paint” each monomer hydrophobic to collapse the coil into a globule, mimicking a temperature drop or change in solvent quality. (iii) Repaint the monomers furthest from the collapsed globule center of mass hydrophilic until the desired fraction of hydrophobic monomers and sequence is obtained. Genzer and co-workers (ref 15) have experimentally realized KK templating through the preferential bromination of polystyrene globule surfaces in a poor solvent. Alternatively, nonrandom monomer distributions have been templated onto polysoaps through polymerization of surfactant microstructures, imparting enhanced self-assembled properties (ref 16).

* E-mail: [email protected]. DOI: 10.1021/jp907398r.

A drawback of KK templating is that the sequence is either templated or random, with little control in between. Here, we describe a sequence design method in which templated polymer sequences are mapped to a one-dimensional Ising-like model and used to train a sequence generating interaction that reproduces correlations along the polymer backbone. Sequences generated by this effective interaction quantitatively reproduce the conformational behavior of the KK templated sequences. Moreover, when this sequence generating interaction is modified by a scaling factor, the critical hydrophobicity at which the polymers collapse into a globule can be continuously increased or decreased relative to random sequences. An alternate approach proposed by Khalatur and co-workers (refs 17, 18) evolves copolymer sequences under “evolutionary pressure” to optimize the melting temperature of a given sequence. Our design strategy described below can in principle be trained by any sequence design method (ref 19), not just KK templating, allowing us to explore connections between differing design strategies.

Molecular dynamics simulations of coarse-grained hydrophobic/hydrophilic heteropolymers at infinite dilution in an implicit solvent were performed. We briefly describe the model used here, while a detailed description is given in ref 13.
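As an illustration of step (iii) of the KK templating protocol, the following minimal Python sketch repaints one collapsed-globule snapshot. It is not the code used in this work: the function name `kk_repaint`, the coordinate format, the equal-mass center of mass, and the rounding of the hydrophobic count are all assumptions made for illustration.

```python
import math


def kk_repaint(coords, hydrophobic_fraction):
    """Return an 'H'/'P' string for one collapsed-globule snapshot.

    coords: list of (x, y, z) monomer positions (hypothetical input format).
    Monomers closest to the center of mass stay hydrophobic ('H'); the
    farthest ones are repainted hydrophilic ('P').
    """
    n = len(coords)
    n_hydrophobic = round(hydrophobic_fraction * n)  # assumed rounding rule
    # Center of mass, assuming all monomers have equal mass.
    com = [sum(c[k] for c in coords) / n for k in range(3)]
    distance = [
        math.sqrt(sum((c[k] - com[k]) ** 2 for k in range(3))) for c in coords
    ]
    # Monomer indices ordered from the globule core outward.
    order = sorted(range(n), key=lambda i: distance[i])
    sequence = ["P"] * n
    for i in order[:n_hydrophobic]:
        sequence[i] = "H"
    return "".join(sequence)
```

Keeping the innermost monomers hydrophobic is equivalent to repainting the outermost monomers hydrophilic until the target fraction is reached; ties in distance are broken here by monomer index, which is an arbitrary choice.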
Nonbonded monomer interactions were modeled as φ(r) = φ_rep(r) + λφ_att(r), where φ_rep(r) is the repulsive Weeks-Chandler-Andersen interaction (ref 20) and φ_att(r) is the remaining Lennard-Jones (LJ) attractive interaction. The parameter λ assumes the values 0, 1, and 2.5 for hydrophilic/hydrophilic, hydrophilic/hydrophobic, and hydrophobic/hydrophobic monomer interactions, respectively. Bonded interactions were modeled as φ_bond(r) = (800ε/σ^2)(r - σ)^2, where σ and ε are the LJ diameter and well depth, respectively. Simulations of a single polymer of 100, 200, or 300 monomers were conducted at a temperature of 2ε/k, where k is Boltzmann’s constant. The fraction of hydrophobic monomers 〈H〉, defined as the ratio of the number of hydrophobic monomers to the total number of monomers, was varied from 0 to 1. To average out sequence-specific effects on polymer properties, at least 50 different monomer sequences were simulated for each value of 〈H〉. Each sequence was equilibrated for 10^6 time steps (δt = 0.002(mσ^2/ε)^(1/2)) followed by 5 × 10^6 steps for average evaluation.

Simulation results for the radii of gyration of random and KK templated 100-mers as a function of 〈H〉 are presented in Figure 1. Generally, hydrophilic polymers (〈H〉 → 0) exhibit swollen coil conformations, which undergo a sigmoidal collapse transition to a globule with increasing 〈H〉. Smaller values of 〈H〉 are required to effect this coil-to-globule collapse for the KK templated sequences. To quantify the efficiency with which different sequence design strategies stabilize the globular state, the collapse transition hydrophobicity is obtained by fitting the simulation results to

$$\langle R_g^2 \rangle^{1/2} = (\alpha_{\mathrm{glob}} \langle H \rangle + \beta_{\mathrm{glob}})\, f(\Lambda^{-1}(\langle H \rangle - \langle H \rangle^{*})) + (\alpha_{\mathrm{coil}} \langle H \rangle + \beta_{\mathrm{coil}})\,[1 - f(\Lambda^{-1}(\langle H \rangle - \langle H \rangle^{*}))] \tag{1}$$

which we previously demonstrated provides an accurate description of the polymer collapse. In this expression, the radii of gyration toward either extreme of purely hydrophilic or hydrophobic polymers are presumed to be linear in 〈H〉 (Figure 1, inset), with slopes and intercepts given by α_coil/glob and β_coil/glob. The transition between these two regimes is captured by the Fermi function, f(x) = 1/[1 + exp(x)], where the transition zone width is Λ and the transition hydrophobicity is 〈H〉*. Fits of eq 1 to our simulation results are excellent (e.g., Figure 1). In the case of the 100-mer, 〈H〉* is 0.298(3) and 0.244(3) for the random and KK templated sequences, respectively. Thus, approximately 20% fewer hydrophobic monomers are required for the templated sequences to stabilize the globular conformation of the 100-mer.

Figure 1. Root mean square radii of gyration of 100-mer heteropolymers as a function of the fractional hydrophobicity. The green filled diamonds indicate simulation results for the purely random (γ = 0) polymers, while the red filled circles indicate simulation results for sequences generated following the KK templating scheme. The black open circles and blue filled triangles indicate simulation results for polymer sequences generated from the sequence potential with γ = 1 and -1, respectively. The blue, green, and red lines indicate fits of eq 1 to the simulation results for γ = -1, 0, and 1, respectively. The main figure focuses on the transition region up to 〈H〉 = 0.6, while the inset figure shows the radii of gyration for the random sequences over the entire range of 〈H〉. The transition hydrophobicities for the γ = -1, 0, 1, and KK templated sequences were determined to be 〈H〉* = 0.325(3), 0.298(3), 0.241(3), and 0.244(3), respectively.

To expand the window over which the globular state can be stabilized or destabilized relative to random sequencing, we consider the individual templated sequences as realizations of a one-dimensional Ising model with the hydrophobic and hydrophilic monomers assigned opposing spins. Examining the correlation of hydrophobic monomers along the polymer length and that between hydrophobic and hydrophilic monomers, we ask: What effective interactions between monomers reproduce the observed correlations? Figure 2 shows the normalized correlations of hydrophobic and hydrophilic monomers for the KK templated 100-mers.
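The two sequence correlations considered here can be estimated from an ensemble of templated sequences along the following lines. This is a hedged sketch only: the text states that the correlations are normalized so that random sequences give unity, but the specific normalizations below (division by 〈H〉 for the end distribution, and by the random unlike-pair probability at fixed composition for the cross correlation), as well as the name `sequence_correlations` and the 'H'/'P' string encoding, are assumptions.

```python
def sequence_correlations(sequences):
    """Estimate normalized sequence correlations from 'H'/'P' strings.

    All sequences are assumed to share the same length and composition.
    Returns (g_end, g_cross):
      g_end[i]   - probability of 'H' at position i divided by <H>
      g_cross[d] - probability that monomers separated by d are unlike,
                   divided by its value for random sequences
    """
    n = len(sequences[0])
    n_seq = len(sequences)
    n_h = sequences[0].count("H")
    if n_h == 0 or n_h == n:
        raise ValueError("need both hydrophobic and hydrophilic monomers")
    frac_h = n_h / n

    # Position-resolved hydrophobic distribution (unity for random sequences).
    g_end = [
        sum(s[i] == "H" for s in sequences) / (n_seq * frac_h) for i in range(n)
    ]

    # Probability that two distinct sites carry unlike monomers when the
    # n_h hydrophobic monomers are placed at random.
    p_random = 2.0 * n_h * (n - n_h) / (n * (n - 1))

    g_cross = {}
    for d in range(1, n):
        unlike = sum(s[i] != s[i + d] for s in sequences for i in range(n - d))
        g_cross[d] = unlike / (n_seq * (n - d) * p_random)
    return g_end, g_cross
```

With this normalization, averaging over every distinct arrangement of a fixed composition returns exactly one at all positions and separations, which is the random-sequence baseline against which templated sequences are compared.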

Figure 2. Sequence correlation functions characterizing the distribution of hydrophobic and hydrophilic monomers along the backbone of 100-mer heteropolymers. (a) Distribution of hydrophobic monomers at position i relative to either end of the polymer chain. This plot is symmetric about i = 50.5. (b) Cross hydrophobic/hydrophilic monomer correlation between monomers at positions i and j along the polymer backbone. The solid blue curves indicate correlations averaged over 2500 KK templated sequences at a sequence hydrophobicity of 〈H〉 = 0.24, the closest to the critical hydrophobicity 〈H〉* = 0.244(3) with an integer number of hydrophobic monomers. The error bars indicate one standard deviation in the correlations averaged over all 2500 sequences. The dashed red curves indicate one-dimensional spin sequence simulation averages with 〈H〉 = 0.24 using the 100-mer sequence potentials given in Figure 3. While hydrophobic/hydrophobic and hydrophilic/hydrophilic self-correlations can be defined, the information contained within those correlations is redundant and does not add to that given in a and b.

The correlations in Figure 2 were evaluated at 〈H〉 = 0.24, the hydrophobicity closest to 〈H〉* realizable with an integer number of hydrophobic monomers. These correlations were obtained by performing an additional 2500 templating simulations to improve the statistical significance. For random sequences, the correlations are unity. Hydrophobic monomers are depleted from either end of the sequence (Figure 2a), indicating the polymer ends partition toward the globule surface. The hydrophobic/hydrophilic cross correlation (Figure 2b) indicates these monomers are repulsive to one another and tend to segregate. The depletion observed as |i - j| → 99 is not a result of long-range interactions but of sequence end correlation effects (i.e., Figure 2a).

Effective monomer interactions that reproduce the target correlations (e.g., Figure 2) are obtained using inverse Monte Carlo (IMC) simulations. During IMC, a spin sequence at fixed 〈H〉 is simulated with trial interactions that are periodically updated according to (ref 21)

$$\Phi^{*}_{\mathrm{update}}(i) = \Phi^{*}_{\mathrm{trial}}(i) + f \ln\left[\frac{g_{\mathrm{trial}}(i)}{g_{\mathrm{target}}(i)}\right] \tag{2}$$

and repeated until convergence. Here, Φ*_x(i) is the dimensionless effective interaction, g_x(i) is the target correlation (e.g., Figure 2) or the correlation corresponding to the trial interaction, and f = 0.25 to ensure stable convergence. The interactions obtained were found to be insensitive to the choice of initial trial interactions. While these interactions in principle depend on 〈H〉, we only perform IMC at 〈H〉*. We argue that 〈H〉* provides a balance between the coil and globule states, noting that the sequence correlations vanish at the hydrophilic (〈H〉 → 0) and hydrophobic (〈H〉 → 1) extremes. While the target correlations employed here were derived by KK templating, the effective interactions can potentially be trained to reproduce sequence correlations of alternate globular sequence patterning methods.

Figure 3. Intrasequence generating potentials as determined by inverse Monte Carlo inversion of the sequence correlation functions as shown in Figure 2. (a) Hydrophobic monomer interaction with either end of the polymer backbone. (b) Cross hydrophobic-hydrophilic monomer interaction. In a and b, the blue solid, red long-dashed, and green short-dashed lines correspond to results for 100-, 200-, and 300-mers. (c) Sample sequences for a 100-mer with 〈H〉 = 0.5 at values of γ from -3 to 3. The light blue and red stripes indicate hydrophobic and hydrophilic monomers, respectively.

The converged sequence generating interactions obtained by IMC are shown in Figure 3. Contrary to standard Ising model implementations, the intermonomer interactions extend beyond nearest neighbors to approximately five monomers away. This interaction range is necessary to reproduce the sequence correlations obtained by KK templating (Figure 2). The interactions obtained for the different chain lengths are in near quantitative agreement with one another, with the best agreement between interactions derived for the 200- and 300-mer sequences. More importantly, when the interactions given in Figure 3 are used to generate sequences across the range of 〈H〉, the polymer radii of gyration and 〈H〉* values obtained are nearly indistinguishable from those obtained by templating (e.g., Figure 1). In addition to being faster than KK templating, an advantage of using the sequence generating interactions is that they can be continuously adjusted by multiplication by a scaling factor, γ, to tune the globular stability (i.e., Φ*_tune(i) = γΦ*(i)).
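The IMC update of eq 2 and the use of a scaled interaction to generate sequences at fixed 〈H〉 can be sketched together as follows. This is an illustrative sketch, not the implementation used in this work. The damping value 0.25 is taken from the text; everything else is an assumption: the function names, the tabulated potentials `phi_end` (indexed by distance to the nearest chain end) and `phi_cross` (indexed by separation, applied to unlike pairs), the form of the sequence energy, and the composition-conserving swap moves with Metropolis acceptance.

```python
import math
import random


def imc_update(phi_trial, g_trial, g_target, damping=0.25):
    """Apply eq 2 once at every separation; correlations must be positive."""
    return [
        phi + damping * math.log(gt / gg)
        for phi, gt, gg in zip(phi_trial, g_trial, g_target)
    ]


def sequence_energy(seq, phi_end, phi_cross, gamma):
    """Dimensionless energy of an 'H'/'P' sequence under scaled potentials."""
    n = len(seq)
    energy = 0.0
    for i, s in enumerate(seq):
        d_end = min(i, n - 1 - i)  # distance to the nearest chain end
        if s == "H" and d_end < len(phi_end):
            energy += phi_end[d_end]
        # Cross interaction acts between unlike monomers within range.
        for d in range(1, len(phi_cross) + 1):
            j = i + d
            if j < n and seq[j] != s:
                energy += phi_cross[d - 1]
    return gamma * energy


def generate_sequence(n, n_hydrophobic, phi_end, phi_cross, gamma,
                      n_steps=20000, seed=0):
    """Metropolis sampling with swap moves, which conserve <H>."""
    rng = random.Random(seed)
    seq = ["H"] * n_hydrophobic + ["P"] * (n - n_hydrophobic)
    rng.shuffle(seq)
    energy = sequence_energy(seq, phi_end, phi_cross, gamma)
    for _ in range(n_steps):
        i, j = rng.randrange(n), rng.randrange(n)
        if seq[i] == seq[j]:
            continue  # swapping like monomers changes nothing
        seq[i], seq[j] = seq[j], seq[i]
        new_energy = sequence_energy(seq, phi_end, phi_cross, gamma)
        delta = new_energy - energy
        if delta <= 0.0 or rng.random() < math.exp(-delta):
            energy = new_energy
        else:
            seq[i], seq[j] = seq[j], seq[i]  # reject: undo the swap
    return "".join(seq)
```

In a full IMC cycle, sequences generated with the trial potentials would be used to measure the trial correlations, which are then fed to `imc_update`; the loop would repeat until the trial and target correlations agree. Recomputing the full energy after each swap keeps the sketch short at the cost of efficiency.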

When γ is 0 or 1, the sequences obtained are random or templated, respectively. Below, we show that increasing γ above 1 can enhance globular stability, while decreasing γ below 0 destabilizes the globule. Sample 100-mer sequences generated at 〈H〉 = 0.50 for γ from -3 to 3 are shown in Figure 3c. Increasing γ tends to enhance the segregation of hydrophobic and hydrophilic monomers (ferromagnetic-like), while the monomers become more integrated with decreasing γ, resulting in repeating hydrophobic/hydrophilic sequences (antiferromagnetic-like).

The transition hydrophobicities as a function of γ determined from polymer simulations of the 100-, 200-, and 300-mer heteropolymers are plotted in Figure 4a. For 〈H〉 values below these curves, the polymers are coiled, while above, they are globular. The stabilization curves are qualitatively the same for each sequence length, while the 200- and 300-mer curves are in near quantitative agreement. Beginning from γ = 0, 〈H〉* decreases with increasing γ, stabilizing the globular state. Globule stabilization with increasing γ is not monotonic, however, but reaches a maximum at γ values of 2.5 for the 100-mer and 1–1.5 for the 200- and 300-mers. Indeed, it is impossible for 〈H〉* to drop to zero, since this corresponds to a chain with no hydrophobic units. For the 200- and 300-mers, the minimum in 〈H〉* lies close to γ = 1, so that globular stabilization is not significantly better than that obtained by KK templating, suggesting that templating provides a nearly optimal stabilization with the minimal fraction of hydrophobic monomers. The increase in 〈H〉* with further increases in γ results from the increasing segregation of the two domains, which swells the hydrophilic coils on the globule surface to form a tadpole-like conformation (ref 17) and thereby shifts 〈H〉*. Decreasing γ below zero increases 〈H〉*, destabilizing the globular state. The destabilization of the globule quickly plateaus for γ values a little less than -0.5. This plateau results from the diminishing effect of monomer integration. The net result of tuning γ is that the range of 〈H〉* opens up by a factor of 2, from a minimum of 0.17 to a maximum of 0.32 for the longest sequences studied.

Previously, we explored the relationship between charge and hydrophobicity on the conformational stability of random heteropolymer sequences, observing a nearly chain-length-independent correlation between 〈H〉* and the fractional cationic charge of the polymer, 〈q〉, with the globule destabilized by increasing charge (ref 13). The techniques developed here for generating nonrandom sequences provide an orthogonal route to charge/hydropathy for tuning polymer conformational stability. Plotting 〈H〉* against both 〈q〉 and γ lays out the skeleton of a larger globule stability surface for heteropolymer sequences (Figure 4b).

Figure 4. Coil-to-globule stability plots. (a) Transition hydrophobicities as a function of the sequence generating interaction scaling factor γ. The blue triangles, green diamonds, and red circles indicate results for 100-, 200-, and 300-mer heteropolymers, respectively. The error bars indicate one standard deviation. The lines are only guides for the eye. For sequences with hydrophobicities above these curves, the polymers are expected to be in a globular conformation, while below these curves they are expected to be in coiled conformations. (b) Transition hydrophobicities as a function of γ and fractional sequence charge; 〈q〉 is defined as the absolute value of the polymer charge, in units of e, divided by the total number of monomers. The solid blue lines correspond to the present work shown in a, while the dashed red lines correspond to simulation results for random heteropolyelectrolytes reported in ref 13. Hydrophobic fractions above these curves correspond to globular conformations, while hydrophobic fractions below are coil-like.

In summary, a new polymer sequence design strategy has been developed to stabilize globules by mapping sequences generated by KK templating to a one-dimensional Ising model. An effective sequence generating interaction is inferred from intrasequence correlations using inverse Monte Carlo simulations. Heteropolymers generated from this sequence generating interaction exhibit a coil-to-globule transition with increasing polymer hydrophobicity in quantitative agreement with that observed using KK templating. When the sequence generating interaction is tuned by a scaling factor, the range of 〈H〉* opens up by a factor of 2. Our observations agree with the protein folding paradigm that a protein's three-dimensional structure is encoded in its one-dimensional sequence. Ongoing theoretical studies mapping out the determinants of heteropolymer conformational stability should provide physical insights into the relationships between empirically derived predictors of natively unfolded protein sequences, like sequence complexity (ref 22) and charge/hydropathy (ref 23).

Acknowledgment. This work was supported by a National Science Foundation CAREER Award (Grant No. CBET0746955).

References and Notes
(1) Skolnick, J. Curr. Opin. Struct. Biol. 2006, 16, 166–171.
(2) Athawale, M. V.; Goel, G.; Ghosh, T.; Truskett, T. M.; Garde, S. Proc. Natl. Acad. Sci. U.S.A. 2007, 104, 733–738.
(3) Paschek, D.; Gnanakaran, S.; García, A. E. Proc. Natl. Acad. Sci. U.S.A. 2005, 102, 6765–6770.
(4) Zhou, R.; Berne, B. J.; Germain, R. Proc. Natl. Acad. Sci. U.S.A. 2001, 98, 14931–14936.
(5) Duan, Y.; Kollman, P. A. Science 1998, 282, 740–744.
(6) García, A. E.; Onuchic, J. N. Proc. Natl. Acad. Sci. U.S.A. 2003, 100, 13898–13903.
(7) Dill, K. A. Biochemistry 1990, 29, 7133–7155.
(8) West, M. W.; Hecht, M. H. Protein Sci. 1995, 4, 2032–2039.
(9) Kamtekar, S.; Schiffer, J. M.; Xiong, H.; Babik, J. M.; Hecht, M. H. Science 1993, 262, 1680–1685.
(10) Lau, K. F.; Dill, K. A. Macromolecules 1989, 22, 3986–3997.
(11) Sali, A.; Shakhnovich, E.; Karplus, M. Nature 1994, 369, 248–251.
(12) Dill, K. A. Biochemistry 1985, 24, 1501–1509.
(13) Ashbaugh, H. S.; Hatch, H. W. J. Am. Chem. Soc. 2008, 130, 9536–9542.
(14) Khokhlov, A. R.; Khalatur, P. G. Phys. Rev. Lett. 1999, 82, 3456–3459.
(15) Semler, J. S.; Jhon, Y. K.; Tonelli, A.; Beevers, M.; Krishnamoorti, R.; Genzer, J. Adv. Mater. 2007, 19, 2877–2883.
(16) Summers, M.; Eastoe, J. Adv. Colloid Interface Sci. 2003, 100–102, 137–152.
(17) Khalatur, P. G.; Novikov, V. V.; Khokhlov, A. R. Phys. Rev. E 2003, 67, 051901.
(18) Khalatur, P. G.; Khokhlov, A. R.; Krotova, M. K. Macromol. Symp. 2007, 252, 36–46.
(19) Khalatur, P. G.; Khokhlov, A. R. Adv. Polym. Sci. 2006, 195, 1–100.
(20) Chandler, D.; Weeks, J. D.; Andersen, H. C. Science 1983, 220, 787–794.
(21) Soper, A. K. Chem. Phys. 1996, 202, 295–306.
(22) Romero, P.; Obradovic, Z.; Li, X.; Garner, E. C.; Brown, C. J.; Dunker, A. K. Proteins: Struct., Funct., Genet. 2001, 42, 38–48.
(23) Uversky, V. N.; Gillespie, J. R.; Fink, A. L. Proteins: Struct., Funct., Genet. 2000, 41, 415–427.

JP907398R