
Oct 16, 2018 - ... the alphabets to 20 letters tends to suppress knots, a finding that points to a new hypothesis to explain the rarity of knots in pr...
Article Cite This: Macromolecules XXXX, XXX, XXX−XXX

pubs.acs.org/Macromolecules

Heteropolymer Design and Folding of Arbitrary Topologies Reveals an Unexpected Role of Alphabet Size on the Knot Population

Chiara Cardelli,†,§ Luca Tubiana,†,§ Valentino Bianco,†,§ Francesca Nerattini,†,§ Christoph Dellago,†,§ and Ivan Coluzza*,‡,§

† Faculty of Physics, University of Vienna, Boltzmanngasse 5, 1090 Vienna, Austria
‡ CIC biomaGUNE, Paseo Miramon 182, 20014 San Sebastian, Spain
§ IKERBASQUE, Basque Foundation for Science, Maria Diaz de Haro 3, 48013 Bilbao, Spain



Supporting Information

ABSTRACT: Obtaining complex topological micro- and nanomaterials in a controlled way is an open challenge for materials science and chemistry. Recent experimental and computational studies have demonstrated the feasibility of self-assembling knots with up to eight crossings starting from small, identical building blocks. In this work, we investigate computationally a different pathway for knot production. By performing extensive computer simulations of hetero patchy polymers with different patch geometries, we show that it is possible to obtain both torus and twist knots, up to knots with more than 12 crossings. Our results indicate that with patchy polymers it is possible to exploit the bending rigidity of the backbone, the specific geometry of the patches, and the alphabet size to control the knot spectrum of the polymer. In particular, we find that increasing the alphabet to 20 letters tends to suppress knots, a finding that points to a new hypothesis to explain the rarity of knots in proteins. Finally, we demonstrate the ability to fold specific knotted conformations with high precision by designing the heteropolymer sequence. These include both diffuse and highly localized knots as well as two topologies which have not yet been synthesized by self-assembly: the 5_2 and 10_124 knots.



INTRODUCTION

Knots have been used since ancient times to perform a variety of tasks: tying objects or animals, securing ropes, sewing, etc. Advancements in mathematics, physics, biology, and chemistry1 in the past half-century have made clear that knots are not only macroscopic objects but can also arise in polymers and molecules, in a variety of ways. These include spontaneous entanglement,2−5 spatial confinement,6−10 and the action of electric fields on DNA11,12 or of topoisomerases.13,14 In proteins, knots appear as specific folds.15−19 Knots have been reported even in clusters of dipolar spheres20 and in the disclination lines formed by colloidal dispersions in cholesteric liquid crystals.21−23 Molecular knots, much like their macroscopic counterparts, change the physical properties of the chains on which they are tied24−28 and can serve specific functions, for example, enhancing mechanical and thermodynamic stability in knotted proteins15,17,29−32 or polymers.33 There is, in fact, growing interest in synthesizing specific artificial knots34−36 which could be used, for example, as reusable catalysts for different chemical reactions due to their stability37 or to construct drug delivery carriers.38 In the past 30 years, topological synthetic chemistry has taken considerable steps toward producing knotted molecules using metal coordination and helically shaped ligands.39,40 Nonetheless, only four types of knots can be produced as of today: the trefoil,41 Savoy or figure-eight knot,42 pentafoil,43,44 and 8_19 knots.45 Computational investigations46−49 show that self-assembly of helical templates, mimicking the helical ligands used in topological synthetic chemistry, reproduces a mixture of different knots, including more complex ones than those mentioned above, although only in small traces. A restricted number of those self-assembled knots have been found to kinetically access complex topologies such as the 10_124 (ref 49; also observed in dipolar hard spheres, ref 20) and the 15n_41185 knot. However, to date, the synthesis of a specific knot with controllable topology and precise structure is still a challenging task. A possible strategy to progress in this direction is to imitate protein folding, employing heteropolymers. This can be achieved in principle by introducing so-called patchy polymers,33,50,51 chains composed of different monomers characterized by the presence of patches on their surface, which guarantee a directional interaction akin to hydrogen

Received: June 27, 2018 Revised: September 10, 2018


DOI: 10.1021/acs.macromol.8b01359 Macromolecules XXXX, XXX, XXX−XXX


bonding in proteins. Monomer pairs interact not only via this short-range directional interaction but also through an alphabet of different isotropic attractions/repulsions. The key parameters of the models are the alphabet size of the isotropic interactions q and the number of patches M (which is the same for all the monomers in a chain). Computational studies have shown that patchy polymers can fold precisely into a variety of three-dimensional structures which are dictated by the heteropolymer sequence along the chain.33,50,51 Furthermore, the presence of knotted folds has been reported in a previous study,33 where the authors show that a knotted structure externally locked by controlling the interaction between the end monomers is stabilized against thermal unfolding.

Here we consider two standard heteropolymer models where heterogeneous isotropically interacting monomers are bonded along the chain via a harmonic potential with two different anchoring geometries: (i) in the center bonded polymer (CBP) model the spring binds the centers of the monomers; (ii) in the patch bonded polymer (PBP) model the spring connects two anchoring points (two patches) on the monomer surface, opposite to each other (see Figure 1d,e). Possible experimental realizations of the PBP chains could be surface-grafted colloidal particles or covalently bonded chemical units. Constructing a CBP chain would instead need patchy particles where the patches are free to rotate with respect to the central bead. Examples of such particles are the solid colloids with surface-mobile DNA linkers,52 the lock and key colloids,53 or emulsion droplets with mobile DNA patches.54

We study the space of possible knots in chains of 50 monomers of both the PBP model with M = 1, 2 patches per bead and the CBP model with M = 1, 3, each with two alphabet sizes q = 3 and q = 20. We show that the knot spectrum is controlled by the number of patches and the alphabet size and presents a broad variety of knots of complexity up to 20 crossings. Interestingly, increasing the alphabet size tends to simplify the knot spectrum. We design the sequence to obtain precisely folded specific knot topologies, including the twist knot 5_2 and the complex pretzel knot 10_124, neither of which has been synthesized.

Figure 1. (a) Schematic representation of the interactions defining the patchy-polymer model. Monomers are self-avoiding spheres of diameter σ interacting both through (b) sigmoidal isotropic interactions, Ubb, whose strength and sign depend on the monomer types, and (c) directional interactions between the patches, Upp. The details of the model are reported in the Supporting Information. In (a), monomers interacting via Ubb with the green one are painted in blue, with shades according to the interaction strength; gray monomers do not interact with it. (d) The two different types of patch bonded polymer (PBP) considered in this study. PBP backbones are formed by connecting two of the available patches. This constrains the positions of the remaining ones, since the relative positions of the patches are always fixed. The PBP on top thus has M = 1 patch and the one below M = 2. (e) In contrast, center bonded polymers (CBP) are formed by connecting monomer centers, and their patches are free to rotate while keeping their relative positions fixed. Here we consider the case M = 1 (above) and M = 3 (below). (f, g) Examples of folded PBP and CBP configurations, respectively. Note that PBP backbones tend to be less flexible in order to satisfy their patches' interactions.



METHODS

Patchy Polymer Models. Patchy polymers are chains of self-avoiding spheres, each decorated with spots interacting through a directional potential (see Figure 1). Moreover, the spheres belong to different species that give them a heterogeneous spectrum of isotropic pair interactions. Each species corresponds to a different monomer of an alphabet of size q, giving a q × q symmetric matrix of possible values (see Figure 1b and ϵij in eq S2). The interaction matrices are generated according to a Gaussian distribution with null average and standard deviation equal to the ones typically used to model real amino acids.55−60 Within the Gaussian distribution, the choice for the interaction matrix is rather arbitrary as it does not change the qualitative behavior of the designability.33,50,51,61 We chose Gaussian interaction matrices with the same standard deviation δB as the one found for the amino acids derived with the quasi-chemical approximation by Miyazawa and Jernigan.55−60,62 The standard deviation of the matrix is connected to the glass transition temperature of heteropolymers, $T_g^2 = \delta B^2/(2s)$, where s is the conformational entropy per bond (eq 25 in ref 63). Hence, by keeping the same standard deviation, we restrict the dependence of the glass transition temperature to the polymer structural properties alone. In this work we consider two different alphabets with size q = 3 and q = 20; the interaction matrices are reported in the Supporting Information. We considered in particular two matrices with a balanced number of self-attractive and self-repulsive types, since these are the


most representative. In fact, with Gaussian distributed matrices, the probability of having only self-interactions of one kind is low ($1/2^q$). The chosen q sizes correspond to the smallest and largest alphabets that, from our previous work,51 allowed for design and folding. Except for the CBP with one patch, in all the other cases the patches are fixed on the vertices of a platonic solid on the surface of the bead, i.e., at distance σ/2 from the bead center. From now on we will indicate with M the number of free patches, excluding the anchoring points. In the PBP with M = 1 the patch is placed at 90° with respect to the two anchoring points (in black). In the PBP with M = 2 the two patches are placed on a tetrahedron completed by the anchoring points. In the CBP with M = 3 they are placed on the vertices of an equilateral triangle. For each case, we consider two alphabet sizes q = 3 and q = 20 (see the Methods section for the definition of the interactions). Although we have shown in a previous publication that it is possible to design small knots with q = 2,33 the designability of the HP alphabet is limited even for lattice proteins.57 In the CBP model, the patches can rotate rigidly on the bead. In the PBP model, only the free patches can rotate around the axis defined by the two patches involved in the bonding interaction between subsequent beads. To reduce the computational costs, we did not include in our study the CBP M = 2 scenario because we did not have a previous study of the designability of such a case.

The model used in this work was developed following a reductionist approach. A crucial condition that the model must satisfy is that it is designable; in other words, it must be possible to identify a sequence that folds into a given target structure. In our previous study51 we have shown that designability requires a minimum of one directional interaction per particle, implying that the most straightforward designable self-avoiding chain is a bead−spring model of hard spheres decorated by at least one directional interaction. This feature of our model is an innovative element compared to the studies performed in the past on the knot spectrum of off-lattice polymers.24−28

Monte Carlo Simulations. We employ the methodology of the Monte Carlo simulations SEEK, DESIGN, and FOLD (SDF), which was proven to be able to discriminate between designable and nondesignable structures.50,51 Briefly, the SDF method (i) performs an extensive sampling of the heteropolymer conformations and sequences to choose a target structure, (ii) designs the sequence that should optimally fold into the target structure, and (iii) tests whether the designed sequence correctly folds into the target structure. The sampling capability of each of these steps is enhanced by using the Virtual Move Parallel Tempering (VMPT) with multiple replicas at different temperatures, coupled to a bias potential to escape local free energy minima.64 The bias potential helps the simulations to correctly sample the whole space and is then removed. For each patch geometry, we perform 10 independent SEEK simulations, run in subsequent simulation blocks of 10^9 MC steps. We repeat the iterations until we observe that the conformational free energy landscape does not show an appreciable difference from the previous run and among the 10 independent simulations.

In a DESIGN simulation, we consider a fixed target structure and explore its sequence space through point mutations and monomer type swaps, with the aim of finding an optimal sequence which minimizes the free energy while maintaining a high heterogeneity (see the Supporting Information in ref 33). This exploration is performed using a VMPT scheme with five parallel simulations, each composed of 16 replicas running at temperatures [0.1, 0.3, 0.5, 0.7, 0.9, 1.2, 1.5, 2, 2.5, 3, 4, 6, 7, 8, 9, 10]. All simulations are performed with a bias potential on the energy of a given sequence and its heterogeneity, defined as in ref 33.

FOLD simulations are complementary to DESIGN ones. In this case, we fix the polymer sequence and explore the conformational space of the chain through a combination of crankshaft, pivot, and monomer displacement moves. We consider 10 parallel simulations, each running on 32 different replicas at temperatures 3.0, 2.5, 2.2, 2.0, 1.9, 1.7, 1.6, 1.55, 1.5, 1.45, 1.4, 1.35, 1.3, 1.2, 1.15, 1.1, 1.05, 1.025, 1.0, 0.99, 0.975, 0.95, 0.925, 0.9, 0.875, 0.85, 0.825, 0.8, 0.75, 0.7, 0.65, and 0.6. The free energy (see Figure S5 in the Supporting Information) and bias potential are both projected on the total number of contacts between the patches P (distance below 0.625σ and angles θli and θkj > 0.8π) and the DRMSD from the target structure, defined as

$$\mathrm{DRMSD} = \frac{1}{N}\sqrt{\sum_{ij}\left(R_{ij} - R_{ij}^{T}\right)^{2}} \qquad (1)$$
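The DRMSD of eq 1 can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names `pairwise_distances` and `drmsd` are hypothetical, and the normalization (a prefactor 1/N outside the square root, with the sum running over all ordered bead pairs) is an assumption about how eq 1 should be read.

```python
import numpy as np

def pairwise_distances(coords):
    """Return the N x N matrix of bead-bead distances R_ij."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.linalg.norm(diff, axis=-1)

def drmsd(coords, target):
    """Distance RMSD between a conformation and a target structure.

    Assumed reading of eq 1: (1/N) * sqrt(sum_ij (R_ij - R_ij^T)^2),
    with the sum over all ordered pairs (i, j).
    """
    coords = np.asarray(coords, dtype=float)
    target = np.asarray(target, dtype=float)
    if coords.shape != target.shape:
        raise ValueError("conformation and target must have the same shape")
    delta = pairwise_distances(coords) - pairwise_distances(target)
    return np.sqrt(np.sum(delta ** 2)) / len(coords)
```

Because only internal distances enter, the measure is invariant under rigid translations and rotations of either structure, so no superposition step is needed before comparing a sampled conformation with the target.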

Here, $R_{ij}$ is again the distance between beads i and j, while $R_{ij}^{T}$ is the same distance calculated over the target structure. For an example of a FOLD free energy landscape see Figure S5.

Finally, SEEK simulations combine the exploration of both conformational and sequence space at the same time. For each polymer model, we perform 10 independent simulations, each running a VMPT scheme with the same 32 temperatures used for FOLD. From SEEK, we obtain free energy profiles mapped on two different pairs of global variables (see Figures S6 and S7). The first pair is given by the total number Q of contacts between the spheres, calculated as the total isotropic energy ∑i,jUbb(Rij) with all ϵij = 1, and the total number P of contacts between the patches. For this pair of global variables we save a histogram of 850 Q by 100 P bins, and for each bin we store the minimum-energy conformation of each replica. The second pair is the average crossing number (ACN) and the end-to-end distance (Ree) of the polymer, for which we consider a histogram of size 100 by 200 bins. The average crossing number is defined as

$$\mathrm{ACN} = \frac{1}{2\pi}\sum_{i=1}^{N-1}\sum_{j=1}^{N-1}\frac{(\mathbf{r}_{i+1}-\mathbf{r}_{i})\times(\mathbf{r}_{j+1}-\mathbf{r}_{j})\cdot(\mathbf{r}_{i}-\mathbf{r}_{j})}{|\mathbf{r}_{i}-\mathbf{r}_{j}|^{3}} \qquad (2)$$
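As a rough illustration of eq 2, the double sum can be evaluated directly on the bead coordinates. The sketch below is not the authors' code: the function name `average_crossing_number` is hypothetical, and two choices are assumptions made here, namely taking the absolute value of the triple product (the usual convention for a crossing number, which keeps every term nonnegative) and skipping the i = j terms, for which the denominator vanishes. The 1/(2π) prefactor and the summation limits follow eq 2.

```python
import numpy as np

def average_crossing_number(coords):
    """Discrete estimate of the ACN of an open chain of N beads.

    Bond vectors r_{i+1} - r_i stand in for the backbone tangents.
    Assumptions: absolute value of the triple product, i == j skipped.
    """
    r = np.asarray(coords, dtype=float)
    tangents = r[1:] - r[:-1]   # N - 1 bond vectors
    origins = r[:-1]            # positions r_i for i = 1 .. N - 1
    total = 0.0
    for i in range(len(tangents)):
        for j in range(len(tangents)):
            if i == j:
                continue
            sep = origins[i] - origins[j]
            dist = np.linalg.norm(sep)
            triple = np.dot(np.cross(tangents[i], tangents[j]), sep)
            total += abs(triple) / dist ** 3
    return total / (2.0 * np.pi)
```

A planar chain gives exactly zero, since every triple product vanishes, while any chain whose bonds leave a common plane gives a positive value. The quadratic loop is adequate for chains of 50 beads; longer chains would benefit from a vectorized version.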

Here, $\mathbf{r}_{i+1} - \mathbf{r}_{i}$ approximates the vector tangent to the polymer backbone at the position of monomer i. The ACN is a continuous variable, as needed for the bias potential, that corresponds to the average number of intersections of the “polymer shadow”, meaning the average of the number of intersections over all projections of the polymer structure onto all possible planes, i.e., the planes orthogonal to all the directions passing through the center of the polymer. The ACN correlates with the number of crossings of the knot, with higher values of ACN corresponding to more complex knots.65 SEEK simulations are performed with a bias potential W(ACN,P). SEEK simulations are intended to provide possible targets for DESIGN and FOLD simulations. These target conformations are given by the minimum-energy conformations obtained for each bin in the histograms. We perform 10 independent simulations, each with 32 replicas that exchange temperatures (not conformations); thus, the conformation with minimum energy will be a low-temperature conformation for each replica. Hence, we have up to 320 independent conformations at low temperature per bin. Through SEEK we can obtain the ensemble averages of several observables. To do so, for a given variable a, we have to weight its value for each conformation by the unbiased probability P of said conformation within its bin in the histogram (ACN, Ree):

$$P_{k} = \frac{e^{-\beta E_{k}}}{\sum_{i=1}^{n_{b}} e^{-\beta E_{i}}} \qquad (3)$$

where $n_b$ is the number of conformations for each bin b (up to 320). Thus, for a variable a, its ensemble average is

$$\langle a \rangle = \frac{\sum_{b=1}^{n}\left(\sum_{i=1}^{n_{b}} a_{i}P_{i}\right)e^{-\beta F_{b}(\mathrm{ACN},R_{ee})}}{\sum_{b=1}^{n} e^{-\beta F_{b}(\mathrm{ACN},R_{ee})}} \qquad (4)$$

where n is the total number of bins and $F_b(\mathrm{ACN}, R_{ee})$ is the unbiased canonical free energy of the bin b.

Knot Analysis. To assign a topological state to the conformations of linear patchy polymers, we must first circularize them. We do so by using the Minimally Interfering Closure.66 The topology of each circularized chain is then identified by calculating the Dowker code of its projection and comparing it against a table including all knots up to 16 crossings. To do so, we used the KNOTFIND routine included in KNOTSCAPE.67 To calculate the probability of a given knot topology, for instance the 3_1, we perform a SEEK ensemble average as described in eqs 3 and 4, taking the variable a to be

$$a_{i} = \delta_{\tau_{i},3_{1}} \equiv \begin{cases} 1 & \text{if topology is } 3_{1} \\ 0 & \text{otherwise} \end{cases} \qquad (5)$$

where the Kronecker delta is evaluated over the topology $\tau_i$ of conformation i.

RESULTS

The two models of patchy polymers considered here, patch bonded polymer (PBP) and center bonded polymer (CBP), idealize several possible experimental realizations of patchy systems, at a scale ranging from molecular to colloidal polymers. The demonstrated ability of patchy polymers to fold into designed three-dimensional structures51 opens the possibility of using them to experimentally obtain a variety of different knots in a reliable way. We consider different possible realizations of CBP and PBP. In particular, we study CBP with M = 1 and M = 3 patches per bead and PBP with M = 1 and M = 2. For both PBP and CBP we consider one alphabet of size q = 3 and one of size q = 20, giving us four different cases per model.

To characterize what knots can be obtained with each system, we start by analyzing the knot spectrum of PBP and CBP polymers with N = 50 beads, correlating the probability to obtain each knot with two different geometrical characteristics of the polymer conformations: the average crossing number (ACN) and the end-to-end distance (Ree). To do so, we perform a series of SEEK simulations (see the Methods section) sampling both the conformational and sequence space at the same time. For each simulation, we project the free energy on the ACN × Ree plane, storing its values in a two-dimensional histogram. To each bin, we associate a population of up to 320 low-energy conformations, each of which is sampled independently by one of the parallel replicas in the simulation. This strategy allows us to identify the most probable knot type for each pair of ACN and Ree values, guiding the design of knots with controlled geometrical properties. In Figure 2 we show the results of this analysis for the PBP model with M = 1, since it presents the most varied knot spectrum. The corresponding results for CBP are qualitatively equivalent and are reported in Figures S1−S4.

Figure 2. Topological “phase diagram” of the most probable knot type for each pair of ACN and Ree values for the PBP model with 1 patch and with q = 3 and q = 20.

Here and in the following, we denote knot types using the Alexander−Briggs notation,68 commonly used for knots with up to 10 crossings. In this notation, each knot type is indicated in the form X_Y, where X stands for the number of crossings and Y distinguishes the knot from others with the same number of crossings. The knot spectra in Figure 2 show that knot complexity correlates with the ACN, both for alphabet 3 and alphabet 20, leading to the formation of several knot domains. In both cases, the dominant knot for a certain value of ACN has a comparable number of essential crossings, so trefoil knots dominate the spectrum for ACN ≃ 3, 5_1 knots for ACN ≃ 5, etc. Nonetheless, some knots show a great spread in ACN, indicating a vaster set of accessible conformations corresponding to them, while others are confined to a narrow set of ACN values. It is important to notice that since Figure 2 only reports the most probable topology for each bin, the actual spread of each knot is greater than the reported domains. Interestingly, there is also a vertical spread, with some knots, like the 5_2 for PBP M = 1, q = 3, being more likely found when the two ends of the chain are close to each other. Starting from the knot spectrum of each bin of the ACN × Ree histogram, we obtain the total knot spectrum of each model by weighting the probability of each knot in a bin by the canonical probability of the bin, as explained in detail in the

Methods section. The knot probability histograms so obtained are reported in Figures 3a and 3b for the PBP and CBP models,

Figure 3. Normalized total probabilities of each knot type (a) for the PBP model with M = 1, 2 patches and q = 3, 20 and (b) for the CBP model with M = 1, 3 patches and q = 3, 20. The error bars represent ± one standard deviation, calculated by selecting 10 different random sets of half of the conformations in each bin.

Figure 4. Normalized distribution of the patch contacts for PBP as a function of their chain separation s in monomer units.

Figure 5. Normalized distribution of the patch contacts for CBP as a function of their chain separation s in monomer units.
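The knot probabilities and error bars shown in Figure 3 follow from the weighting scheme of eqs 3−5 in the Methods section. The sketch below illustrates one way this could be coded; it is not the authors' analysis script. All function names, the data layout (each bin as a pair of arrays holding conformation energies and knot labels), and the random seed are hypothetical choices made for illustration.

```python
import numpy as np

def bin_knot_probability(energies, knots, knot, beta):
    """Boltzmann-weighted fraction of one knot type inside one bin (eqs 3 and 5)."""
    e = np.asarray(energies, dtype=float)
    w = np.exp(-beta * (e - e.min()))   # shifting by e.min() leaves eq 3 unchanged
    w /= w.sum()
    indicator = np.asarray(knots) == knot
    return float(np.sum(w * indicator))

def total_knot_probability(bins, free_energies, knot, beta):
    """Combine per-bin probabilities with the bin weights exp(-beta * F_b) (eq 4)."""
    f = np.asarray(free_energies, dtype=float)
    wb = np.exp(-beta * (f - f.min()))
    wb /= wb.sum()
    pb = np.array([bin_knot_probability(e, k, knot, beta) for e, k in bins])
    return float(np.sum(wb * pb))

def half_sample_error(bins, free_energies, knot, beta, n_sets=10, seed=0):
    """Standard deviation over random half-subsets of each bin (Figure 3 caption)."""
    rng = np.random.default_rng(seed)
    estimates = []
    for _ in range(n_sets):
        subsets = []
        for e, k in bins:
            e, k = np.asarray(e, dtype=float), np.asarray(k)
            idx = rng.choice(len(e), size=max(1, len(e) // 2), replace=False)
            subsets.append((e[idx], k[idx]))
        estimates.append(total_knot_probability(subsets, free_energies, knot, beta))
    return float(np.std(estimates))
```

For example, with two equally weighted bins, one holding a trefoil and an unknot at equal energy and the other holding a single trefoil, the trefoil probability is 0.5 × 0.5 + 0.5 × 1 = 0.75.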

respectively. These knot spectra allow us to characterize the effect of the backbone (PBP vs CBP), number of patches, and alphabet size on the probability of observing different knots. First of all, we note that in all the cases considered here torus knots are the most abundant. In particular, 3_1, 5_1, 8_19, and 10_124 are the dominating topologies. Interestingly, the same topologies dominate the spectrum of knots which can be formed by self-assembly of helicates40,43,45−48 and by low-temperature phases of bead-and-stick attractive homopolymers.69 However, patchy polymers can be used to tie twist knots as well, like the 5_2 knot that is present in all cases in traces ( 35. Another observation supporting our results comes from a previous study by Wüst et al.,74 in which the authors showed that the sequence of a two-letter HP model could be used to design lattice heteropolymers to adopt either prevalently


knotted or prevalently unknotted conformations. In particular, they found that sequences containing long contiguous stretches of monomers of the same type folded into highly knotted structures, since they forced the chain to form large loops. On the other hand, sequences which contained only short repeating stretches were almost always unknotted.74 An interesting conclusion of the work of Wüst et al. was that the low number of knotted natural protein structures could be produced by an active evolutionary pressure over the sequence of the protein to avoid long repetitions that favor knots. From our results, instead, we can draw a different conclusion: the increase in the alphabet size alone spontaneously drives the system to reduce the complexity of the knots or eliminate them altogether. Hence, the 20 amino acid alphabet of proteins has the additional property of reducing the spontaneous occurrence of knots without losing the possibility of optimizing complex topologies, a hypothesis that has not been proposed before. We notice here that while several knotted protein folds have been reported, no knots have been found in RNAs uploaded to the PDB.75 This might at first seem to be at odds with our claim, since RNA has a four-letter alphabet. We note, though, that not only might the reason for the absence of knots in natural RNAs be evolutionary, but several studies have also pointed out that RNA tends to fold hierarchically,76,77 forming very stable local branches before long-range contacts can be established. As a result, RNA folds in the PDB appear not to be fully equilibrated.78,79

Finally, the difference between PBP, where enlarging the alphabet favors the trefoil knot, and CBP, where it favors trivial knots, likely originates from the bending rigidity of the polymer. In fact, recent studies on the knotting probability of homopolymers have shown that while completely flexible chains favor unknotted conformations, a slight increment of the persistence length strongly enhances the probability of obtaining simple knots like the trefoil.69,80,81 To confirm that this is the case also in our setup, we computed the persistence length of pure self-avoiding center bonded polymers and patch bonded polymers, in the absence of any attractions between the beads. The results in Figure 7 show that while the CBP model is completely flexible, the PBP has a small persistence length of about three monomers. This also further explains the negligible probability of having patch−patch contacts at low values of s for PBP M = 1. Overall, patchy polymers combine the advantages of directional interactions to shape the conformational space toward regions rich in complex knots, with the control offered by heteropolymers on the specific target structure.

Figure 7. Bond−bond correlation function G(s) for pure self-avoiding versions of the PBP and CBP models, in which all interactions aside from backbone bonds and steric repulsion were turned off. s corresponds to the distance in bond units. Continuous lines correspond to exponential fits exp(−s/lp), from which we obtain the persistence length lp.

Designing and Folding Specific Knots. Having analyzed the most abundant knot types in each model and the effect of the number of patches and alphabet sizes on their probability, we now move on to demonstrate that different topologies can be designed and folded reliably both in PBP and CBP. Furthermore, we demonstrate that this applies even to rare knots such as the 5_2 knot in the PBP model. We further note here that the simplicity of the knot spectra for q = 20 (Figure 3) does not mean that rare or complex knots, like those obtained for q = 3, cannot be successfully designed and folded. In fact, not only would it be sufficient to use a limited set of letters to obtain the knots included in the q = 3 spectrum, but there was also no difference in sequence energy among all the different topologies we consider in this section. This suggests that both rare and common knots are equally designable for q = 20. To demonstrate the designability of different knots, we select a set of representative topologies from our various models. These are 3_1, 5_1, and 5_2 for PBP M = 1; 5_1 for PBP M = 2; and 3_1, 8_19, and 10_124 for CBP M = 3. Whenever possible, we design conformations in which the ends are close to each other, so that they can be externally connected in a second step to trap the knot.

Our design and folding scheme works as follows. For a given topology, we identify the corresponding global free energy minimum in the ACN, Ree projection. This is given by the structure with the highest number of sequences that fold into it, making it the most designable one.83 This solution is not necessarily unique, as other structures further from the global minimum might still provide designable candidates. Among the conformations in the vicinity of the free energy minimum in the ACN, Ree plane, we identify the one with the lowest free energy that also has values of P and Q in the vicinity of the free energy minimum of the corresponding P, Q SEEK spectrum. Following these criteria, we choose the best conformation for each knot type and use it as input for a DESIGN50 simulation (see the Methods section), in which we explore the space of sequences while keeping the structure frozen. DESIGN simulations provide us with an optimal chain sequence for a given target structure. To test the validity of the designed sequence, we verify with the FOLD simulation that the target conformation is the global energy minimum in the conformational space of the polymer with the fixed designed sequence. If the global free energy minimum is close to the target structure the polymer is correctly folded, following the criterion in ref 51 (see Figure S5). Following this scheme, we can design and fold all target topologies with a remarkable precision, in the range of fractions of the monomer size, comparable to the folding precision that proteins can reach. In Figure 8 we show the folded structure in comparison with the target structure, together with the root-mean-square displacement from the target structure (RMSD), often used to categorize protein folds, defined as

$$\mathrm{RMSD} = \sqrt{\frac{1}{N}\sum_{i}\delta_{i}^{2}} \qquad (6)$$
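Eq 6 can be illustrated with a short NumPy sketch. The text does not state how the folded and target structures are superimposed before the bead-to-bead distances are measured, so the optimal rigid-body alignment used below (the Kabsch algorithm) is an assumption, and the function names `kabsch_align` and `rmsd` are hypothetical.

```python
import numpy as np

def kabsch_align(mobile, target):
    """Rigidly superimpose `mobile` onto `target` (assumed preprocessing step)."""
    mobile = np.asarray(mobile, dtype=float)
    target = np.asarray(target, dtype=float)
    mobile_c = mobile - mobile.mean(axis=0)
    target_c = target - target.mean(axis=0)
    # SVD of the covariance matrix gives the optimal rotation.
    u, _, vt = np.linalg.svd(mobile_c.T @ target_c)
    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0.0 else -1.0
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    return mobile_c @ rotation.T + target.mean(axis=0)

def rmsd(folded, target, align=True):
    """Root-mean-square displacement of eq 6, sqrt((1/N) * sum_i delta_i^2)."""
    folded = np.asarray(folded, dtype=float)
    target = np.asarray(target, dtype=float)
    if align:
        folded = kabsch_align(folded, target)
    delta = np.linalg.norm(folded - target, axis=1)
    return float(np.sqrt(np.mean(delta ** 2)))
```

With `align=True`, a folded structure that differs from the target only by a rotation and a translation gives an RMSD of zero; with `align=False`, the raw coordinates are compared as they are.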


valuable to produce knotted supramolecular constructs. In this perspective, we characterized the knot spectrum of both chain models for different patch numbers and two different alphabet sizes of the heterogeneous interactions. We show that two elements contribute to the suppression of complex knots: increasing the number of patches and increasing the size of the alphabet. The contribution of the alphabet is the more interesting one. It arises because simple knots have larger conformational basins, which remain inaccessible with an alphabet of only three letters. Furthermore, such a reduced alphabet produces long repetitions in the sequences, like AAAABBBBCCCCAAAAA, while with 20 letters the sequences alternate more, of the sort AJIOUIOTNRNCPJISJNC. These repetitions, combined with an alphabet with equally probable attractive and repulsive interactions, promote the formation of loops and thus knots. On the other hand, a large alphabet with q = 20 can access most if not all of the conformations compatible with each topology, thus giving a knot spectrum similar to that of the underlying polymer backbone.

Our results differ significantly from those of Wüst et al.74 because we show that the final knot spectrum is a result of the conformational and sequence entropy and does not require an evolutionary pressure to remove the complex topologies from the spectrum. The latter finding has intriguing implications in the field of knotted proteins, where the low percentage of knotted structures recorded so far has been taken to suggest explicit selection pressure.1,29,32,86−89 Here, instead, we propose that the alphabet of 20 letters alone is a strong deterrent to the formation of knots. Finally, we took several knots, both torus and twist, some of which have never been experimentally tied, like the 5₂ and 10₁₂₄, and we designed them by optimizing the sequence of heterogeneous interactions. As we demonstrate in the Results section, the designed sequences were able to fold back into the target knotted structure, suggesting that our approach might indeed open the pathway to the experimental realization of complex supramolecular knots. We can also select and fold structures with close ends, which could be easily locked externally to guarantee the thermal stability of the structure.33 Moreover, the structures fold back with a precision of fractions of the bead size, comparable to what proteins can achieve and necessary to open possible applications as catalysts or drug delivery carriers.
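The link between alphabet size and long repetitions can be illustrated with a small numerical sketch. It measures the longest stretch of identical consecutive letters in a sequence and compares alphabets of 3 and 20 letters. The sequences sampled here are uniformly random, which is an assumption made only for illustration: the sequences discussed above come from the design procedure, not from random sampling. The function names, the chain length of 50, and the sample count are hypothetical choices.

```python
import random
from itertools import groupby


def longest_run(sequence):
    """Return the length of the longest stretch of identical consecutive letters."""
    return max((sum(1 for _ in group) for _, group in groupby(sequence)), default=0)


def mean_longest_run(q, length, n_samples=1000, seed=0):
    """Average longest run over uniformly random sequences drawn from q letters.

    Assumption: letters are independent and equally probable, which is NOT
    how designed sequences are generated; this is only a baseline.
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(n_samples):
        sequence = [rng.randrange(q) for _ in range(length)]
        total += longest_run(sequence)
    return total / n_samples


if __name__ == "__main__":
    # The two example sequences quoted in the text.
    print(longest_run("AAAABBBBCCCCAAAAA"))
    print(longest_run("AJIOUIOTNRNCPJISJNC"))
    # Random baseline for a small and a large alphabet (length 50 is arbitrary).
    for q in (3, 20):
        print(q, mean_longest_run(q, length=50))
```

Even in this random baseline, the three-letter alphabet gives noticeably longer homopolymeric stretches than the 20-letter one, in line with the qualitative argument above.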

Figure 8. Designed knotted conformations. Each pair of structures shows the target (left) and folded (right) conformations. The part of the polymer involved in the knot is shown in orange, the end monomers in red, and the rest in green (target) or cyan (folded). For each pair, the RMSD (root-mean-square displacement between the folded and the target structures) is reported. The knotted portions were identified using Kymoknot.82 Diagrammatic representations of all designed knots are shown on the right. The 5₂ knot (colored in red) is a twist knot, while the others are torus knots.

Here, δᵢ is the distance between bead i of the folded structure and the corresponding bead of the target structure. We note in particular that we can fold knots with up to five crossings in conformations with close ends, which could be easily locked externally. A similar result should be achievable for more complex knots like the 8₁₉ and 10₁₂₄ by simply using longer chains. Finally, our procedure demonstrates that the PBP model can be used to design and fold highly localized knots, as shown by the 3₁ knot in Figure 8, which takes up only 14 beads, less than one-third of the chain length. It is important to stress that we considered knots for which we could identify sequences that at equilibrium prefer the knotted structure. Starting from our result, it would be interesting to study the kinetic aspects of the folding process. In fact, our design scheme can be extended to reproduce the one proposed by Fink and Ball,84 which optimizes not only the target configurations but also the intermediate states needed to accelerate the folding kinetics. We expect that kinetic accessibility will further restrict the knot spectrum in favor of simpler knots, although it might still be possible to design and fold more complex topologies. We will address this point in a future study based on a recent implementation of our patchy-polymer models in ESPResSo.85
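For reference, the RMSD values reported in Figure 8 can be evaluated from the per-bead distances δᵢ with a few lines of code. The sketch below assumes the standard root-mean-square form, RMSD = sqrt((1/N) Σᵢ δᵢ²), and assumes that the folded and target structures are already superimposed; any alignment step is outside its scope. The function name and the coordinate format (one (x, y, z) tuple per bead) are hypothetical.

```python
import math


def rmsd(folded, target):
    """Root-mean-square of the per-bead distances between two conformations.

    Assumptions: both inputs list bead coordinates in the same order, and
    the two structures have already been superimposed.
    """
    if len(folded) != len(target) or not folded:
        raise ValueError("conformations must be non-empty and of equal length")
    # Squared distance delta_i**2 for each pair of corresponding beads.
    squared = [
        sum((a - b) ** 2 for a, b in zip(bead_f, bead_t))
        for bead_f, bead_t in zip(folded, target)
    ]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    target = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
    folded = [(0.0, 0.1, 0.0), (1.0, -0.1, 0.0), (2.0, 0.1, 0.0)]
    print(rmsd(folded, target))
```

Expressing the result in units of the bead size makes it directly comparable with the "fractions of the bead size" precision quoted for the designed knots.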



CONCLUSIONS

In this work, we studied the topological properties of two different models of hetero patchy polymers, namely the center bonded polymer and patch bonded polymer. Both models offer several viable experimental realizations and might prove highly

ASSOCIATED CONTENT

Supporting Information

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.macromol.8b01359.

Knot spectra and interaction matrices used (PDF)

AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected] (I.C.).

ORCID

Christoph Dellago: 0000-0001-9166-6235
Ivan Coluzza: 0000-0001-7728-6033

Notes

The authors declare no competing financial interest.


ACKNOWLEDGMENTS

All simulations presented in this paper were performed on the Vienna Scientific Cluster (VSC). We acknowledge support from the Austrian Science Fund (FWF) project 26253-N27. L.T. also acknowledges support from the Mahlke-Oberman Stiftung and the European Union’s Seventh Framework Programme for research, technological development, and demonstration (Grant 609431). V.B. also acknowledges support from the FWF Grant M 2150-N36.

ADDITIONAL NOTES

a In our previous publication,51 CBP and PBP were called freely rotating chains (FRC) and freely jointed chains (FJC), respectively.

b Also for q = 3, the sequence energies for the designed knots (optimized, for each knot topology, for the structure with the highest probability of having that knot topology) are similar for different knot complexities. This does not imply that with q = 3 it will be possible to optimize the large number of existing unknotted or simply knotted conformations, as many of them will not be optimized because of the presence of the homopolymeric stretches.

REFERENCES

(1) Coluzza, I.; Jackson, S. E.; Micheletti, C.; Miller, M. A. Knots in soft condensed matter. J. Phys.: Condens. Matter 2015, 27, 350301.
(2) Frank-Kamenetskii, M. D.; Lukashin, A. V.; Vologodskii, A. V. Statistical mechanics and topology of polymer chains. Nature 1975, 258, 398.
(3) Sumners, D.; Whittington, S. G. Knots in self-avoiding walks. J. Phys. A: Math. Gen. 1988, 21, 1689.
(4) Pippenger, N. Knots in random walks. Discrete Applied Mathematics 1989, 25, 273−278.
(5) Micheletti, C.; Marenduzzo, D.; Orlandini, E. Polymers with spatial or topological constraints: Theoretical and computational results. Phys. Rep. 2011, 504, 1−73.
(6) Arsuaga, J.; Vázquez, M.; Trigueros, S.; Sumners, D. W.; Roca, J. Knotting probability of DNA molecules confined in restricted volumes: DNA knotting in phage capsids. Proc. Natl. Acad. Sci. U. S. A. 2002, 99, 5373−5377.
(7) Marenduzzo, D.; Orlandini, E.; Stasiak, A.; Sumners, D.; Tubiana, L.; Micheletti, C. DNA-DNA interactions in bacteriophage capsids are responsible for the observed DNA knotting. Proc. Natl. Acad. Sci. U. S. A. 2009, 106, 22269−22274.
(8) Micheletti, C.; Orlandini, E. Numerical study of linear and circular model DNA chains confined in a slit: metric and topological properties. Macromolecules 2012, 45, 2113−2121.
(9) Dai, L.; van der Maarel, J. R.; Doyle, P. S. Effect of nanoslit confinement on the knotting probability of circular. ACS Macro Lett. 2012, 1, 732−736.
(10) Micheletti, C.; Orlandini, E. Knotting and unknotting dynamics of DNA strands in nanochannels. ACS Macro Lett. 2014, 3, 876−880.
(11) Tang, J.; Du, N.; Doyle, P. S. Compression and self-entanglement of single DNA molecules under uniform electric field. Proc. Natl. Acad. Sci. U. S. A. 2011, 108, 16153−16158.
(12) Klotz, A. R.; Narsimhan, V.; Soh, B. W.; Doyle, P. S. Dynamics of DNA Knots during Chain Relaxation. Macromolecules 2017, 50, 4074−4082.
(13) Wigley, D. B. Structure and mechanism of DNA topoisomerases. Annu. Rev. Biophys. Biomol. Struct. 1995, 24, 185−208.
(14) Dean, F. B.; Stasiak, A.; Koller, T.; Cozzarelli, N. R. Duplex DNA knots produced by Escherichia coli topoisomerase I. Structure and requirements for formation. J. Biol. Chem. 1985, 260, 4975−4983.
(15) Virnau, P.; Mirny, L. A.; Kardar, M. Intricate knots in proteins: Function and evolution. PLoS Comput. Biol. 2006, 2, e122.
(16) Wagner, J. R.; Brunzelle, J. S.; Forest, K. T.; Vierstra, R. D. A light-sensing knot revealed by the structure of the chromophore-binding domain of phytochrome. Nature 2005, 438, 325.
(17) Boutz, D. R.; Cascio, D.; Whitelegge, J.; Perry, L. J.; Yeates, T. O. Discovery of a thermophilic protein complex stabilized by topologically interlinked chains. J. Mol. Biol. 2007, 368, 1332−1344.
(18) Taylor, W. R. A deeply knotted protein structure and how it might fold. Nature 2000, 406, 916−919.
(19) Bölinger, D.; Sułkowska, J. I.; Hsu, H.-P.; Mirny, L. A.; Kardar, M.; Onuchic, J. N.; Virnau, P. A Stevedore’s protein knot. PLoS Comput. Biol. 2010, 6, e1000731.
(20) Miller, M. A.; Wales, D. J. Novel structural motifs in clusters of dipolar spheres: Knots, links, and coils. J. Phys. Chem. B 2005, 109, 23109−23112.
(21) Tkalec, U.; Ravnik, M.; Copar, S.; Zumer, S.; Musevic, I. Reconfigurable Knots and Links in Chiral Nematic Colloids. Science 2011, 333, 62−65.
(22) Seč, D.; Čopar, S.; Žumer, S. Topological zoo of free-standing knots in confined chiral nematic fluids. Nat. Commun. 2014, 5, 3057.
(23) Machon, T.; Alexander, G. P. Knotted Defects in Nematic Liquid Crystals. Phys. Rev. Lett. 2014, 113, 027801.
(24) Plunkett, P.; Piatek, M.; Dobay, A.; Kern, J. C.; Millett, K. C.; Stasiak, A.; Rawdon, E. J. Total curvature and total torsion of knotted polymers. Macromolecules 2007, 40, 3860−3867.
(25) Rawdon, E. J.; Kern, J. C.; Piatek, M.; Plunkett, P.; Stasiak, A.; Millett, K. C. Effect of knotting on the shape of polymers. Macromolecules 2008, 41, 8281−8287.
(26) Narros, A.; Moreno, A. J.; Likos, C. N. Effects of knots on ring polymers in solvents of varying quality. Macromolecules 2013, 46, 3654−3668.
(27) Poier, P.; Likos, C. N.; Matthews, R. Influence of rigidity and knot complexity on the knotting of confined polymers. Macromolecules 2014, 47, 3394−3400.
(28) Dai, L.; Renner, C. B.; Doyle, P. S. Metastable tight knots in semiflexible chains. Macromolecules 2014, 47, 6135−6140.
(29) Sułkowska, J. I.; Sułkowski, P.; Szymczak, P.; Cieplak, M. Stabilizing effect of knots on proteins. Proc. Natl. Acad. Sci. U. S. A. 2008, 105, 19714−19719.
(30) Clark, R. J.; Jensen, J.; Nevin, S. T.; Callaghan, B. P.; Adams, D. J.; Craik, D. J. The engineering of an orally active conotoxin for the treatment of neuropathic pain. Angew. Chem., Int. Ed. 2010, 49, 6545−6548.
(31) Sayre, T. C.; Lee, T. M.; King, N. P.; Yeates, T. O. Protein stabilization in a highly knotted protein polymer. Protein Eng., Des. Sel. 2011, 24, 627−630.
(32) Soler, M. A.; Nunes, A.; Faisca, P. F. N. Effects of knot type in the folding of topologically complex lattice proteins. J. Chem. Phys. 2014, 141, 07B607_1.
(33) Coluzza, I.; van Oostrum, P. D. J.; Capone, B.; Reimhult, E.; Dellago, C. Sequence controlled self-knotting colloidal patchy polymers. Phys. Rev. Lett. 2013, 110, 075501.
(34) Nakata, M.; Nakamura, Y.; Maki, Y.; Sasaki, N. Slow expansion of a single polymer chain from the Knotted globule. Macromolecules 2004, 37, 4917−4921.
(35) Gao, Y.; Zhou, D.; Zhao, T.; Wei, X.; McMahon, S.; O’Keeffe Ahern, J.; Wang, W.; Greiser, U.; Rodriguez, B. J.; Wang, W. Intramolecular Cyclization Dominating Homopolymerization of Multivinyl Monomers toward Single-Chain Cyclized/Knotted Polymeric Nanoparticles. Macromolecules 2015, 48, 6882−6889.
(36) Cao, P. F.; Rong, L. H.; Mangadlao, J. D.; Advincula, R. C. Synthesizing a Trefoil Knotted Block Copolymer via Ring-Expansion Strategy. Macromolecules 2017, 50, 1473−1481.
(37) Marcos, V.; Stephens, A. J.; Jaramillo-Garcia, J.; Nussbaumer, A. L.; Woltering, S. L.; Valero, A.; Lemonnier, J.-F.; Vitorica-Yrezabal, I. J.; Leigh, D. A. Allosteric initiation and regulation of catalysis with a molecular knot. Science 2016, 352, 1555−1559.
(38) Newland, B.; Zheng, Y.; Jin, Y.; Abu-Rub, M.; Cao, H.; Wang, W.; Pandit, A. Single cyclized molecule versus single branched molecule: a simple and efficient 3D “knot” polymer structure for nonviral gene delivery. J. Am. Chem. Soc. 2012, 134, 4782−4789.
(39) Erbas-Cakmak, S.; Fielden, S. D. P.; Karaca, U.; Leigh, D. A.; McTernan, C. T.; Tetlow, D. J.; Wilson, M. R. Rotary and linear molecular motors driven by pulses of a chemical fuel. Science 2017, 358, 340−343.
(40) Fielden, S. D. P.; Leigh, D. A.; Woltering, S. L. Molecular Knots. Angew. Chem., Int. Ed. 2017, 56, 11166−11194.
(41) Dietrich-Buchecker, C. O.; Sauvage, J. P.; Kern, J. M. Templated synthesis of interlocked macrocyclic ligands: The catenands. J. Am. Chem. Soc. 1984, 106, 3043−3045.
(42) Ponnuswamy, N.; Cougnon, F. B. L.; Pantos, G. D.; Sanders, J. K. M. Homochiral and meso figure eight knots and a Solomon link. J. Am. Chem. Soc. 2014, 136, 8243−8251.
(43) Ayme, J.-F.; Beves, J. E.; Leigh, D. A.; McBurney, R. T.; Rissanen, K.; Schultz, D. A synthetic molecular pentafoil knot. Nat. Chem. 2012, 4, 15−20.
(44) Leigh, D. A.; Pritchard, R. G.; Stephens, A. J. A Star of David catenane. Nat. Chem. 2014, 6, 978.
(45) Danon, J. J.; Krüger, A.; Leigh, D. A.; Lemonnier, J.-F.; Stephens, A. J.; Vitorica-Yrezabal, I. J.; Woltering, S. L. Braiding a molecular knot with eight crossings. Science 2017, 355, 159−162.
(46) Orlandini, E.; Polles, G.; Marenduzzo, D.; Micheletti, C. Self-assembly of knots and links. J. Stat. Mech.: Theory Exp. 2017, 2017, 034003.
(47) Polles, G.; Marenduzzo, D.; Orlandini, E.; Micheletti, C. Self-assembling knots of controlled topology by designing the geometry of patchy templates. Nat. Commun. 2015, 6, 6423.
(48) Polles, G.; Orlandini, E.; Micheletti, C. Optimal Self-Assembly of Linked Constructs and Catenanes via Spatial Confinement. ACS Macro Lett. 2016, 5, 931−935.
(49) Marenda, M.; Orlandini, E.; Micheletti, C. Discovering privileged topologies of molecular knots with self-assembling models. Nat. Commun. 2018, 9, 3051.
(50) Coluzza, I.; van Oostrum, P. D. J.; Capone, B.; Reimhult, E.; Dellago, C. Design and folding of colloidal patchy polymers. Soft Matter 2013, 9, 938.
(51) Cardelli, C.; Bianco, V.; Rovigatti, L.; Nerattini, F.; Tubiana, L.; Dellago, C.; Coluzza, I. The role of directional interactions in the designability of generalized heteropolymers. Sci. Rep. 2017, 7, 4986.
(52) Van Der Meulen, S. A.; Leunissen, M. E. Solid colloids with surface-mobile DNA linkers. J. Am. Chem. Soc. 2013, 135, 15129.
(53) Sacanna, S.; Irvine, W. T. M.; Chaikin, P. M.; Pine, D. J. Lock and key colloids. Nature 2010, 464, 575−578.
(54) Feng, L.; Pontani, L.-L.; Dreyfus, R.; Chaikin, P.; Brujic, J. Specificity, flexibility and valence of DNA bonds guide emulsion architecture. Soft Matter 2013, 9, 9816−9823.
(55) Miyazawa, S.; Jernigan, R. L. Estimation of effective interresidue contact energies from protein crystal-structures - quasi-chemical approximation. Macromolecules 1985, 18, 534−552.
(56) Betancourt, M. R.; Onuchic, J. N. Kinetics of proteinlike models: The energy landscape factors that determine folding. J. Chem. Phys. 1995, 103, 773−787.
(57) Dill, K. A.; Bromberg, S.; Yue, K. Z.; Fiebig, K. M.; Yee, D. P.; Thomas, P. D.; Chan, H. S. Principles of Protein-Folding - a Perspective From Simple Exact Models. Protein Sci. 1995, 4, 561−602.
(58) Vendruscolo, M. Modified configurational bias Monte Carlo method for simulation of polymer systems. J. Chem. Phys. 1997, 106, 2970−2976.
(59) Seno, F.; Trovato, A.; Banavar, J. R.; Maritan, A. Maximum entropy approach for deducing amino acid interactions in proteins. Phys. Rev. Lett. 2008, 100, 1−4.
(60) Coluzza, I. Transferable Coarse-Grained Potential for De Novo Protein Folding and Design. PLoS One 2014, 9, e112852.
(61) Coluzza, I.; Dellago, C. The configurational space of colloidal patchy polymers with heterogeneous sequences. J. Phys.: Condens. Matter 2012, 24, 284111.
(62) Miyazawa, S.; Jernigan, R. L. Residue-residue potentials with a favorable contact pair term and an unfavorable high packing density term, for simulation and threading. J. Mol. Biol. 1996, 256, 623−644.
(63) Pande, V. S.; Grosberg, A. Y.; Tanaka, T. Statistical mechanics of simple models of protein folding and design. Biophys. J. 1997, 73, 3192−3210.
(64) Coluzza, I.; Frenkel, D. Virtual-move parallel tempering. ChemPhysChem 2005, 6, 1779−1783.
(65) Diao, Y.; Dobay, A.; Kusner, R. B.; Millett, K.; Stasiak, A. The average crossing number of equilateral random polygons. J. Phys. A: Math. Gen. 2003, 36, 11561−11574.
(66) Tubiana, L.; Orlandini, E.; Micheletti, C. Probing the Entanglement and Locating Knots in Ring Polymers: A Comparative Study of Different Arc Closure Schemes. Prog. Theor. Phys. Suppl. 2011, 191, 192−204.
(67) Hoste, J.; Thistlethwaite, M. Knotscape: http://www.math.utk.edu/~morwen/knotscape.html.
(68) Alexander, J. W.; Briggs, G. B. On Types of Knotted Curves. Annals of Mathematics 1926, 28, 562.
(69) Marenz, M.; Janke, W. Knots as a topological order parameter for semiflexible polymers. Phys. Rev. Lett. 2016, 116, 128301.
(70) Lua, R. C.; Grosberg, A. Y. Statistics of knots, geometry of conformations, and evolution of proteins. PLoS Comput. Biol. 2006, 2, e45.
(71) Baiesi, M.; Orlandini, E.; Stella, A. L. The entropic cost to tie a knot. J. Stat. Mech.: Theory Exp. 2010, 2010, P06012.
(72) Baiesi, M.; Orlandini, E.; Seno, F.; Trovato, A. Energetic frustration: an evolutionary strategy to avoid kinetic traps in entangled proteins 2018, 28−30.
(73) Cossio, P.; Trovato, A.; Pietrucci, F.; Seno, F.; Maritan, A.; Laio, A. Exploring the universe of protein structures beyond the protein data bank. PLoS Comput. Biol. 2010, 6, e1000957.
(74) Wüst, T.; Reith, D.; Virnau, P. Sequence Determines Degree of Knottedness in a Coarse-Grained Protein Model. Phys. Rev. Lett. 2015, 114, 028102.
(75) Micheletti, C.; Di Stefano, M.; Orland, H. Absence of knots in known RNA structures. Proc. Natl. Acad. Sci. U. S. A. 2015, 112, 2052−2057.
(76) Tinoco, I., Jr.; Bustamante, C. How RNA folds. J. Mol. Biol. 1999, 293, 271−281.
(77) Greenleaf, W. J.; Frieda, K. L.; Foster, D. A. N.; Woodside, M. T.; Block, S. M. Direct observation of hierarchical folding in single riboswitch aptamers. Science 2008, 319, 630−633.
(78) Repsilber, D.; Wiese, S.; Rachen, M.; Schröder, A. W.; Riesner, D.; Steger, G. Formation of metastable RNA structures by sequential folding during transcription: time-resolved structural analysis of potato spindle tuber viroid (−)-stranded RNA by temperature-gradient gel electrophoresis. RNA 1999, 5, 574−584.
(79) Liu, L.; Hyeon, C. Contact Statistics Highlight Distinct Organizing Principles of Proteins and RNA. Biophys. J. 2016, 110, 2320−2327.
(80) Coronel, L.; Orlandini, E.; Micheletti, C. Non-monotonic knotting probability and knot length of semiflexible rings: the competing roles of entropy and bending energy. Soft Matter 2017, 13, 4260−4267.
(81) Poier, P.; Likos, C. N.; Matthews, R. Influence of rigidity and knot complexity on the knotting of confined polymers. Macromolecules 2014, 47, 3394−3400.
(82) Tubiana, L.; Orlandini, E.; Micheletti, C. luca-tubiana/KymoKnot: Kymoknot initial release. 2018; https://doi.org/10.5281/zenodo.1239859.
(83) Helling, R.; Li, H.; Mélin, R.; Miller, J.; Wingreen, N.; Zeng, C.; Tang, C. The designability of protein structures. J. Mol. Graphics Modell. 2001, 19, 157−167.
(84) Fink, T. M.; Ball, R. C. How many conformations can a protein remember? Phys. Rev. Lett. 2001, 87, 198103.
(85) Arnold, A.; Lenz, O.; Kesselheim, S.; Weeber, R.; Fahrenberger, F.; Roehm, D.; Košovan, P.; Holm, C. Meshfree Methods for Partial Differential Equations VI; Springer: 2013; pp 1−23.
(86) Mallam, A. L.; Jackson, S. E. Folding studies on a knotted protein. J. Mol. Biol. 2005, 346, 1409−1421.
(87) Mallam, A. L.; Jackson, S. E. Knot formation in newly translated proteins is spontaneous and accelerated by chaperonins. Nat. Chem. Biol. 2012, 8, 147.
(88) Lim, N. C. H.; Jackson, S. E. Mechanistic insights into the folding of knotted proteins in vitro and in vivo. J. Mol. Biol. 2015, 427, 248−258.
(89) Najafi, S.; Potestio, R. Folding of small knotted proteins: Insights from a mean field coarse-grained model. J. Chem. Phys. 2015, 143, 243121.
