Effects of Concentration and Temperature on DNA Hybridization by

Jul 22, 2016 - The framework developed in this work can be applied to the future design of DNA nanostructures in both fields of structural DNA nanotec...
0 downloads 9 Views 2MB Size
This is an open access article published under an ACS AuthorChoice License, which permits copying and redistribution of the article or any adaptations for non-commercial purposes.

Article pubs.acs.org/JPCB

Effects of Concentration and Temperature on DNA Hybridization by Two Closely Related Sequences via Large-Scale Coarse-Grained Simulations Cade B. Markegard, Cameron P. Gallivan, Darrell D. Cheng, and Hung D. Nguyen* Department of Chemical Engineering and Materials Science, University of California, Irvine, Irvine, California 92697, United States S Supporting Information *

ABSTRACT: A newly developed coarse-grained model called BioModi is utilized to elucidate the effects of temperature and concentration on DNA hybridization in self-assembly. Largescale simulations demonstrate that complementary strands of either the tetrablock sequence or randomized sequence with equivalent number of cytosine or guanine nucleotides can form completely hybridized double helices. Even though the end states are the same for the two sequences, there exist multiple kinetic pathways that are populated with a wider range of transient aggregates of different sizes in the system of random sequences compared to that of the tetrablock sequence. The ability of these aggregates to undergo the strand displacement mechanism to form only double helices depends upon the temperature and DNA concentration. On one hand, low temperatures and high concentrations drive the formation and enhance stability of large aggregating species. On the other hand, high temperatures destabilize base-pair interactions and large aggregates. There exists an optimal range of moderate temperatures and low concentrations that allow minimization of large aggregate formation and maximization of fully hybridized dimers. Such investigation on structural dynamics of aggregating species by two closely related sequences during the self-assembly process demonstrates the importance of sequence design in avoiding the formation of metastable species. Finally, from kinetic modeling of self-assembly dynamics, the activation energy for the formation of double helices was found to be in agreement with experimental results. The framework developed in this work can be applied to the future design of DNA nanostructures in both fields of structural DNA nanotechnology and dynamic DNA nanotechnology wherein equilibrium end states and nonequilibrium dynamics are equally important requiring investigation in cooperation.

1. INTRODUCTION The field of DNA nanotechnology, which has made significant advances recently, can be traced back to the pioneering work of Seeman and co-workers in the early 1980s. Inspired by the ability of DNA to renature and undergo branch migration as seen in genetics,1,2 Seeman developed an algorithmic method of having DNA strands hybridize into a two-dimensional tile structure similar to the Holliday junction.3 This innovation paved the way for other investigators, with the help of designing tools such as CadNano4 and NUPACK,5 to utilize DNA for developing complex nanostructures such as self-assembled squares, smiley faces, nanotubes, and tetrahedrons.6−9 The common self-assembly mechanism to form these nanostructures is the hybridization process between complementary strands. DNA hybridization and renaturation have been extensively studied since the 1960s by Marmur and Doty,10 providing a fundamental understanding on the effects of sequence and environmental condition on the formation of double helices.11,12 However, more insights from molecular simulations are needed at the molecular level for an understanding of self-assembly kinetics and dynamic nature of DNA nanostructures. This can help improve assembly yields and reduce structural defects not only in the field of structural © XXXX American Chemical Society

DNA nanotechnology that concerns with the equilibrium end states but also in dynamic DNA nanotechnology that concerns with nonequilibrium dynamics.13 The dynamics of nucleic acids have been evaluated by both all-atom14−18 and coarse-grained force fields.19−35 Although allatom force fields give a high amount of detail, it is currently not computationally feasible to simulate the hybridization process of single stranded DNA (ssDNA) into double-stranded DNA (dsDNA). Therefore, various researchers have developed coarse-grained force fields to capture the physics at cruder resolutions. Coarse-grained models vary from many nucleotides being represented by one bead, which was used to investigate how DNA hybridizes when covalently attached to large inorganic particles,36 to a single nucleotide being represented by three beads, such as in the works of Dokholyan37 and de Pablo.19,33,38 In this work, we utilize BioModi, our newly developed coarse-grained simulation package for studying self-assembly processes of nucleic acids (Figure 1),39 peptides, and Received: April 18, 2016 Revised: July 15, 2016

A

DOI: 10.1021/acs.jpcb.6b03937 J. Phys. Chem. B XXXX, XXX, XXX−XXX

Article

The Journal of Physical Chemistry B

interactions and forming new ones to further hybridize the strands together into a double helix. During slithering, another strand can get attached to the partial dimer forming a trimer; in this case, the formation of a double helix is a product of a transition between metastable trimer into a more energetically favorable dimer through the strand displacement mechanism. Although we have explained the differences between these hybridization mechanisms in our earlier work,39 we only focused on a single temperature and DNA concentration for four different sequences. In this work, we elucidate the full effect of those factors on the formation of double helices by two closely related sequences with the same number of C or G nucleotides in large-scale simulations. We also perform kinetic modeling of the whole self-assembly process to extract the kinetic rate constants associated with the three abovementioned hybridization mechanisms. Previous works by the Doye45 and de Pablo33,46 groups have both used forward-flux sampling to obtain the kinetic rate constant of the hybridization process of two complementary strands. Although the forwardflux sampling technique is an elegant method for obtaining rate constants, the drawback of these works is not obtaining the rate constants with more than two chains in the simulation. For our work, we directly simulate the assembly of 100 strands (50 sense and 50 antisense) at different conditions; thus we are able to explore kinetic pathways not accessible in the aforementioned works.

Figure 1. BioModi representation of four nucleotides as large beads superimposing on an atomistic structure: adenine in pink, thymine in cyan, cytosine in red, and guanine in yellow. The sugar bead (in purple) and phosphate bead (in brown) are positioned at the center of mass of the corresponding the five-atom ring sugar and phosphate group, respectively. The bead representing each base is at the N1 position for purine bases (adenine and guanine) and the N3 position for pyrimidine (thymine and cytosine).

polymers.40−43 Our implementation of the nucleic acid model was built upon the 3-site-per-nucleotide models developed by the Dokholyan37 and de Pablo33,38 groups including the following interactions: base stacking, electrostatic, solventinduced, and directional base pairing between specific pairs (C and G bases or A and T bases). This model was coupled with an event-driven molecular dynamics algorithm44 called discontinuous molecular dynamics (DMD), wherein atoms are interacted with one another via discontinuous potentials instead of continuous potentials as in standard molecular dynamics. BioModi contains enough genuine biological character to mimic real nucleic acids by capturing DNA structural fidelity (minor and major grooves), mechanical properties and thermal behavior.39 Specifically, BioModi can predict quantitatively the persistence length of ssDNA and dsDNA as a function of salt concentration and qualitatively the melting behavior of dsDNA as a function of sequence.39 Yet BioModi is simple enough to be computationally tractable to simulate very large systems. This enables us to probe the fundamental physics underlying DNA hybridization in largescale systems by exploring different kinetic mechanisms under specific solution conditions. We have previously shown that our DNA model can elucidate three kinetic mechanisms related to hybridization during large-scale self-assembly simulations: slithering, zippering, and strand displacement. When two strands initially form native contacts, hybridization is via the zippering mechanism, in which these strands quickly wrap around one another to form base-pair (bp) interactions reaching the fully hybridized state. When two strands initially form non-native contacts, hybridization is via the slithering mechanism, in which both strands keep inching up in opposite directions by breaking existing bp

2. METHODS 2.1. Simulation Methods. Simulations were performed by using the discontinuous molecular dynamics (DMD) simulation algorithm,47 a variant on standard molecular dynamics that is applicable to systems of molecules interacting via discontinuous potentials (e.g., hard-sphere, and square-well potentials). Unlike soft potentials such as the Lennard-Jones potential, discontinuous potentials exert forces only when particles collide, enabling the exact (as opposed to numerical) solution of the collision dynamics. DMD simulations proceed by locating the next collision, advancing the system to that collision, and then calculating the collision dynamics. Simulations are performed in the canonical ensemble with periodic boundary conditions imposed to eliminate artifacts due to box walls. Constant temperature is achieved by implementing the Andersen thermostat method.48 In this case, all beads are subjected to random, infrequent collisions with ghost particles whose velocities are chosen randomly from a Maxwell−Boltzmann distribution centered at the system temperature. The simulation temperature is expressed in terms of the reduced temperature, T* = kbT/εHB, where kb is Boltzmann’s constant, T is the absolute temperature, and εHB is the strength of one hydrogen bond (initially set at 1 kJ/mol as a

Table 1. Two Systems, C8T8A8G8 and RAND32, of Different Sequences Simulated in This Studya system C8T8A8G8: 50 sense strands +50 antisense strands RAND32: 50 sense strands +50 antisense strands noncomplementary: 50 sense 1 strands +50 sense 2 strands

strand

sequence

sense (5′→3′) antisense (3′→5′) sense (5′→3′) antisense (3′→5′) sense 1 (5′→3′) sense 2 (3′→5′)

1-CCCC CCCC TTTT TTTT AAAA AAAA GGGG GGGG-32 32-GGGG GGGG AAAA AAAA TTTT TTTT CCCC CCCC-1 1-CCAA TGCG GTAA GCCT GACA CCGA TCAA TCTT-32 32-GGTT ACGC CATT CGGA CTGT GGCT AGTT AGAA-1 1-CCAA TGCG GTAA GCCT GACA CCGA TCAA TCTT-32 32-ATAG AAGT GGGA ATGG ATAC TACA CGTG GGCC-1

Both sense strand and antisense strand are numbered in the 5′ → 3′ direction from 1 to 32 to show that they are complementary in the opposite direction. In addition, another system called “noncomplementary” is also simulated as a control. a

B

DOI: 10.1021/acs.jpcb.6b03937 J. Phys. Chem. B XXXX, XXX, XXX−XXX

Article

The Journal of Physical Chemistry B

TRAN. The code determines which strands are aggregated together using our criterion of having at least one specific basepair (bp) interaction (between C and G bases or A and T bases) without nonspecific interactions between every two neighboring strands. The cutoff distance and angle for considering a bp formed/disrupted are 8 Å and 180 ± 30°, respectively. This ensures that the distance between the hydrogen bond forming bases and their orientation agree with known parameters from crystal structures and atomistic simulations. Once aggregates are identified, additional information about their properties such as the time they were created (i.e., aggregate creation time), amount of time they exist (i.e., aggregate lifetime), and their degree of hybridization (i.e., number of bp interactions per chain) can be extracted using an in-house Python module. The results of this analysis can be seen in the Results and Discussion.

reference so that other potential strengths can be accordingly scaled). Simulation time is defined in reduced units to be t* = t/σ(kbT/m)1/2, where t is the simulation time, σ is the average bead diameter, and m is the average mass of all beads (i.e., one unit of mass in reduced value) 2.2. Self-Assembly of Many Strands. Self-assembly simulations were performed for two systems, each has a different sequence with 100 strands of ssDNA. The C8T8A8G8 system contains 50 sense strands and 50 antisense strands of the tetrablock sequences; the RAND32 system contains 50 sense strands and 50 antisense strands of the random sequences. In addition, we also performed simulations on another system called noncomplementary as a control; this system contains 50 strands of the same sense sequence from the RAND32 system and 50 strands of another randomized sequence. The sequences of each system contain 50% C or G content, as listed in Table 1. These strands were initially placed randomly in the simulation box as a starting configuration. They were heated at high temperature (T* = 5.0) for a period of time until complete randomization to ensure there is no bias in the beginning of each simulation. Velocities of the particles were subsequently reinitialized to a desired temperature for a production run, which was conducted using the NVT ensemble. Our simulations are aimed to elucidate the effects of sequence, temperature, and DNA concentration on the selfassembly of ssDNA into dsDNA. We evaluated temperatures from T* = 0.7 to T* = 1.3, and DNA concentrations from 0.09 to 1.45 mM. These concentrations are higher than experimental values (typically on the order of nanomolar to micromolar) as exploring experimental concentrations is currently computationally unfeasible. The concentration is calculated as c = N/ (NAV), where N is the number of strands of N = 100, NA is Avogadro’s number, and V is the volume of the simulation box that was varied for each concentration. All simulations were conducted at an extremely high salt concentration, which dictates the persistence length of dsDNA to be 50 nm and ssDNA to be 1.5 nm. A summary of the simulations performed in this study is available in Table 2. For the majority of this study, we

3. RESULTS AND DISCUSSION 3.1. Effect of Sequence on DNA Self-Assembly. The results obtained at the end of the self-assembly process in a crowded environment are similar for the two systems whose sequences are listed in Table 1. Both systems containing 50 sense and 50 antisense strands with the same CG content at 50% were simulated at a moderate temperature of T* = 1.0 and high DNA strand concentration of [DNA] = 1.45 mM. Both systems convert 100% of their single strands into dimers at the end of simulations after 600 time units (in dimensionless value, t*), as shown in Figure 2A,B from 20 simulations for each system. Here a dimer or an aggregate of any size is defined as having two or more strands forming at least one base-pair interaction per strand with a neighboring strand. The propensity of forming a base-pair interaction or contact, whether native or non-native, is remarkably different for the two systems, as shown in Figure 3A. These contact maps in Figure 3A demonstrate that any two complementary strands of the RAND32 system have a relatively high propensity to form non-native contacts at random positions along the length of each strand compared to their ability to form native contacts. In contrast, the propensity of any two complementary strands of the C8T8A8G8 system is localized to confined nucleotide positions where native contacts can also be formed. These differences affect the self-assembly behavior of these two systems as described below. Even though both systems form 100% dimers at equilibrium, there is an ensemble of large aggregates including trimers, tetramers, pentamers, and larger structures during the selfassembly process, as shown in Figure 2A,B. Dimers are constantly formed by two single strands via either the zippering mechanism or slithering mechanism depending on their first point of contact when they approach each other as shown in Pathway 1 of Figure 3B (movies showing zippering and slithering mechanisms were provided in our previous publication.39 If their first contact is native as in the fully hybridized double helix, these strands quickly zip up, forming the most bp interactions via the zippering mechanism. Otherwise, they slowly inch up by breaking existing base-pair interactions and forming new interactions via the slithering mechanism, which takes at least 1 order of magnitude longer than the zippering mechanism.39 In this case when two strands are partially hybridized as they undergo the slithering mechanism, another single strand can make a contact with an unhybridized segment at either end of this partially hybridized dimer, resulting in the formation of a trimer, as shown in

Table 2. Summary of 220 Simulations Performed in This Study system

[DNA] (mM)

temperature (T*)

no. of replicas

C8T8A8G8 RAND32 C8T8A8G8 noncomplementary

1.45 1.45 0.36, 0.09 1.45

0.7, 0.9, 1.0, 1.1, 1.3 0.7, 0.9, 1.0, 1.1, 1.3 0.9, 1.1 1.0

20 20 4 4

performed 20 independent simulations for each condition; however, at low concentrations, we performed only four independent simulations due to the large computational time it takes for strands to come together when they are in dilute environments. We have simulated a total of 220 large-scale simulations, each was conducted for at least 600 time units requiring about a month on only a single processor. This was determined to be enough time for our simulations to reach equilibrium, as indicated by the ensemble averages of the system’s total potential energy; which varied by no more than 2.5% toward the end of each simulation run. 2.3. Analysis Method. Simulations were analyzed using an in-house analysis package developed for BioModi in FORC

DOI: 10.1021/acs.jpcb.6b03937 J. Phys. Chem. B XXXX, XXX, XXX−XXX

Article

The Journal of Physical Chemistry B

Figure 2. Effect of sequence on self-assembly at T* = 1.0 and [DNA] = 1.45 mM. Percentage of molecules belonging to various aggregate sizes, s (s = 1 for monomers, s = 2 for dimers, s = 3 for trimers, etc.), as a function of time in natural logarithmic scale for (A) C8T8A8G8 and (B) RAND32. The aggregate creation time and lifetime of dimers, trimers, tetramers, pentamers, and larger aggregates (as each dot in blue, red, green, gray, and purple color, respectively) for (C) C8T8A8G8 and (D) RAND32. The aggregate lifetime and the mean number of bp interactions per chain of dimers, trimers, tetramers, pentamers, and larger aggregates for (E) C8T8A8G8 and (F) RAND32.

simulations. In Figure 2C,D, the aggregate creation time and lifetime are plotted for the 20 simulations conducted at each condition. It is evident from these plots that both systems form a set of dimers that were created at the beginning of the simulation (when t* < 200). Most of these dimers are longlived, as shown by larger clusters of blue dots near the diagonal blue line in Figure 2C,D. This diagonal line was artificially added as a guidance to establish that the aggregates that fall along this line are stable and continuously formed over the course of simulation whereas those aggregates that scatter away from this line are unstable and intermittently formed. Indeed, Figure 2C,D indicates that dimers are also formed during the middle and the end of the simulation as some dimers are products of the dissociation of trimers, tetramers, and larger aggregates via the strand displacement mechanism. Short-lived dimers also exist in both systems; however, the RAND32 system creates many more short-lived dimers (i.e., 14.25 ± 3.98 shortlived dimers per simulation) compared to the C8T8A8G8 system (i.e., 5.35 ± 2.35 short-lived dimers per simulation). Moreover, most trimers, tetramers, pentamers, and larger aggregates are also formed during the beginning of the simulation; however, they are short-lived, which indicates that they are transient species. Those transient species exist in higher population throughout the simulation in the RAND32 system (which forms 18.65 ± 3.6 trimers and 9.90 ± 3.62 tetramers per simulation for example) than those in the C8T8A8G8 system (which forms 9.3 ± 3.1 trimers and 3.55 ± 1.53 tetramers per simulation for example), as observed in Figure 2C,D. Next we can evaluate the degree of hybridization by different aggregating species by examining their number of bp interactions per chain at a given time. By normalizing the bp number of a given aggregate to a single chain, we can directly compare the degrees of hybridization by aggregates of varying sizes. Figure 2E,F shows the mean number of bp interactions and the aggregate lifetimes of dimers, trimers, tetramers, pentamers, and larger aggregates. For both C8T8A8G8 and RAND32, the majority of dimers that were formed at the beginning of the simulation (i.e., they are long-lived) have approximately ∼12−16 bp interactions per chain, with 16 bp per chain being the maximum for our 32-mer sequences (i.e.,

Pathway 2 of Figure 3B (Movie S1). If such a contact is native, the incoming strand can undergo the zippering mechanism, which displaces the other strand by breaking its current bp interactions and forming more interactions with the landed strand until full hybridization is reached, creating a double helix, freeing the other strand. If the first contact with the third strand is non-native, this trimer can proceed to grow into a tetramer and possibly a larger aggregate instead of shrinking down to a dimer. Alternatively, three free strands can instantaneously form a trimer, as shown in Pathway 3 of Figure 3B (Movie S2). In this case, one strand forms contacts at one end of a complementary strand while another strand forms contacts at the other end. Both strands undergo hybridization with the complementary strand; however, one strand hybridizes faster the other strand, resulting in displacement and the formation of a fully hybridized double helix. Interestingly, the populations of large aggregates are notably different for the two systems during self-assembly. Strands of the C8T8A8G8 system initially form a set of dimers from the free monomers in solution (Figure 2A). After a short delay, at ln[t* + 1] = 0.7, trimers start forming and tetramers appear at ln[t* + 1] = 1.8. Both trimer and tetramer species reach a maximum around ln[t*+1] = 2.5, and then dissociate back into dimers and monomers. Free monomers in solution then form additional dimers. RAND32, similar to C8T8A8G8, initially forms a set of dimers (Figure 2B); however, trimers begin forming much quicker at ln[t* + 1] = 0.5. Then tetramers start forming at ln[t* + 1] = 1.0, followed by pentamers at ln[t* + 1] = 1.5, and larger aggregates at ln[t* + 1] = 3. As these large aggregates form in a higher population, single strands or monomers for RAND32 are depleted quicker than in the C8T8A8G8 system. All the large aggregates exist past ln[t* + 1] = 5 for RAND32, whereas the trimers and tetramers of C8T8A8G8 are dissociated by ln[t* + 1] = 4.75. Therefore, both systems are able to form 100% dimers but differ in their kinetic pathways to reach there even though they have the same number of G or C nucleotides. The differences in the kinetics of the two systems can be further quantified by analyzing the time at which an aggregate is formed and the amount of time that an aggregate exists in all D

DOI: 10.1021/acs.jpcb.6b03937 J. Phys. Chem. B XXXX, XXX, XXX−XXX

Article

The Journal of Physical Chemistry B

number of bp at ∼32 (or ∼16 bp per strand) are long-lived and stable. The reason the RAND32 strands form more transient and metastable aggregates than the C8T8A8G8 strands during selfassembly is due to the large number of possible non-native contacts compared to C8T8A8G8, as shown in Figure 3A. The RAND32 strands are more likely to undergo the strand displacement mechanism via Pathway 2 (Figure 3B) in which the initial contacts made by the first two strands are non-native. As these strands undergo hybridization via the slithering mechanism, their sticky or unhybridized ends form more bp interactions with another strand, creating a trimer and growing into a larger aggregate. In contrast, the C8T8A8G8 strands have a narrower path for possible contacts, therefore reducing the possible kinetic pathways in forming large aggregates. In this case, the C8T8A8G8 strands are more likely to undergo the strand displacement mechanism via Pathway 3 (Figure 3B) in which the formation of a trimer or larger aggregate is made by forming native contacts. However, simultaneous zippering processes take place until the strand that made the most bp interactions wins out and displaces the other strand to form a fully hybridized double helix. Because the slithering mechanism takes at least 1 order of magnitude longer than the zippering mechanism,39 Pathway 2 by the RAND32 strands can form more larger aggregates than can Pathway 3 by the C8T8A8G8 strands. The process of transitioning from a large aggregate to a smaller aggregate via the strand displacement mechanism (by forming a transient and metastable dimer, then a trimer or bigger aggregate before transforming into an energetically favorable dimer or dsDNA) as observed in our large-scale simulations is spontaneous without being aided by any external factors such as increasing the temperature or other stimuli. This indicates that DNA self-assembly from ssDNA into dsDNA at an isothermal condition is an enthalpically driven process undergoing a large entropic cost because the final products are monodisperse dimers rather than a polydisperse ensemble of aggregates of different sizes. Moreover, it demonstrates that DNA self-assembly is a robust process with an extraordinary ability to overcome kinetic barriers and local minima in contrast to other biological molecules such as peptides or proteins. In fact, short peptides such as polyalanines are known to fold into individual helices at dilute concentrations yet undergo selfassembly to form amyloid fibrils when the concentration is slightly increased.49 Furthermore, proteins in the absence of chaperones tend to get trapped in local minima longer and become susceptible of forming oligomer and growing into stable large aggregates especially when they misfold at high concentrations. The formation of large and ordered aggregates such as amyloid fibrils has been implicated in numerous neurodegenerative diseases.50−52 The strand displacement mechanism by DNA13 is currently being used in the field of dynamic DNA nanotechnology in which dynamic nanostructures can be potentially used as nanomachines,53−55 logic gates,56−58 and targeted delivery vehicles.59 This mechanism has been under investigation by experimentalists since the 1970s when the DNA of Escherichia coli 15 was explored under electron microscopy.1 Interestingly, the strand displacement mechanism of DNA has recently been used for computation.60 The strand displacement processes have been studied by Doye, Ouldridge, and co-workers using other coarse-grained packages such as oxDNA; however, those simulations started with just three strands, two of which had

Figure 3. (A) Contact maps or probability of forming a contact by two complementary strands of either the C8T8A8G8 or RAND32 system: non-native contacts in blue, native contacts in purple, and noncomplementary contacts in white. The native contacts should follow a narrow straight line; however, a small tolerance of ±3 is allowed. Both sense strand and antisense strand are numbered in the 5′→ 3′ direction from 1 to 32 as listed in Table 1. (B) Snapshots showing three self-assembly pathways to form a dsDNA starting from two free complementary strands (S and S′): Pathway 1 (P1) involves either zippering or slithering mechanism. Pathway 2 (P2) involves the strand displacement mechanism in which a partially hybridized dimer is formed first by making non-native contacts; then a free strand (gray, S) forms native contacts with the unhybridized segment of the red strand to undergo hybridization via the zippering mechanism and displacing the green strand so that the red and gray strands can form a complete dsDNA. Pathway 3 (P3) involves another strand displacement mechanism in which the gray and green strands instantly form a trimer with the red strand by making native contacts and undergoing the zippering mechanism, but the gray strand hybridizes faster than the green strand, resulting in displacement so that the red and gray strands can form a complete dsDNA.

32 bp for a duplex). However, all trimers, tetramers, pentamers, and larger aggregates have up to only 12 bp interactions per chain and are short-lived, as they are transient species that are involved in the strand displacement mechanism. Therefore, they are considered metastable species compared to the globally stable dimers when they are fully hybridized. Moreover, toward the end of the simulations is formed an ensemble of dimers that contain a low number of bp interactions by both systems; however, RAND32 forms significantly more partially hybridized dimers than C8T8A8G8. It is worthy to mention that even though our criteria of identifying an aggregate seem quite nonstrict for requiring only one bp interaction between every two neighboring strands, the data shown in Figure 2E,F can capture the effect of forming a certain number of bp by a given aggregate on its lifetime. For example, a dimer of any lifetime (having more than 0 time units) tends to form at least ∼4 bp (or ∼2 bp per strand). On one hand, those dimers with just ∼4 bp are unstable as their lifetimes are quite small, indicating that they are short-lived. On the other hand, the dimers having close to the maximum E

DOI: 10.1021/acs.jpcb.6b03937 J. Phys. Chem. B XXXX, XXX, XXX−XXX

Article

The Journal of Physical Chemistry B

Figure 4. Effect of temperature on C8T8A8G8 self-assembly at [DNA] = 1.45 mM. Percentage of molecules belonging to various aggregate sizes, s (s = 1 for monomers, s = 2 for dimers, s = 3 for trimers, etc.), as a function of time in natural logarithmic scale for (A) T* = 0.7 and (B) T* = 1.3 reduced temperatures. The aggregate creation time and lifetime for temperatures (C) T* = 0.7 and (D) T* = 1.3. The aggregate lifetime and the mean number of bp interactions per chain during the lifetime for temperatures (E) T* = 0.7 and (F) T* = 1.3.

been already attached together as a dimer.61,62 Their simulations focused on the binding of the third strand onto the dimer and the displacement process using the forward flux sampling to calculate the kinetic rate constants.61,62 Differently, our study represents the first investigation to observe the strand displacement mechanism that takes place spontaneously at concentrated conditions in large systems. It should be noted that our model can differentiate the selfassembly behavior of complementary strands seen in Figure 2 from that of the noncomplementary strands, as shown in Figure S1. In the noncomplementary system, dimers and larger aggregates can be formed in each simulation (Figure S1A); however, the population of dimers is significant smaller at ∼40% compared to 100% in any complementary system whereas the populations of trimers and larger aggregates are relatively small. Moreover, dimers are formed throughout each simulation so their lifetime is relatively short (Figure S1B). In contrast, the majority of dimers in complementary systems are formed at the beginning of each simulation so they are longlived. Each strand in most aggregates including dimers in the noncomplementary system can form less than 8 bp interactions (Figure S1C), which is only half of the number of bp interactions observed in the complementary systems. This indicates that completely randomized sequences that are not complementary can still create dimers, which are mostly shortlived, due to the fact that these noncomplementary strands can still form a few native contacts and many non-native contacts (Figure S1D) especially at a relatively high concentration of DNA strands at 1.45 mM. As the DNA concentration decreases, this aggregate population is reduced, as discussed in the section below. 3.2. Effect of Temperature on DNA Self-Assembly. In addition to sequence, the effect of temperature on DNA selfassembly was also investigated by performing multiple simulations over a wide range of temperature from T* = 0.7 to T* = 1.3, which are lower than the melting temperature of T* = 1.4 for both systems. The melting point was determined on the basis of equilibrium data on the fraction of bp interactions by two complementary strands as a function of temperature from replica exchange molecular dynamics

simulations; in this case, the temperature at which this bp interaction fraction reaches 50% is the melting point.39 Using the melting point of T* = 1.4 as a reference, our results are meant for qualitative rather than quantitative comparison with experimental data. The results on the effect of temperature are shown in Figure 4 for C8T8A8G8 at the lowest temperature of T* = 0.7 and the highest temperature of T* = 1.3. At T* = 0.7; as shown in Figure 4A, dimers initially form, and then trimers appear at ln[t* + 1] = 0.5. The rate of formation of the dimers then flattens out, and the monomers and dimers are used to form trimers, tetramers, and larger aggregates. These aggregates maximize in their amounts at ln[t* + 1] = 4 but then dissociate at a slow rate. At the end of the T* = 0.7 simulation, aggregates larger than dimers still exist, unlike those at higher temperatures. For self-assembly simulations conducted at T* = 1.3, as shown in Figure 4B, dimers form very slowly due to the increased kinetic energy in the system. A very small amount of trimers do form, but they constitute less than 5% of the total ssDNA molecules. At the end of the simulation there exist mostly dimers and a few monomers in solution. Aggregate lifetimes and creation times were evaluated for both temperatures in Figure 4C,D. At T* = 0.7, there exists many dimers, trimers, and larger aggregate with a large proportion having lifetimes under t* = 100 reduced units. At T* = 1.3, there are long-lived dimers, but only a few short-lived dimers and trimers. Trimers are created throughout the simulation but they survive for very short lifetimes. Due to the availability of monomers throughout the course of simulation, these free strands can attach themselves to dimers to create trimers. Because of the high kinetic energy, these trimers are quite unstable and thus transient, resulting the formation of dimers throughout the simulation. To evaluate the degree of hybridization by different aggregating species, one can obtain the mean number of bp interactions as a function of the aggregate liftetime, as seen in Figure 4E,F. At low temperatures such as T* = 0.7, tetramers, pentamers, and larger aggregates are able to form almost the maximum number of bp interaction chains and they exist over a wide range of lifetimes as both short-lived and long-lived F

DOI: 10.1021/acs.jpcb.6b03937 J. Phys. Chem. B XXXX, XXX, XXX−XXX

Article

The Journal of Physical Chemistry B

Figure 5. Effect of temperature on C8T8A8G8 self-assembly at [DNA] = 1.45 mM. Percentage of molecules belonging to various aggregate sizes, s (s = 1 for monomers, s = 2 for dimers, s = 3 for trimers, etc.), as a function of time in natural logarithmic scale for (A) T* = 0.9 and (D) T* = 1.1 reduced temperatures. The aggregate creation time and lifetime for temperatures (B) T* = 0.9 and (E) T* = 1.1. The aggregate lifetime and the mean number of bp interactions per chain during the lifetime for temperatures (C) T = 0.9 and (F) T* = 1.1.

Figure 6. Effect of DNA concentration on C8T8A8G8 self-assembly at T* = 0.9. Percentage of molecules belonging to various aggregate sizes, s (s = 1 for monomers, s = 2 for dimers, s = 3 for trimers, etc.), as a function of time in natural logarithmic scale for DNA concentrations at (A) 1.45 mM, (B) 0.36 mM, and (C) 0.09 mM.

aggregates. Many trimers are long-lived and dimers are equally spread over the 10−16 bp/chain range. This is in contrast to the moderate temperature at T* = 1.0 in Figure 2E wherein two main populations of aggregates coexisting as stable, long-lived dimers and unstable, short-lived larger aggregates. This indicates that base pairing for our self-assembly simulations is highly dependent on temperature, as expected from experimental data.10 At high temperatures such as T* = 1.3, the population of dimers is widely spread over the whole lifetime range, indicating that they were continually created over the course of simulation; however, these dimers contain low numbers of bp interactions per chain at ∼10, which is 6 bp/ chain less than the maximum. This indicates that they are not forming enough bp interactions due to the high kinetic energy of the system. Also, we find trimers are generally short-lived at high temperatures, with most forming ∼5 bp interactions per chain. At moderate temperatures of T* = 0.9 and T* = 1.1, as shown in Figure 5 and T* = 1.0, as shown in Figure 2 (top panel), dimer formation reaches 100% at the end of those simulations (Figure 5A,B). This yield is higher than those at the lowest temperature of T* = 0.7 and the highest temperature at T* = 1.3 (for ease of comparison, all data are also shown for every temperature in Figure S2). This indicates that the moderate temperatures from T* = 0.9 to T* = 1.1 are the optimal temperature range for dimer formation. As the temperature increases from T* = 0.9 to T* = 1.1, large

aggregates are created in smaller populations, in earlier stages of simulation, and over shorter lifetime (Figure 5C,D). Even within this range of temperatures, however, the degree of hybridization by dimers actually changes distinctively as a function of temperature (Figure 5E,F). At T* = 0.9, many dimers are fully hybridized after having formed the maximum number base-pair interactions of 16. Many dimers at T* = 1.0 form up to 15 base-pair interactions and many dimers at T* = 1.1 form up to 13 base-pair interactions. In this case, the stability of those dimers decreases as the temperature increases from T* = 0.9 to T* = 1.1. Similar temperature-dependent behavior of DNA selfassembly can be found for RAND32, as shown in Figure S3 for all temperatures from T* = 0.7 to T* = 1.3. However, the formation of large aggregates is more substantial by RAND32 (Figure S3A−E) compared to that by C8T8A8G8 (Figure S2). At T* = 0.07, ∼50% of the strands of the RAND32 system form dimers and the rest of strands form larger aggregates (Figure S3A) compared to just ∼10% strands of the C8T8A8G8 system that form larger aggregate by the end of simulations. Although RAND32 can form 100% dimers over the moderate temperatures T* = 0.9−1.1 (Figure S3B-D), RAND32 forms significantly more large aggregates including tetramers and pentamers than C8T8A8G8 during self-assembly. Moreover, many dimers of the RAND32 system at T* = 1.0−1.1 are partially hybridized, having formed just 3−5 base-pair interactions (Figure S3M-N). Furthermore, the dimers in the G

DOI: 10.1021/acs.jpcb.6b03937 J. Phys. Chem. B XXXX, XXX, XXX−XXX

Article

The Journal of Physical Chemistry B RAND32 system at T* = 1.1−1.3 are unstable, as shown by the ensemble of dimers with different lifetime spans even when they were created at the same time (Figure S3I,J). This applies to many dimers that were created at any given time during selfassembly (Figure S3I,J). In contrast, the dimers in the C8T8A8G8 at T* = 1.1−1.3 are stable, as shown by the relatively clean diagonal line in Figure S2I,J, which indicates that at any given time a dimer is created, that dimer remains stable until the end of each simulation. 3.3. Effect of DNA Concentration on Self-Assembly. We evaluated the effect of concentration on DNA self-assembly through studying three different DNA concentrations: 1.45, 0.36, and 0.09 mM at T* = 0.9 for C8T8A8G8, as seen in Figure 6 and Figure S4. For a DNA concentration of 1.45 mM, dimers start forming by ln[t* + 1] ∼ 0.25, with the formation of trimers and tetramers nearly instantaneous thereafter (Figure 6A). This is caused by free monomers starting to form bp interactions with partially hybridized dimers. This monomeric attachment to the aggregates leads to the formation of larger aggregates such as pentamers and hexamers. These large aggregates dissociate by the end of the simulation and form 100% of dimers. As the concentration is lowered by a factor of 4−0.36 mM, the creation of dimers is delayed compared to the case at higher concentration, as seen in Figure 6B occurring at ln[t* + 1] ∼ 0.5. The formation of trimers starts at ln[t* + 1] ∼ 1, and tetramers start forming at ln[t* + 1] ∼ 2.5. Both trimers and tetramers dissociate by the end of the simulation in favor of dimers. As the concentration is reduced further by another factor of 4 to 0.09 mM, the dimer lag time is increased to ln[t* + 1] ∼ 1.0 (Figure 6C). Trimers and tetramers exist at this concentration; however, they account for less than only 2% of the molecules at most during the simulation (Figure S4A−C). Over the initial period from the beginning until ln[t* + 1] = 2.0, dimer formation reaches ∼60% for the concentration of 1.45 mM, ∼40% for 0.36 mM, and ∼20% for 0.09 mM. This indicates that the formation of dimers as a function of concentration does not follow well-defined relationships in which the dimer formation rate corresponds to the square of the concentration. Otherwise, reducing the concentration by a factor of 4 should decrease the rate of dimer formation by a factor of 16 instead of just a factor of 1.5. This suggests that dimer formation is more complicated than just a simple bimolecular reaction involving multiple pathways, as shown in Figure 3B and discussed in subsection 3.4 below. In Figure S4D−F, we plot the aggregate creation times and lifetimes. All three concentrations form a set of stable dimers initially, as seen as clusters of blue dots representing dimers in the upper left region of each scatterplot. Lag times for dimers are also observed in the same region as the amount of points in that region decreases with decreasing concentration. Also trimers form near the beginning of the simulation but have short lifetimes for all concentrations. It is important to note the decreasing count of trimers and tetramers as the concentration decreases. Dimers are formed throughout the simulation due to the dissociation of the larger aggregates as seen along the y = x line. Base pairing plays an important role for aggregate lifetimes, as observed in Figure S4G−I. For all three concentrations, they form two distinct sets of stable dimers, those that have ∼16 bp interactions per chain and the metastable dimers that have 2 bp interactions per chain, as seen previously at low and moderate temperatures. Moreover, at moderate to high concentrations, trimers have ∼10 bp interactions per chain, therefore having

less than the theoretical maximum of 16 bp interactions per chain chain. In addition, the tetramers at these concentrations form 5−10 bp interactions per chain. These low values of bp interactions per chain result in the aggregates not being longlived and not taking full enthalpic advantage of the available base pairing. It is therefore enthalpically favorable to form full duplexes as these are capable of forming 16 bp interactions per chain. Similar concentration-dependent behavior is found at T* = 1.1 for C8T8A8G8, as seen in Figure S5. It shows that as the DNA concentration decreases, the formation of large aggregates decreases to the point where dimers are created by only hybridization of monomeric strands without through the strand displacement mechanism from metastable trimers (Figure S5A−C). However, there are two major differences on self-assembly between T* = 0.9 and T* = 1.1. On one hand, trimers, tetramers or larger aggregates at T* = 1.1 completely disappear when the concentration is reduced to [DNA] = 0.36 mM (Figure S5D−F), unlike the case at T* = 0.9 where trimers still exist even at [DNA] = 0.09 mM. On the other hand, the maximum number of bp interactions per chain in dimers is 13 at T* = 1.1 compared to 16 at T* = 0.9. 3.4. Kinetic Modeling of dsDNA Self-Assembly at Different Temperatures and Concentrations. Above we have shown that there are three mechanisms for DNA hybridization that can occur during self-assembly: zippering, slithering, and strand displacement. The last is a hybridization mechanism that had not been explored during self-assembly in large systems using molecular dynamics even though it is responsible for the dynamic nature of DNA molecular machines and DNA computing devices. Because there are experimental results on the kinetics of hybridization available for comparison,63,64 we attempted to extract the reaction rate constants from our simulation data by using a kinetic fitting method as described below. To capture all three mechanisms, we consider only the strands that are involved in the formation or dissociation of dimers and/or trimers. Although there are larger aggregates such as tetramers, pentamers, and larger aggregates that could form during self-assembly, inclusion of such large aggregates requires a more complicated kinetic fitting method without any statistic certainty due to the relatively small population of those large aggregates compared to large populations of dimers and trimers. Therefore, the formation of dimers and trimers can be written with the following reactions: k1f

(1)

M+M→D k1b

(2)

D→M+M k 2f

(3)

M+D→T k 2b

(4)

T→M+D

where M is the monomer species, D is dimer, and T is trimer. Assuming a batch operation gives the following set of differential mass balance equations: d[M] = −k1f [M]2 + k1b[D] − k 2f [D][M] + k 2b[T] dt (5)

d[D] = k1f [M]2 − k1b[D] − k 2f [D][M] + k 2b[T] dt H

(6)

DOI: 10.1021/acs.jpcb.6b03937 J. Phys. Chem. B XXXX, XXX, XXX−XXX

Article

The Journal of Physical Chemistry B d[T] = k 2f [D][M] − k 2b[T] dt

molecules that are monomers, dimers, or trimers as a function of time within the error bars that were calculated from filtered data. However, filtering the chains that were involved exclusively in dimer-related and trimer-related reactions from the simulations at T* = 0.7 yields few data, resulting in poor fitting and high uncertainty of the rate constant values; therefore, these rate constants at T* = 0.7 are omitted in this section. The fitted k-values are plotted as ln(k) in a relationship with 1/T* in Figure 7C,D for the first rate equation, k1f and k1b, respectively. The values for ln(k1f) have an increasing slope with regard to 1/T* (Figure 7C), meaning that the forward rate constant decreases with temperature, which was also observed in previous simulation studies.33,45 This trend indeed follows Arrhenius kinetics, therefore having a negative activation energy (EA).65 The k1f values were calculated for the tetrablock system at three different concentrations of DNA (as already discussed above) and it was found that the average activation energy is −50.4 kJ/mol with a relatively small standard deviation of ±4.03 kJ/mol. Our activation energy is similar to values obtained by experimentalists63,64 and another simulation study using a coarse-grained model by Ouldridge et al.;45 their activation energies are in the range ∼−32 to ∼−75 kJ/mol. Additionally, the pre-exponential factor was found to be 1.63 × 10−4 ± 4.15 × 10−7 per % strands per unit time. Unlike the k1f case, the values for ln(k1b) follow nonArrhenius kinetics (Figure 7D). This can be explained by examining the effect of temperature on the backward reaction. Over the moderate temperatures of T* = 0.9−1.1, the values of k1b are extremely small in the range ln(k1b) = −20 to −30, which indicates that the dissociation of dimers into monomers is unfavorable, as dimers are strongly stable and no monomers are left at the end of each simulation. As the temperature is elevated at T* = 1.3, the value of k1b increases significantly due to the instability of dimers at high temperatures, but it is still smaller than its k1f value. This indicates the dissociation of dimers into monomers is allowed but not as favorable as the forward reaction. Therefore, such a binary effect of temperature on the backward reaction results in its non-Arrhenius kinetics. Additional rate constants associated with the strand displacement for C8T8A8G8 are shown in Table 3 at T* = 1.0 and all

(7)

Using this set of differential equations (5)−(7), we implemented a least-squares regression on the simulation data (the amounts of monomers, dimers, and trimers as a function of time) to find the four rate constants (k1f, k1b, k2f, k2b). Specifically, we guessed the values of those four rate constants until the difference between the left side and right side of eqs 5−7 is minimized. This approach is applied to each simulation time point over the whole course of simulations and three DNA concentrations. Algorithmically, we used a set of python libraries (NumPy, SciPy, and Lm-fit) to numerically solve the coupled differential equations and perform kinetic fitting of the rate constants. Lm-fit is an extension of Scipy’s fitting method to put bounds on the rate constants by allowing the solver to use only values of k1f,b > 0 or k2f,b > 0. For each temperature ranging from T* = 0.9 to T* = 1.3 of the C8T8A8G8 system, we considered only the chains that were involved exclusively in dimer-related and trimer-related reactions during the time cource of our simulations, which yields 60% of the original 100 chains per simulation. An example of the unfiltered kinetics is available in Figure 7A for

Table 3. Rate Constants of C8T8A8G8 at T* = 1.0

Figure 7. Kinetic fits for C8T8A8G8 at [DNA] = 1.45 mM. (A) Percentage of molecules belonging to various aggregate sizes, s (s = 1 for monomers, s = 2 for dimers, s = 3 for trimers, etc.), as a function of time in natural logarithmic scale at T* = 1.0. (B) Percentage of filtered molecules belonging to various aggregate sizes, s, as a function of time in natural logarithmic scale. (C) Value of k1f in natural logarithmic scale as a function of inverse temperature. The fitted values for the linearized Arrhenius equation is shown as the red line. (D) Value of k1b in natural logarithmic scale as a function of inverse temperature.

k1f [per % strands per unit time] k1b [per unit time] k2f [per % strands per unit time] k2b [per unit time]

(1.918 (4.561 (5.767 (5.747

± ± ± ±

0.049) 0.117) 0.148) 0.147)

× × × ×

10−03 10−10 10−04 10−1

three DNA concentrations considered in this study. Compared to the rate constant for the formation of dimers, the rate constant for the formation of trimers is lower by 3.3 times. However, once a trimer is formed, it preferentially dissociates into a fully hybridized duplex and a displaced strand with a rate constant that is at least 2 orders of magnitude higher than that of the formation of dimers from single strands. This agrees with our previous analysis on the lifetime amount and number of bp interactions for aggregates of different sizes (in subsection 3.1 and Figure 2E) showing that trimers and larger aggregates are short-lived species compared to dimers as long-lived species with higher number of bp interactions. Such an energetic difference is the driving force for the strand displacement

T* = 1.0, the filtered version after rescaling to 100% is shown as the dark lines in Figure 7B with error bar calculations from 20 simulations. Using our rate equations and the data collected for each temperature, we were able to obtain values for the rate constants. The kinetically fitted values of the percentage of molecules as monomers, dimers or trimers, which were recalculated using the obtained rate constants, are shown as the dashed lines in Figure 7B. The fit is very close to our simulation values and stays within the error bars of the standard deviation calculated from 20 simulations. Figures S6 and S7 are two other examples at T* = 0.9 and T* = 1.1 showing that our kinetic fitting procedure can reproduce the percentage of I

DOI: 10.1021/acs.jpcb.6b03937 J. Phys. Chem. B XXXX, XXX, XXX−XXX

Article

The Journal of Physical Chemistry B mechanism that takes place spontaneously even at isothermal conditions without being aided by any external stimuli.

4. CONCLUSIONS In this work, we use a novel coarse-grained model to study the effects of sequence, temperature, and DNA concentration on the self-assembly process. Both sequences of interest, C8T8A8G8 and RAND32, have the same G−C content but the sequence pattern plays a large role in dictating the kinetic pathways of aggregate formation. Due to the lower propensity of the C8T8A8G8 system to make non-native contacts, this system is more effective in forming less transient aggregates than the RAND32 system. Moreover, we also explored the effect of temperature on self-assembly. At low temperatures, large aggregates form in high populations especially in the RAND32 system; however, at high temperatures these aggregates are nonexistent due to the high kinetic energy of system, thus destabilizing their bp interactions. It was determined that the best isothermal self-assembly temperature for both systems is T* = 1.0. This is due to the minimization of large aggregate formation and the maximization of long-lived fully hybridized dimers. Furthermore, our model was able to capture the concentration effects on DNA self-assembly. It was found at high concentrations, monomers would form dimers and then trimers and larger aggregates during the early stage of simulation (t* < 200). These aggregates then dissociate into more enthalpically favorable dimers. At low concentrations, both systems can minimize the formation of trimers and larger aggregates; in this case, the formation of dsDNA is mainly from hybridization of ssDNA. Finally, we applied kinetic modeling to mathematically capture the effects of temperature and concentration on the self-assembly process. Our system was found to have a negative activation energy similar to those found in experiments63,64 and prior simulation.45 Our coarse-grained model, simulation method, and aggregate analysis described in this work gave us insight into the individual aggregates enabling us to identify their lifetime, creation time, and degree of hybridization. These techniques potentially help us gain a fundamental understanding of the effect of environmental condition and sequence on selfassembly in an effort to further increase assembly yields and performance of many interesting systems. These techniques can be useful for structural DNA nanotechnology wherein the construction of two- and three-dimensional objects of varying sizes and complexity is achieved. Furthermore, they are also suitable in dynamic DNA nanotechnology wherein dynamic structures play an important role as reconfigurable and autonomous devices. In this case, our techniques can capture the nonequilibrium dynamics in addition to the equilibrium end states in producing molecular logic circuits, catalytic amplifiers, autonomous molecular walkers, and reprogrammable DNA nanostructures.





and effect of DNA concentration on hybridization in three sequence systems (PDF) Pathway 2 of Figure 3B (MPG) Pathway 3 of Figure 3B (MPG)

AUTHOR INFORMATION

Corresponding Author

*H. D. Nguyen. E-mail: [email protected]. Phone: 949-824-6589. Author Contributions

The manuscript was written through contributions of all authors. All authors have given approval to the final version of the manuscript. Notes

The authors declare no competing financial interest.



ACKNOWLEDGMENTS The computational studies used the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by the National Science Foundation (OCI1053575). We are also grateful for the funding given by National Science Foundation (CAREER CBET-1554508) and the computational resources provided by the High Performance Computing cluster and GreenPlanet at the University of California, Irvine.



ABBREVIATIONS DMD, discontinuous molecular dynamics; CG, coarse-grained; bp interaction, base-pair interaction; dsDNA, double stranded DNA; ssDNA, single stranded DNA



REFERENCES

(1) Lee, C. S.; Davis, R. W.; Davidson, N. A Physical Study by Electron Microscopy of the Terminally Repetitious, Circularly Permuted DNA From the Coliphage Particles of Escherichia Coli 15. J. Mol. Biol. 1970, 48 (1), 1−22. (2) Meselson, M. S.; Radding, C. M. A General Model for Genetic Recombination. Proc. Natl. Acad. Sci. U. S. A. 1975, 72 (1), 358−361. (3) Seeman, N. C. Nucleic Acid Junctions and Lattices. J. Theor. Biol. 1982, 99 (2), 237−247. (4) Douglas, S. M.; Marblestone, A. H.; Teerapittayanon, S.; Vazquez, A.; Church, G. M.; Shih, W. M. Rapid Prototyping of 3D DNA-Origami Shapes with caDNAno. Nucleic Acids Res. 2009, 37 (15), 5001−5006. (5) Zadeh, J. N.; Wolfe, B. R.; Pierce, N. A. Nucleic Acid Sequence Design via Efficient Ensemble Defect Optimization. J. Comput. Chem. 2011, 32 (3), 439−452. (6) Goodman, R. P.; Schaap, I. A. T.; Tardin, C. F.; Erben, C. M.; Berry, R. M.; Schmidt, C. F.; Turberfield, A. J. Rapid Chiral Assembly of Rigid DNA Building Blocks for Molecular Nanofabrication. Science 2005, 310 (5754), 1661−1665. (7) Ke, Y.; Liu, Y.; Zhang, J.; Yan, H. A Study of DNA Tube Formation Mechanisms Using 4-, 8-, and 12-Helix DNA Nanostructures. J. Am. Chem. Soc. 2006, 128 (13), 4414−4421. (8) Rothemund, P. W. K. Folding DNA to Create Nanoscale Shapes and Patterns. Nature 2006, 440 (7082), 297−302. (9) Zheng, J.; Birktoft, J. J.; Chen, Y.; Wang, T.; Sha, R.; Constantinou, P. E.; Ginell, S. L.; Mao, C.; Seeman, N. C. From Molecular to Macroscopic via the Rational Design of a Self-Assembled 3D DNA Crystal. Nature 2009, 461 (7260), 74−77. (10) Marmur, J.; Doty, P. Thermal Renaturation of Deoxyribonucleic Acids. J. Mol. Biol. 1961, 3 (5), 585−594. (11) Wetmur, J. G.; Davidson, N. Kinetics of Renaturation of DNA. J. Mol. Biol. 1968, 31 (3), 349−370. (12) McCarthy, B. J.; Church, R. B. The Specificity of Molecular Hybridization Reactions. Annu. Rev. Biochem. 1970, 39, 131−150.

ASSOCIATED CONTENT

S Supporting Information *

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.jpcb.6b03937. Figures showing the percentage of molecules belonging to various aggregate sizes, aggregate creation time and lifetime, aggregate lifetime and the mean number of bp interactions per chain, contact map or probability of forming a contact, effect of temperature on hybridization, J

DOI: 10.1021/acs.jpcb.6b03937 J. Phys. Chem. B XXXX, XXX, XXX−XXX

Article

The Journal of Physical Chemistry B

Atomistic Simulation and Single-Molecule Experiment. J. Chem. Theory Comput. 2014, 10 (8), 2891−2896. (35) Uusitalo, J. J.; Ingólfsson, H. I.; Akhshi, P.; Tieleman, D. P.; Marrink, S. J. Martini Coarse-Grained Force Field: Extension to DNA. J. Chem. Theory Comput. 2015, 11, 3932. (36) Li, T. I. N. G.; Sknepnek, R.; Macfarlane, R. J.; Mirkin, C. A.; de la Cruz, M. O. Modeling the Crystallization of Spherical Nucleic Acid Nanoparticle Conjugates with Molecular Dynamics Simulations. Nano Lett. 2012, 12 (5), 2509−2514. (37) Ding, F.; Sharma, S.; Chalasani, P.; Demidov, V. V.; Broude, N. E.; Dokholyan, N. V. Ab Initio RNA Folding by Discrete Molecular Dynamics: From Structure Prediction to Folding Mechanisms. RNA 2008, 14 (6), 1164−1173. (38) Sambriski, E. J.; Schwartz, D. C.; de Pablo, J. J. A Mesoscale Model of DNA and Its Renaturation. Biophys. J. 2009, 96 (5), 1675− 1690. (39) Markegard, C. B.; Fu, I. W.; Reddy, K. A.; Nguyen, H. D. Coarse-Grained Simulation Study of Sequence Effects on DNA Hybridization in a Concentrated Environment. J. Phys. Chem. B 2015, 119 (5), 1823−1834. (40) Fu, I. W.; Markegard, C. B.; Chu, B. K.; Nguyen, H. D. The Role of Electrostatics and Temperature on Morphological Transitions of Hydrogel Nanostructures Self-Assembled by Peptide Amphiphiles via Molecular Dynamics Simulations. Adv. Healthcare Mater. 2013, 2 (10), 1388−1400. (41) Fu, I. W.; Markegard, C. B.; Chu, B. K.; Nguyen, H. D. Role of Hydrophobicity on Self-Assembly by Peptide Amphiphiles via Molecular Dynamics Simulations. Langmuir 2014, 30 (26), 7745− 7754. (42) Fu, I. W.; Markegard, C. B.; Nguyen, H. D. Solvent Effects on Kinetic Mechanisms of Self-Assembly by Peptide Amphiphiles via Molecular Dynamics Simulations. Langmuir 2015, 31 (1), 315−324. (43) Fu, I. W.; Nguyen, H. D. Sequence-Dependent Structural Stability of Self-Assembled Cylindrical Nanofibers by Peptide Amphiphiles. Biomacromolecules 2015, 16 (7), 2209−2219. (44) Smith, S.; Hall, C.; Freeman, B. Molecular Dynamics for Polymeric Fluids Using Discontinuous Potentials. J. Comput. Phys. 1997, 134 (1), 16−30. (45) Ouldridge, T. E.; Sulc, P.; Romano, F.; Doye, J. P. K.; Louis, A. A. DNA Hybridization Kinetics: Zippering, Internal Displacement and Sequence Dependence. Nucleic Acids Res. 2013, 41 (19), 8886−8895. (46) Hinckley, D. M.; Freeman, G. S.; Whitmer, J. K.; de Pablo, J. J. An Experimentally-Informed Coarse-Grained 3-Site-Per-Nucleotide Model of DNA: Structure, Thermodynamics, and Dynamics of Hybridization. J. Chem. Phys. 2013, 139 (14), 144903. (47) Alder, B. J.; Wainwright, T. E. Studies in Molecular Dynamics. I. General Method. J. Chem. Phys. 1959, 31 (2), 459−466. (48) Andersen, H. Molecular Dynamics Simulations at Constant Pressure and/or Temperature. J. Chem. Phys. 1980, 72 (4), 2384− 2393. (49) Nguyen, H. D.; Hall, C. K. Molecular Dynamics Simulations of Spontaneous Fibril Formation by Random-Coil Peptides. Proc. Natl. Acad. Sci. U. S. A. 2004, 101 (46), 16180−16185. (50) Fink, A. L. Protein Aggregation: Folding Aggregates, Inclusion Bodies and Amyloid. Folding Des. 1998, 3 (1), R9−R23. (51) Jahn, T. R.; Radford, S. E. The Yin and Yang of Protein Folding. FEBS J. 2005, 272 (23), 5962−5970. (52) Bourdenx, M.; Koulakiotis, N. S.; Sanoudou, D.; Bezard, E.; Dehay, B.; Tsarbopoulos, A. Protein Aggregation and Neurodegeneration in Prototypical Neurodegenerative Diseases: Examples of Amyloidopathies, Tauopathies and Synucleinopathies. Prog. Neurobiol. 2015, DOI: 10.1016/j.pneurobio.2015.07.003. (53) Shin, J.-S.; Pierce, N. A. A Synthetic DNA Walker for Molecular Transport. J. Am. Chem. Soc. 2004, 126 (35), 10834−10835. (54) Bath, J.; Turberfield, A. DNA Nanomachines. Nat. Nanotechnol. 2007, 2 (5), 275−284. (55) Ouldridge, T. E.; Hoare, R. L.; Louis, A. A.; Doye, J. P. K.; Bath, J.; Turberfield, A. J. Optimizing DNA Nanotechnology Through

(13) Zhang, D. Y.; Seelig, G. Dynamic DNA Nanotechnology Using Strand-Displacement Reactions. Nat. Chem. 2011, 3 (2), 103−113. (14) Aksimentiev, A.; Heng, J. B.; Timp, G.; Schulten, K. Microscopic Kinetics of DNA Translocation Through Synthetic Nanopores. Biophys. J. 2004, 87 (3), 2086−2097. (15) Noy, A.; Pérez, A.; Laughton, C. A.; Orozco, M. Theoretical Study of Large Conformational Transitions in DNA: the B a Conformational Change in Water and Ethanol/Water. Nucleic Acids Res. 2007, 35 (10), 3330−3338. (16) Kannan, S.; Zacharias, M. Simulation of DNA Double-Strand Dissociation and Formation During Replica-Exchange Molecular Dynamics Simulations. Phys. Chem. Chem. Phys. 2009, 11 (45), 10589−10595. (17) Lillian, T. D.; Taranova, M.; Wereszczynski, J.; Andricioaei, I.; Perkins, N. C. A Multiscale Dynamic Model of DNA Supercoil Relaxation by Topoisomerase IB. Biophys. J. 2011, 100 (8), 2016− 2023. (18) Yoo, J.; Aksimentiev, A. In Situ Structure and Dynamics of DNA Origami Determined Through Molecular Dynamics Simulations. Proc. Natl. Acad. Sci. U. S. A. 2013, 110 (50), 20099−20104. (19) Knotts, T. A.; Rathore, N.; Schwartz, D. C.; de Pablo, J. J. A Coarse Grain Model for DNA. J. Chem. Phys. 2007, 126 (8), 084901. (20) Ouldridge, T. E.; Johnston, I. G.; Louis, A. A.; Doye, J. P. K. The Self-Assembly of DNA Holliday Junctions Studied with a Minimal Model. J. Chem. Phys. 2009, 130 (6), 065101. (21) Maciejczyk, M.; Spasic, A.; Liwo, A.; Scheraga, H. A. CoarseGrained Model of Nucleic Acid Bases. J. Comput. Chem. 2010, 31 (8), 1644−1655. (22) Gopal, S. M.; Mukherjee, S.; Cheng, Y.-M.; Feig, M. PRIMO/ PRIMONA: a Coarse-Grained Model for Proteins and Nucleic Acids That Preserves Near-Atomistic Accuracy. Proteins: Struct., Funct., Genet. 2010, 78 (5), 1266−1281. (23) Dans, P. D.; Zeida, A.; Machado, M. R.; Pantano, S. A Coarse Grained Model for Atomic-Detailed DNA Simulations with Explicit Electrostatics. J. Chem. Theory Comput. 2010, 6 (5), 1711−1725. (24) Savelyev, A.; Papoian, G. A. Chemically Accurate Coarse Graining of Double-Stranded DNA. Proc. Natl. Acad. Sci. U. S. A. 2010, 107 (47), 20340−20345. (25) Xia, Z.; Gardner, D. P.; Gutell, R. R.; Ren, P. Coarse-Grained Model for Simulation of RNA Three-Dimensional Structures. J. Phys. Chem. B 2010, 114 (42), 13497−13506. (26) Morriss-Andrews, A.; Rottler, J.; Plotkin, S. S. A Systematically Coarse-Grained Model for DNA and Its Predictions for Persistence Length, Stacking, Twist, and Chirality. J. Chem. Phys. 2010, 132 (3), 035105. (27) DeMille, R. C.; Cheatham, T. E., III; Molinero, V. A CoarseGrained Model of DNA with Explicit Solvation by Water and Ions. J. Phys. Chem. B 2011, 115 (1), 132−142. (28) Biyun, S.; Cho, S. S.; Thirumalai, D. Folding of Human Telomerase RNA Pseudoknot Using Ion-Jump and TemperatureQuench Simulations. J. Am. Chem. Soc. 2011, 133 (50), 20634−20643. (29) Ouldridge, T. E.; Louis, A. A.; Doye, J. P. K. Structural, Mechanical, and Thermodynamic Properties of a Coarse-Grained DNA Model. J. Chem. Phys. 2011, 134 (8), 085101. (30) Hsu, C. W.; Fyta, M.; Lakatos, G.; Melchionna, S.; Kaxiras, E. Ab Initio Determination of Coarse-Grained Interactions in DoubleStranded DNA. J. Chem. Phys. 2012, 137 (10), 105102. (31) Cragnolini, T.; Derreumaux, P.; Pasquali, S. Coarse-Grained Simulations of RNA and DNA Duplexes. J. Phys. Chem. B 2013, 117 (27), 8047−8060. (32) He, Y.; Maciejczyk, M.; Ołdziej, S.; Scheraga, H.; Liwo, A. Mean-Field Interactions Between Nucleic-Acid-Base Dipoles Can Drive the Formation of a Double Helix. Phys. Rev. Lett. 2013, 110 (9), 098101. (33) Hinckley, D. M.; Lequieu, J. P.; de Pablo, J. J. Coarse-Grained Modeling of DNA Oligomer Hybridization: Length, Sequence, and Salt Effects. J. Chem. Phys. 2014, 141 (3), 035102. (34) Maffeo, C.; Ngo, T. T. M.; Ha, T.; Aksimentiev, A. A CoarseGrained Model of Unstructured Single-Stranded DNA Derived From K

DOI: 10.1021/acs.jpcb.6b03937 J. Phys. Chem. B XXXX, XXX, XXX−XXX

Article

The Journal of Physical Chemistry B Coarse-Grained Modeling: a Two-Footed DNA Walker. ACS Nano 2013, 7 (3), 2479−2490. (56) Benenson, Y.; Gil, B.; Ben-Dor, U.; Adar, R.; Shapiro, E. An Autonomous Molecular Computer for Logical Control of Gene Expression. Nature 2004, 429 (6990), 423−429. (57) Lederman, H.; Macdonald, J.; Stefanovic, D.; Stojanovic, M. N. Deoxyribozyme-Based Three-Input Logic Gates and Construction of a Molecular Full Adder. Biochemistry 2006, 45 (4), 1194−1199. (58) Qian, L.; Winfree, E. A Simple DNA Gate Motif for Synthesizing Large-Scale Circuits. J. R. Soc., Interface 2011, 8 (62), 1281−1297. (59) Douglas, S. M.; Bachelet, I.; Church, G. M. A Logic-Gated Nanorobot for Targeted Transport of Molecular Payloads. Science 2012, 335 (6070), 831−834. (60) Qian, L.; Winfree, E. Scaling Up Digital Circuit Computation with DNA Strand Displacement Cascades. Science 2011, 332 (6034), 1196−1201. (61) Srinivas, N.; Ouldridge, T. E.; Sulc, P.; Schaeffer, J. M.; Yurke, B.; Louis, A. A.; Doye, J. P. K.; Winfree, E. On the Biophysics and Kinetics of Toehold-Mediated DNA Strand Displacement. Nucleic Acids Res. 2013, 41 (22), 10641−10658. (62) Machinek, R. R. F.; Ouldridge, T. E.; Haley, N. E. C.; Bath, J.; Turberfield, A. J. Programmable Energy Landscapes for Kinetic Control of DNA Strand Displacement. Nat. Commun. 2014, 5, 5324. (63) Craig, M. E.; Crothers, D. M.; Doty, P. Relaxation Kinetics of Dimer Formation by Self Complementary Oligonucleotides. J. Mol. Biol. 1971, 62 (2), 383−401. (64) Pörschke, D.; Eigen, M. Co-Operative Non-Enzymatic Base Recognition III. Kinetics of the HelixCoil Transition of the Oligoribouridylic · Oligoriboadenylic Acid System and of Oligoriboadenylic Acid Alone at Acidic pH. J. Mol. Biol. 1971, 62 (2), 361− 381. (65) Revell, L. E.; Williamson, B. E. Why Are Some Reactions Slower at Higher Temperatures? J. Chem. Educ. 2013, 90 (8), 1024−1027.

L

DOI: 10.1021/acs.jpcb.6b03937 J. Phys. Chem. B XXXX, XXX, XXX−XXX