Assessment of the Sampling Performance of Multiple-Copy Dynamics

Sep 6, 2016 - The goal of the present study was to ascertain the differential performance of a long molecular dynamics trajectory versus several short...
0 downloads 0 Views 4MB Size
Article pubs.acs.org/jcim

Assessment of the Sampling Performance of Multiple-Copy Dynamics versus a Unique Trajectory Juan J. Perez,† M. Santos Tomas,‡ and Jaime Rubio-Martinez*,§ †

Department of Chemical Engineering, Universitat Politecnica de Catalunya, Av. Diagonal 647, E-08028 Barcelona, Spain Department of Architecture Technology, Universitat Politecnica de Catalunya, Av. Diagonal 649, E-08028 Barcelona, Spain § Department of Physical Chemistry, University of Barcelona and the Institut de Recerca en Quimica Teorica i Computacional (IQTCUB), Marti i Franques 1, E-08028 Barcelona, Spain ‡

S Supporting Information *

ABSTRACT: The goal of the present study was to ascertain the differential performance of a long molecular dynamics trajectory versus several shorter ones starting from different points in the phase space and covering the same sampling time. For this purpose, we selected the 16-mer peptide Bak16BH3 as a model for study and carried out several samplings in explicit solvent. These samplings included an 8 μs trajectory (sampling S1); two 4 μs trajectories (sampling S2); four 2 μs trajectories (sampling S3); eight 1 μs trajectories (sampling S4); 16 0.5 μs trajectories (sampling S5), and 80 0.1 μs trajectories (sampling S6). Moreover, the 8 μs trajectory was further extended to 16 μs to have reference values of the diverse properties measured. The diverse samplings were compared qualitatively and quantitatively. Among the former, we carried out a comparison of the conformational profiles of the peptide using cluster analysis. Moreover, we also gained insight into the interchange among these structures along the sampling process. Among the latter, we computed the number of new conformational patterns sampled with time using strings defined from the conformations attained by each of the residues in the peptide. We also compared the locations and depths of the obtained minima on the free energy surface using principal component analysis. Finally, we also compared the helical profiles per residue at the end of the sampling process. The results suggest that a few short molecular dynamics trajectories may provide better sampling than one unique trajectory. Moreover, this procedure can also be advantageous to avoid getting trapped in a local minimum. However, caution should be exercised since short trajectories need to be long enough to overcome local barriers surrounding the starting point and the required sampling time depends on the number of degrees of freedom of the system under study. An effective way to gain insight into the minimum MD trajectory length is to monitor the convergence of different structural features, as shown in the present work.



energy conformations.6−8 Among other techniques,9−12 molecular dynamics (MD) has been demonstrated to be an adequate procedure to sample the conformational space of peptides.13 However, conformational transitions in molecular systems are often slow in comparison with the time scales accessible to simulation, so information about relevant conformational states may be poorly represented in many cases. Several techniques have been proposed in the literature to enhance sampling efficiency, ranging from multiple-time-step methods14 to reduction of the number of degrees of freedom15 and modifications of the potential energy surface as in umbrella sampling,16 accelerated dynamics,17 or metadynamics.18 On the other hand, computer architecture has evolved towards massive parallelization, so multiple-copy dynamics methods have also been developed to explore the conformational space. The most obvious procedure is to run diverse trajectories in parallel to explore different areas of the conformational space. This in

INTRODUCTION In contrast to most proteins, short peptides do not normally exhibit a native structure in solution. However, the conformational profile of a peptide cannot be described as a random coil. This corresponds to a model where each residue explores the entire region of the sterically allowed Ramachandran plot irrespective of the behavior and the steric properties of their nearest neighbors.1 Indeed, compelling spectroscopic evidence shows that residues in the peptide chain are more structurally ordered than the random coil model predicts2 and that unfolded peptides exhibit local order.3 This evidence supports the picture that the structure of a peptide in solution is the average of an ensemble of states that can individually be characterized by a combined effort of spectroscopic and computational methods.4 One of the challenges for characterizing the ensemble of states of a peptide in solution at room temperature using computational methods regards the convergence of the sampling process.5 Sampling of the energy surface of a peptide is a difficult task because of its rugged nature with many low© XXXX American Chemical Society

Received: June 14, 2016

A

DOI: 10.1021/acs.jcim.6b00347 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Information and Modeling

CPU and GPU versions and the ff14SB force field.33 The peptide in its zwitterionic form was soaked in a cubic box of TIP3P waters34 in such a way that the shortest distance between the peptide and the boundaries of the box was larger than 15 Å. Water molecules closer than 2.2 Å to any of the peptide atoms were removed. Next, the system was neutralized with counterions following a grid-shaped procedure for mapping the electrostatic potential surface. The structure was energy-minimized with 1000 iterations of the steepest-descent method, but the positions of the main atoms were fixed by imposing harmonic restraints. The minimized structure was heated to 300 K at a constant rate of 30 K/10 ps while the harmonic restraints were maintained for the main atoms. Once the system was heated, two steps of 20 ps at constant pressure with the same restraints were calculated to increase the density of the system. Temperature coupling constants of 1.0 and 2.0 ps were applied during the first and second steps, respectively. Then, 5 ns of molecular dynamics was computed in the NVT ensemble at a temperature of 300 K, also with the same restraints. Thus, to this point the systems were forced to retain their initial helical structure because only side chains were allowed to move. Next, the restrictions on the main atoms were gradually removed in seven steps of 200 ps length, each one to allow the whole system to move freely. Long-range electrostatic interactions were computed using the particle-mesh Ewald summation method,35 and the nonbonded cutoff distance was set at 10 Å. The SHAKE algorithm36 was used to constrain bonds involving hydrogen atoms, permitting the use of an integration time of 2 fs. All of the above calculations were performed at constant temperature by coupling the systems to a thermal bath using Berendsen’s algorithm37 with a time coupling constant of 0.2 ps. Finally, different production molecular dynamic simulations were done under the canonical ensemble using the Langevin thermostat with a collision frequency of 2 ps−1 for temperature control. The diverse analyses described below were carried out using one snapshot every other picosecond, corresponding to 500 snapshots/ns, i.e. 8 × 106 snapshots for the 16 μs MD trajectory. However, no changes were observed for any of the studied properties when using one snapshot every 20 ps.

practice can be done by starting from the same structure and changing the initial velocities of the atoms.19 Another possibility is to couple the different trajectories exchanging information. This category includes methods that produce a canonical sampling, such as the ensemble dynamics method,20 or a noncanonical one, such as the replica-exchange molecular dynamics method.21 The use of uncoupled multiple copies of short MD trajectories in contrast to computing a long one is a common practice of modeling practitioners as a guarantee of better sampling.22−25 However, only a few studies have been devoted to analysis of their differential performance. In a pioneering study devoted to the dynamic behavior of crambin, Caves at al.19 compared the performance of 10 trajectories of 120 ps with a trajectory in the nanosecond range. The authors concluded that multiple short trajectories starting at different points of the phase space achieve more efficient sampling than a single long trajectory. A few years later, an analysis of the theoretical speedup produced when a set of trajectories is used to sample the conformational space of a peptide suggested a net improvement in regard a unique long MD trajectory.20 More recently, Genheden and Ryde26 found that MM/GBSA calculations are more accurate when sampling is carried out using several short trajectories compared with a long MD simulation, although caution should be taken in assessing the minimal length of the multiple trajectories to get meaningful results, as has been previously suggested.27 The present work describes the results of a study designed to understand the differential performance exhibited by a long MD trajectory compared with a set of shorter uncoupled MD simulations in sampling the conformational space of a 16residue peptide. We selected for the present study the peptide with sequence GQVGRQLAIIGDDINR (Bak16BH3), which corresponds to the 16-residue BH3 segment of Bak and acts as inhibitor of the pro-apoptotic protein Bcl-xL.28,29 The study involved the analysis of an 8 μs MD trajectory and its comparison with the results obtained using diverse sets of shorter uncoupled trajectories starting from different points of the phase space. In the present work, we investigated the case where all of the trajectories used the same starting structure but different distributions of initial velocities. Specifically, we selected as the starting structure the α-helix conformation the peptide attains when bound to Bcl-xL.30 The sets of uncoupled shorter MD trajectories studied in the present work included two 4 μs trajectories, four 2 μs trajectories, eight 1 μs trajectories, 16 0.5 μs trajectories, and 80 0.1 μs trajectories. Apart from the structure of the peptide bound to Bcl-xL,30 the only structural information available for the peptide in solution is a circular dichroism spectrum in 30% trifluoroethanol, where the peptide exhibits 22% helix content.31 Because of the lack of more detailed structural information and in order to have a reference to discuss the performance of diverse samplings, the 8 μs trajectory was extended to 16 μs with the same conditions.



RESULTS AND DISCUSSION As mentioned above, the purpose of this work was to carry out a comparative analysis of the performances of a long MD trajectory and several sets of shorter trajectories that cover the same sampling time with respect to the exploration of the conformational space of a peptide. For this purpose, we computed an 8 μs MD trajectory of the hexadecapeptide Bak16BH3 in its zwitterionic form in water (sampling S1) and compared the results with those for diverse sets of smaller trajectories computed with the same conditions and starting structure but with different initial velocities. Thus, in the phase space representation, all trajectories began with the same coordinates (x)⃗ but with different momenta (p⃗). The sampling trajectories computed in this work included two 4 μs trajectories (sampling S2), four 2 μs trajectories (sampling S3), eight 1 μs trajectories (sampling S4), 16 0.5 μs trajectories (sampling S5), and 80 0.1 μs trajectories (sampling S6). The results of this comparison are reported below. Moreover, in order to have a reference to understand the performance of the diverse samplings, the 8 μs trajectory was extended to 16 μs with the same conditions.



METHODS The starting structure of Bak16BH3 used for the different MD simulations was retrieved from the Bak16BH3/Bcl-xL complex solved by NMR spectroscopy30 (entry 1BXL of the Protein Data Bank), where the Bak16BH3 peptide is bound to the hydrophobic groove of the Bcl-xL protein exhibiting a welldefined helical structure. All of the calculations were done with the Amber 14 software32 using the PMEMD program in its B

DOI: 10.1021/acs.jcim.6b00347 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Information and Modeling

Figure 1. Ribbon representations of the most characteristic structures sampled by the Bak16BH3 peptide in the present study: (a) representative structure of cluster #1 (see the text) exhibiting a helix−turn−helix structure; (b) representative structure of cluster #2 exhibiting a helical segment at the N-terminus; (c) representative structure of cluster #3 exhibiting a folded structure with a few turns; (d) representative structure of cluster #4 exhibiting a helical structure.

Analysis of the 16 μs MD Trajectory. In order to obtain a qualitative understanding of the conformational domains attained by the peptide along the trajectory, we classified the snapshots sampled using cluster analysis. Specifically, the rootmean-square deviation (rmsd) of the backbone α-carbons was used as a similarity measure between two structures, and then the structures were grouped using a hierarchical clustering algorithm. In order to get a holistic view of the conformational domains attained by the peptide, we inspected the results of the clustering process at three different cutoff distances that resulted in 30, 15, and five clusters. Analysis of the clusters obtained suggests that independently of the number of clusters considered, the structures can be roughly grouped into four categories: helix−turn−helix, covering the largest number of snapshots (cluster #1); structures with a helical segment at the N-terminus (cluster #2); folded structures with a few turns (cluster #3); and helical structures (cluster #4). Figure 1 shows representative structures of these four groups. A more quantitative evaluation of the sampling performance can be obtained through the analysis of the number of unique conformational patterns visited along the sampling process. Conformational patterns are defined by a string of letters (one for each residue of the peptide chain in consecutive order) coded according to the values of the backbone φ and ψ angles. In the present work, we used the Define Secondary Structure of Proteins (DSSP) method38 as implemented in the Amber 14 program.32 Analysis of the 16 μs MD trajectory permitted the identification of 153 055 unique conformational patterns. Figure 2 shows pictorially the evolution of the number of unique patterns sampled during the sampling process. As can be seen, there is a clear linear increase in the number of new patterns, although after 4 μs the slope becomes smaller. More

Figure 2. Time evolution of the cumulative number of new patterns sampled during the 16 μs MD trajectory.

specifically, the number of unique patterns increases linearly during the first 3 μs; from this point the number of new patterns sampled is scarce for almost 1 μs, followed by a period of a linear increase of new patterns, although with a smaller slope than found previously. A new plateau is found at 6.5 μs, where the peptide remains for almost 1 μs. Then for 0.5 μs there is an important increase in new patterns to go back to a linear increase of similar slope as found before. Later, close to 13 μs, there is a new plateau that lasts almost 2 μs to finish with a new period of linear increase. Plateaus can be associated with periods where the peptide remains in a basin or goes back and forth between regions already sampled. It is important to note that structures found in the 3 μs plateau are of the helix−turn− C

DOI: 10.1021/acs.jcim.6b00347 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Information and Modeling

Figure 3. (a) Projection of the free energy surface of the Bak16BH3 peptide obtained from the 16 μs MD trajectory onto the first two principal components. (b−f) Projections of structures belonging to (b) cluster #1, (c) cluster #2, (d) cluster #3, (e) cluster #4, and (f) cluster #5 onto the plane of the first two principal components.

Structures grouped in cluster #3 cover this low-energy basin, as can be seen in Figure 3d. Its representative structure is found at coordinates (3.2, −1.1) and can be described as partially folded with a few turns, as shown pictorially in Figure 1c. The next low-energy minimum in increasing order is at 1.0 kcal/mol and lies at approximate coordinates (5.0, −7.0). Structures grouped in cluster #2 cover this low-energy basin, as can be seen in Figure 3c. Its representative structure is found at coordinates (2.5, −7.0) and can be described as helical at the N-terminus, as shown in Figure 1b. The next low-energy minimum appears at 1.1 kcal/mol, and it is located at approximate coordinates (−6.0, −5.0). Structures grouped in cluster #5 cover this lowenergy basin as can be seen in Figure 3f. Its representative structure is found at coordinates (−6.7, −5.1) and can be described as a distorted helix−turn−helix. The next low-energy minimum appears at 1.2 kcal/mol, and it is located at approximate coordinates (−6.0, 10.0). Structures grouped in cluster #4 cover this low-energy basin, as can be seen in Figure 3e. Its representative structure is precisely found at coordinates (−4.1, 10.7) and can be described as a helix, as shown pictorially in Figure 1d. The time evolution of the residue secondary structure profile of the peptide was also monitored and is shown in Figure 4. Inspection of the plot suggests that after the first nanoseconds the peptide departs from the helical structure with a significant loss of helicity at the N-terminus, although the peptide again attains this conformation after 10 μs and at the end of the simulation. On the other hand, after 3 μs the peptide adopts a helix−turn−helix structure for almost 1 μs and appears there again at 7, 10, and 14 μs. The peptide also adopts other structures, including a folded conformation with turns at 5 and

helix type, suggesting that this is the time required to jump into this basin. Principal component analysis was also carried out to understand the features of the free energy surface (FES) produced by the sampling process and, more interestingly, its evolution with time. For this purpose, we superimposed all of the snapshots to a structure of reference and computed the covariance matrix of backbone displacements with regard to the mean structure. The matrix was subsequently diagonalized to obtain the principal components together with their eigenvalues, with the two first components (PC1 and PC2) recovering 39.7% of the total variance. In a subsequent step, the coordinates of all of the structures were projected onto the plane defined by the two principal components with the highest eigenvalues (PC1 and PC2), and the result is shown pictorially in Figure 3a. The map represents an approximation of the FES of an equilibrated ensemble. Dark regions correspond to the most frequently sampled regions and represent minima on the FES. In order to understand the structural features of the lowenergy conformations, we compared the coordinates of the minima with those of the representative members of the clusters using the 30-cluster partition calculation projected onto the (PC1, PC2) map. Inspection of the FES suggests a lowenergy basin with two minima at approximate coordinates of (12.0, 4.0) and (11.0, 1.0). This basin coincides well with the area where the snapshots of cluster #1 are located as shown in Figure 3b. The representative member of cluster #1 is located very close to those minima at point (11.0, 0.5), and these structures can be described as a helix−turn−helix, as shown in Figure 1a. The next low-energy minimum in increasing order is at 0.9 kcal/mol and lies at approximate coordinates (3.0, −1.0). D

DOI: 10.1021/acs.jcim.6b00347 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Information and Modeling

analysis of the MD trajectory snapshots shows that the peptide attains in addition to the helix−turn−helix three other types of structure: folded with a few turns, helical at the N-terminus segment, and helical. Interestingly, the helix−turn−helix is the most populated cluster. PCA further supports this result. Although the preferred conformation of the peptide in solution is a helix−turn−helix, the conformation it adopts when bound to Bcl-xL is also characterized as a minimum-energy structure about 1.3 kcal/mol above the global minimum. In simplistic terms, the emerging picture of the peptide in solution is a dynamic equilibrium between the helix and helix−turn− helix conformations, being the latter much more populated. Moreover, partially helical conformations as well as those described as folded with a few turns, could be considered intermediate states. In fact, there are several indications that it takes the peptide more than 3 μs to begin to explore thoroughly the helix−turn−helix basin in the reference MD trajectory. Comparative Analysis of the Diverse 8 μs Samplings. We now compare the six independent 8 μs samplings carried out in this work to gain insight into their differential performance in regard to the properties described above. The 16 μs MD trajectory is used as the reference. Cluster Analysis. Despite the fact that cluster analysis provides only a qualitative picture, it is sometimes enough to gain insight into the structural propensities of a peptide. Roughly speaking, the diverse 8 μs independent MD trajectories show groups of structures similar to those described for the reference MD trajectory. However, the populations of the diverse groups of structures are not the same. The diverse structures attained by the peptide (rows) and the corresponding cluster numbers for each of the samplings (columns) are summarized in Table 1. In order to interpret these results it should be borne in mind that clusters are ordered in decreasing population so that cluster #1 is the most populated whereas the last one is the least populated. As can be seen, the helix−turn−helix is the most populated cluster in samplings S1, S2, and S5, whereas it appears in second place in samplings S4 and S6 and in the third place in sampling S3. On the other hand, helical structures appear as the most populated cluster in sampling S6 and in second place for samplings S3 and S5. The group classified as folded with a few turns appears as the most populated cluster in samplings S3 and S4 and the second most populated cluster for samplings S1 and S2. Finally, partially helical structures appear as the less populated clusters. It can be concluded that the diverse samplings give similar information in regard to the possible structures attained by the peptide but fail to provide quantitative information about the most populated structure. Conformational Patterns. The time evolution of new patterns visited in the diverse cumulative samplings is linear, as observed previously for the reference MD trajectory. Figure 6 shows pictorially the number of new patterns sampled in the diverse 8 μs samplings. As can be seen, except for sampling S3, the rest show about 20% higher number of patterns. Comparison of the number of common patterns among the diverse samplings range from 45% between samplings S2 and S3 to 72% between samplings S1 and S3. Interestingly, the number of common patterns is higher among the short trajectories used for the cumulative sampling, ranging from 75% between the 4 μs trajectories to 95% between the 0.1 μs trajectories, except in the case of sampling S3, where the percentage of common patterns is only about 60% (see Table

Figure 4. Time evolution of the secondary structure motifs of the Bak16BH3 peptide per residue along the 16 μs MD trajectory. Secondary motifs are color-coded: red for a bend conformation, brown for a turn, dark green for a π-helix, green for an α-helix, bluish green for a 3-10 helix, light blue for an antiparallel β-sheet, gray for a parallel β-sheet, and dark blue for no structure.

13 μs and helical at the N-terminus at 12 μs. Complementary information can be obtained from an inspection of the cumulative secondary structure histograms obtained in the sampling process. As shown in Figure S1 in the Supporting Information, after 2 μs the peptide exhibits a high tendency to form a helical structure at the C-terminus. Two microseconds later, residues at the N-terminus exhibit an increased helical structure, whereas the residues in the central segment remain in a bend conformation as a result of the helix−turn−helix adopted by the peptide after 3 μs. From this point the cumulative secondary diagrams show similar profiles. Thus, we can say that this property is converged at least for this 16 μs length trajectory. These results suggest that the peptide exhibits a clear tendency to adopt helical structures. Indeed, the ensemble average of the helical structure propensity, shown in Figure 5, indicates that both the N-terminal and C-terminal segments exhibit a high tendency to adopt a helical structure whereas residues in the middle show a reduced tendency, suggesting that the helicity of the ensemble is a helix−turn−helix. This agrees well with the cluster analysis and PCA results. Cluster

Figure 5. Histogram representing the cumulative secondary structure per residue obtained during the 16 μs MD trajectory. Individual bars show the percentages of diverse secondary motifs attained by specific residues during the simulation. Green represents a bend, yellow a turn, dark blue a π-helix, light blue an α-helix, red a 3-10 helix, purple an antiparallel β-sheet, and orange a parallel β-sheet. E

DOI: 10.1021/acs.jcim.6b00347 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Information and Modeling Table 1. Cluster Numbers Representing the Different Types of Structures in the Diverse 8 μs Samplingsa helix−turn−helix N-terminal helical folded w/ turns helical

reference (16 μs)

S1 (1 × 8 μs)

S2 (2 × 4 μs)

S3 (4 × 2 μs)

S4 (8 × 1 μs)

S5 (16 × 0.5 μs)

S6 (80 × 0.1 μs)

1 2 3 4

1 7 2 8

1 5 2 3

3 11 1 2

2 5 1 4

1 6 5 2

2 7 8 1

a

For every sampling, cluster numbers are defined in consecutive order. Clusters are constructed in such a way that they are ordered according to decreasing population, so cluster #1 is the most populated whereas the last one is the least populated.

different starting momenta, the phase space sampled will appear as overlapping circles that in the limit will cover a rectangular section of phase space with a width equal to 2 times the radius of the short trajectory (Figure 7b). Obviously, this region of phase space will be different from the one obtained if the 16 μs MD is split into segments of 0.1 μs, since each segment will have different starting coordinates and different starting momenta, and consequently, it cannot be used to compare the amount of sampling. According to the geometrical analysis described above, in order to sample the accessible space using multiple trajectories it is necessary that the trajectories have a minimum threshold radius equal to half the length of the accessible momenta, as shown in Figure 7c. If the space is sampled with MD simulations longer than the threshold time, the space will be more thoroughly explored (red circle in Figure 7c), but under the assumption that the trajectory path follows a continuous line, it will bounce off the boundary of the accessible phase space line, and consequently, the sampling may not be as efficient as that carried out using multiple trajectories with a threshold radius. This model suggests that it is important to determine the threshold radius to do an efficient sampling, and this depends on the number of degrees of freedom of the system under study. The Free Energy Surface. For comparison purposes, all of the maps are referred to the same set of vectors, which are defined taking into account all of the information on the diverse trajectories computed in the present work. The FES obtained in sampling S1 is shown in Figure 8a. It exhibits a low-energy basin with two minima at approximate coordinates of (12.0, 4.0) and (9.0, 0.0). This basin coincides well with the area where the snapshots of cluster #1 are located. Indeed, the representative member of cluster #1 is located at point (10.8, 2.5) and can be described as a helix−turn−helix. The next lowenergy minimum in increasing order is at 1.1 kcal/mol and lies at approximate coordinates (3.0, −1.0). Structures grouped in cluster #2 cover this low-energy basin. Its representative structure is precisely found at the same coordinates and can be described as partially folded with a few turns. The next lowenergy minimum appears at 1.2 kcal/mol, and it is located at approximate coordinates (6.0, −2.0). Structures of cluster #7 are spread around this basin in the PC1−PC2 plane and can be described as helical at the N-terminus. Its representative structure is found at coordinates (6.9, −1.0). The next lowenergy minimum in increasing order is located at approximate coordinates (2.0, 10.0) and is 1.4 kcal/mol above the lowestenergy minimum. Structures belonging to cluster #3 are spread around this basin. Its representative structure is found at coordinates (1.3,10.7) and can be described as partially folded with a few turns. The next minimum appears at approximate coordinates (5.0, 7.0) and is 1.5 kcal/mol above the lowestenergy minimum. Structures within this low-energy basin belong to cluster #4. The representative structure is found at

Figure 6. Time evolution of the cumulative number of new patterns sampled during the diverse samplings: sampling S1 (1 × 8 μs) in blue; sampling S2 (2 × 4 μs) in red; sampling S3 (4 × 2 μs) in brown; sampling S4 (8 × 1 μs) in green; sampling S5 (16 × 0.5 μs) in purple; sampling S6 (80 × 0.1 μs) in orange.

S1 in the Supporting Information). These results suggest that the cumulative sampling obtained by merging several shorter MD trajectories produces a higher number of patterns than a unique MD trajectory, with the exception of sampling S3, which shows anomalous behavior, as will be discussed later. In view of the results described above, one could argue about the convenience of carrying out this type of analysis because of the apparent lack of convergence of the numbers of new patterns sampled. However, since peptides do not behave as random coils,2 the conformational space, and consequently the number of patterns, must necessarily be limited. In order to learn more about the evolution of new patterns sampled using many short trajectories, we analyzed more deeply the convergence behavior observed when using many 0.1 μs MD trajectories. For this purpose we considered the cumulative number of new patterns obtained in sampling S6 plus the 30 independent 0.1 μs trajectories that correspond to the first 100 ns of each trajectory in samplings S2, S3, S4, and S5. As shown in Figure S5, the evolution of the cumulative number of new patterns exhibits an asymptotic behavior at around 148 trajectories. In order to gain insight, in qualitative terms, into the fraction of the conformational space sampled, we can use a geometrical analysis. If the accessible phase space of the peptide is pictured as a square, then an MD trajectory could be represented by a circle whose radius is related to the sampling time. Figure 7a shows diagrammatically two trajectories of different lengths using the same starting coordinates (x⃗) but different starting momenta (p⃗). In this representation, all trajectories that have the same starting coordinates but use different starting momenta are located on a straight line (shown in red in Figure 7a) that has been drawn at the midpoint of the ordinate axis for convenience. Accordingly, if we compute diverse short trajectories using the same starting coordinates but with F

DOI: 10.1021/acs.jcim.6b00347 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Information and Modeling Figure 7. continued

can be compared to the phase space sampled using a long MD simulation, shown in green. (c) The threshold length MD trajectories are required to have in order to sample the accessible phase space of the peptide using multiple trajectories is shown in green; the phase space covered by an MD trajectory longer than the threshold time is shown in red.

coordinates (4.5, 6.2) and can be described as a distorted helix−turn−helix. Finally, there is a shallow minimum 1.6 kcal/ mol above the lowest-energy minimum at approximate coordinates (−7.0, 4.0). Structures within this low-energy well belong to cluster #8. Its representative structure is found at coordinates (−6.4, 4.0), and can be described as a helical structure close to the starting one. Figure 8b displays the FES obtained after combining the two 4 μs trajectories (sampling S2). As can be seen, the most populated basin is the one located at coordinates (10.0, 0.0) that corresponds to the lowest-energy minimum found in sampling S1. Snapshots of this basin can roughly be grouped in cluster #1, whose representative structure a helix−turn−helix, located at (10.1, 0.8). The map also shows a well 0.5 kcal/mol above the lowest-energy minimum and located at (12.0, −6.0) that does not correspond to any of the minima found in the reference trajectory. Structures around this basin are grouped in cluster #2, whose representative member is located at (12.5, −5.2) and can be described as a folded structure with a few turns. Other low-energy wells also included in the projection of structures corresponding to cluster #1 can be identified. These can be described as distorted helix−turn−helix structures. One is located at (8.0, −4.0) and is 0.5 kcal/mol above the lowestenergy minimum, and another is located at (7.0, 1.0) and is 0.7 kcal/mol above the global minimum. At (−3.0, 7.0) there is a shallow minimum that corresponds to helical structures and can be grouped under cluster #3, whose representative member is located at (−11.5, 8.4). Comparison of the two individual 4 μs trajectories projected onto (PC1, PC2) with the cumulative sampling FES (see Figure S2) reveals that both individual trajectories exhibit a minimum at (10.0, 0.0) with different depths. Similarly, the minimum located at (8.0, −4.0) is shown in both trajectories. In contrast, the minima located at (12.0, −6.0), (7.0, 1.0), and (−3.0. 7.0) are only visible in one of the MD trajectories. Figure 8c displays the cumulative sampling FES obtained after combining the four 2 μs trajectories (sampling S3). As can be seen, the most populated basin is located at coordinates (12.0, −6.0) and does not match any of the minima described in sampling S1, although it was also identified in sampling S2. Snapshots of this basin are classified under cluster #1, whose representative structure is located at coordinates (12.6, −4.8) and (as said before) can be described as a folded structure with a few turns. The map also reveals a minimum located at (12.0, 6.0) that is 0.8 kcal/mol above the lowest-energy minimum, corresponding to the lowest-energy minimum of the reference FES. Analysis of the plot also reveals a minimum at (8.0, −4.0) that is covered by the structures of cluster #6, whose representative structure is at (8.5, −5.7) and can be described as a helical at the N-terminus. In addition, a shallow minimum at (−6.0, 5.0) can also be identified that corresponds to helical structures, classified under cluster #2, whose representative structure is located at (−9.1, 6.7). Comparison of the four individual 2 μs trajectories projected onto (PC1, PC2) with the

Figure 7. Schematic representation of the phase space sampled by diverse molecular dynamics trajectories. In each panel, the rectangle represents the accessible phase space of a peptide, whereas circles of different radius represent the maximum phase space that can be sampled by MD trajectories of diverse lengths. A curly line within a circle represents a specific trajectory of a specific length, and each “x” represents a snapshot taken during the trajectory. The red line represents all of the possible starting points where the peptide has same initial coordinates (x⃗) but different initial momenta (p⃗). (a) The phase space that can be covered using a short MD trajectory is shown in blue, and it can be compared to the phase space sampled using a long MD simulation, shown in green. (b) The phase space that can be covered using multiple short MD trajectories is shown in blue, and it G

DOI: 10.1021/acs.jcim.6b00347 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Information and Modeling

Figure 8. Free energy surfaces of the Bak16BH3 peptide obtained from the diverse samplings studied in the present work. (a) sampling S1 (1 × 8 μs); (b) sampling S2 (2 × 4 μs); (c) sampling S3 (4 × 2 μs); (d) sampling S4 (8 × 1 μs); (e) sampling S5 (16 × 0.5 μs); (f) sampling S6 (80 × 0.1 μs).

minima that are also included in cluster #1. Specifically, 0.4 kcal/mol above the lowest-energy minimum is a minimum at (8.0, −4.0), and there is another at 0.6 kcal/mol at (5.0, −2.0), both of which can be described as helix−turn−helix structures. There is a minimum located at (−5.0, 5.0), close to the starting point that corresponds to a large number of snapshots of structures that exhibit helical structures and are included in cluster #2. Finally, there is another minimum located at coordinates (−7.0, −6.0). These structures are included in cluster #9 and exhibit a helical segment at the N-terminus. Figure 8f displays the cumulative FES obtained after combining the 80 individual 0.1 μs trajectories. As can be seen, the 80 × 0.1 μs FES exhibits a completely different distribution of low-energy wells in comparison with the 8 μs trajectory (Figure 8a). Although the lowest-energy minimum is sampled, the most important sampling occurs around the starting structure. Specifically, the most populated basin encloses helical structures similar to the starting one. These structures are grouped together in cluster #1. There are two clear minima with similar energies: one located at coordinates (−13.0, 2.0) and the other at coordinates (−4.0, 6.0). Other minima include one located at (6.0, −6.0) with structures grouped in cluster #2 that corresponds to a distorted helix− turn−helix, another at coordinates (7.0, 3.0) with structures grouped in cluster #8 that corresponds to folded structures with several turns, and another at coordinates (9.0, 0.0) that is poorly populated and corresponds to the global minimum of sampling S1. Comparison of the FESs produced in the diverse samplings shows interesting differences. Samplings S1 and S2 (Figure 8a,b, respectively) exhibit the global minimum at (10.0, 0.0) as in the reference MD trajectory (Figure 3a). However, the latter exhibits a minimum at (12.0, −6.0) that is not found in the

cumulative sampling FES (see Figure S3) reveals that all four trajectories clearly exhibit the minimum at (12.0, −8.0), whereas the rest of the minima appear with different relative importances in each of the trajectories. Figure 8d displays the cumulative FES obtained after combining the eight individual 1 μs trajectories (sampling S4). The map shows a broad basin in the area around (10.0, 0.0). Specifically, the most frequently sampled basin is located at coordinates (9.0, −4.0) and can be associated with a helix− turn−helix similar to the lowest-energy minimum of sampling S1. Structures in this basin are included in cluster #1, whose representative member is located at coordinates (6.5, −2.2). There is also another basin around (10.0, 3.0) that is 0.1 kcal/ mol above the lowest-energy minimum. Structures in this basin are included in cluster #2, whose representative member is located at (9.9, 4.7) and can be described as a distorted helix− turn−helix. Moreover, a shallow basin can also be identified at (−14.0, −1.0) at 1.0 kcal/mol that is included in cluster #4, whose representative member is located at (−14.2, −4.0) and can be described as a helical structure. Comparison of the eight individual 1 μs trajectories projected onto (PC1, PC2) with the cumulative sampling FES (see Figure S4) shows diverse sampling profiles. Thus, the minimum at (9.0, −4.0) appears in seven of the eight maps, the minimum at (10.0, 3.0) in five of the eight maps, and the minimum at (6.0, −2.0) in four of the eight maps. Figure 8e displays the cumulative FES obtained after combining the 16 individual 0.5 μs trajectories. The most populated basin is located at coordinates (7.0, 1.0) and corresponds to helix−turn−helix structures similar to the lowest-energy minimum of the 8 μs trajectory. Snapshots of this basin are included in cluster #1, whose representative structure is located at coordinates (7.2, 1.0). In this basin there are other H

DOI: 10.1021/acs.jcim.6b00347 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Information and Modeling

Figure 9. Time evolution of the FES obtained in the 8 μs MD trajectory. Shown are the FES at (a) 0.1 μs, (b) 0.5 μs, (c) 1 μs, (d) 2 μs, (e) 3 μs, (f) 4 μs, (g) 5 μs, (h) 6 μs, and (i) 7 μs.

reference map and can be considered a kinetic trap. Interestingly, this minimum appears as the lowest-energy minimum in the map produced in sampling S3 (Figure 8c), suggesting that the sampling may be biased because of a kinetic trap. Furthermore, the FES produced in sampling S4 (Figure 8d) shows a broad basin in the area of the helix−turn−helix. The FES produced in sampling S5 (Figure 8e) shows a broad basin in the area of the helix−turn−helix as does the previous one, and it also shows low-energy wells in the area of the helical structures. Finally, the FES produced in sampling S6 (Figure 8f) shows a broad basin in the area of the helical structures. It is also interesting to analyze the evolution of the FES along the sampling process of the 8 μs MD trajectory. Figure 9 shows the corresponding maps generated at different times of the sampling process. At 100 ns (Figure 9a), sampling is performed around the starting structure. At 500 ns (Figure 9b), the peptide makes excursions to other regions of the map, including the helix−turn−helix structures basin, although most of the time the peptide remains exploring the region around the starting structure. After 1 μs (Figure 9c), the system has explored the helix−turn−helix basin more deeply, although the

global minimum has not been explored yet. The same trend continues during the following 2 μs. Maps at 2 and 3 μs (Figure 9d,e) suggest that the peptide samples diverse kinetic traps in the neighborhood of the global minimum, since although they appear as minima after this time, they will disappear after a longer sampling. In contrast, a qualitative change is observed after 4 μs (Figure 9f). Interestingly, this period corresponds roughly to the first plateau observed in the evolution of conformational patterns, suggesting that the peptide gets trapped in this basin in such a way that the relevance of previous minima fades away. After 5 μs (Figure 9g), the system explores the second low-energy basin around (3.0, −1.0), and the FES already reveals all of the features observed at 8 μs. In the forthcoming period between 6 and 7 μs (Figure 9h,i) the FES remains qualitatively the same; however, in the period between 7 and 8 μs, sampling is enhanced again around the helix−turn−helix structure, changing the relative importance of diverse minima in the final picture of the FES. This period coincides with the second plateau observed in the evolution of new unique patterns shown in Figure 2. As stated previously, this 8 μs MD trajectory was extended up to 16 μs in order to I

DOI: 10.1021/acs.jcim.6b00347 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Information and Modeling

Figure 10. Histograms representing the cumulative secondary structures per residue obtained during the diverse samplings carried out in this work. Individual bars show the percentages of diverse secondary motifs attained by a specific residue during the simulation. Green represents a bend, yellow a turn, dark blue a π-helix, light blue an α-helix, red a 3-10 helix, purple an antiparallel β-sheet, and orange a parallel β-sheet. (a) sampling S1 (1 × 8 μs); (b) sampling S2 (2 × 4 μs); (c) sampling S3 (4 × 2 μs); (d) sampling S4 (8 × 1 μs); (e) sampling S5 (16 × 0.5 μs); (f) sampling S6 (80 × 0.1 μs).

differences per residue as shown in Figure 11. As can be seen, none of the samplings provide the same profile as the reference MD trajectory. In the C-terminus, sampling S1 yields a slight lower helicity profile, whereas the rest of the samplings yield higher helicity profiled. In the intermediate segment, sampling S1 shows a poorer helical profile, whereas sampling S2 yields the correct profile and the rest yield higher helical profiles. Finally, in regard to the N-terminus, samplings S1, S2, and S3 yield the correct helical profile, whereas those for the rest are higher. Another interesting result is that samplings S4, S5, and

have a reference to compare with all of the different samplings performed. The evolution of the FES from 8 to 16 μs is shown in Figure S6. It can be observed that the FES remains converged during the last microseconds. However, the possibility of having more enhanced sampling in a more extended MD trajectory cannot be excluded. Peptide Helicity. The helical profile is another structural property studied. The diverse panels of Figure 10 display the helicity profiles obtained in the diverse samplings. Comparison with the reference sampling was done by computing the J

DOI: 10.1021/acs.jcim.6b00347 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Information and Modeling

conformational space of a 16-residue peptide using a long MD trajectory compared with a set of short uncoupled MD simulations. The study involves the analysis of an 8 μs MD trajectory to be compared to the results from two 4 μs trajectories, four 2 μs trajectories, eight 1 μs trajectories, 16 0.5 μs trajectories, and 80 0.1 μs trajectories. All of the trajectories started from the same structure with different distributions of the initial velocities. Different features were studied, including a cluster analysis of the snapshots, the number of unique patterns sampled, PCA and the corresponding FES, and the helicity profile. The average helicity profile of the reference calculation indicates that both the N-terminal and C-terminal segments exhibit a high tendency to adopt a helical structure, whereas residues in the middle show a reduced tendency. This suggests that the most populated structure of the peptide in solution is a helix−turn−helix.39 This finding is supported by the low helix content of the peptide found in the CD spectrum in 30% trifluoromethanol.31 Moreover, this could also be relevant for its inhibitory capacity, as discussed elsewhere.29 Cluster analysis shows that in addition, the peptide attains folded structures with a few turns, helical structures at the N-terminus, and helical structures. These types of structures are found in the diverse samplings performed. However, in quantitative terms the present results suggest that samplings S1 and S2 provide similar descriptions as the reference 16 μs MD trajectory. Samplings S4 and S5 are not as good as the reference sampling: although they show many of its features, they fail to provide good helicity profiles. In contrast, samplings S3 and S6 clearly exhibit poorer performance. The abnormal result of sampling S3 appears to be due to fact that 2 μs MD trajectories seem to be too short for an adequate sampling in this system. The failing of sampling S6 could be due to the short duration of the MD trajectories, which forces the system to sample in the neighborhood of the starting point. In regard to the performance of one trajectory versus a few shorter ones, the present results show that many short MD trajectories yield more conformational patterns than a unique MD trajectory. However, this does not necessarily mean better sampling. The graphical representation of the sampled phase space carried out with the different sampling methodologies described above (Figures 7) displays the maximum phase space sampled with a set of trajectories with a specific length. Moreover, the diagram indicates that a minimum length that depends on the number of degrees of freedom of the system is required to sample the accessible space using a set of MD trajectories. Comparing the sampling performance of one trajectory versus a set of shorter trajectories, we should distinguish two cases. On the one hand is the case of a set of MD trajectories with length shorter than the threshold value (Figure 7b). In this case, the two samplings will differ unless the long MD trajectory follows a path parallel to the red line of Figure 7a. On the other hand, we can consider the case of a set of MD trajectories with length equal to the threshold value or longer (Figure 7c). In this case, the multiple trajectory procedure appears to be more effective, since it is expected that MD trajectories bounce at the boundaries of the accessible phase space, and this will be larger for longer MD trajectories, producing a less efficient sampling. In regard to the rest of properties studied, samplings S1 and S2 provide similar information in regard to the clusters and FES, but if we look at the helicity profile, S2 provides better information than S1.

Figure 11. Differences in the percentages of helical content for each of the residues obtained in the diverse samplings relative to the reference sampling of 16 μs. Orange corresponds to sampling S1 (1 × 8 μs), dark blue to sampling S2 (2 × 4 μs), red to sampling S3 (4 × 2 μs), green to sampling S4 (8 × 1 μs), purple to sampling S5 (16 × 0.5 μs), and light blue to sampling S6 (80 × 0.1 μs).

S6 exaggerate the helical profile of the peptide in all three segments. Sampling S3 shows correct sampling only at the Nterminus, whereas samplings S1 and S2 provide reasonable profiles in comparison to the reference MD trajectory. Analysis of the results suggests that although all of the samplings are able to identify the types of structures the peptide attains, some of them fail in quantitative terms. When the relative populations of the clusters obtained from the snapshots, the FES, and the helicity profiles are taken into account, the present results suggest that samplings S1 and S2 provide similar descriptions as the reference 16 μs MD trajectory. Samplings S4 and S5 are not as good as the reference sampling, although they show many of its features. They basically fail to provide a good helicity profile. In contrast, samplings S3 and S6 clearly show a poorer performance. A comparative analysis suggests that in sampling S3 the peptide gets trapped in a minimum that does not appear in the reference sampling. We should bear in mind that the reference sampling shows a plateau after 3 μs and that sampling S3 is the result of the cumulative sampling from the merging of four 2 μs MD trajectories. We should consider that MD trajectories shorter than 2 μs do not provide good sampling in this case. However, this could be overcome computing a large number of trajectories as deduced from the behavior observed in samplings S4 and S5. On the other hand, sampling S6 is poorer because of the short duration of the MD trajectories. Clearly, in 100 ns the peptide samples basically the region close to the starting structure. In regard to the performance of one trajectory versus a few shorter ones, the present results show that many short MD trajectories systematically sample more unique patterns than only one trajectory (Figure 6). Even in the case that the sampler gets trapped (sampling S3), the sampling is a good as one MD trajectory. However, obtaining a larger number of patterns does not necessarily indicate better sampling, since the number of new patterns sampled follows a linear behavior. Comparison of samplings S1 and S2 shows that they provide similar information in regard to the clusters and FES, but if we look at the helicity profile, sampling S2 provides better information than sampling S1 (Figure 11).



CONCLUSIONS The present work describes the results of a study designed to understand the differential performance in sampling of the K

DOI: 10.1021/acs.jcim.6b00347 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Information and Modeling

(9) Leach, A. R. A Survey of Methods for Searching the Conformational Space of Small and Medium-Sized Molecules. Rev. Comput. Chem. 1991, 2, 1−55. (10) Hansmann, U. H. E.; Okamoto, Y. New Monte Carlo Algorithms for Protein Folding. Curr. Opin. Struct. Biol. 1999, 9, 177−183. (11) Chou, K. C.; Carlacci, L. Simulated Annealing Approach to the Study of Protein Structures. Protein Eng., Des. Sel. 1991, 4, 661−667. (12) Borrelli, K. W.; Vitalis, A.; Alcantara, R.; Guallar, V. PELE: Protein Energy Landscape Exploration. A Novel Monte Carlo Based Technique. J. Chem. Theory Comput. 2005, 1, 1304−1311. (13) Daura, X. Molecular Dynamics Simulations of Peptide Folding. Theor. Chem. Acc. 2006, 116, 297−306. (14) Schlick, T. Time-Trimming Tricks for Dynamic Simulations: Splitting Force Updates to Reduce Computational Work. Structure 2001, 9, R45−R53. (15) Chen, J.; Im, W.; Brooks, C. L., III. Application of Torsion Angle Molecular Dynamics for Efficient Sampling of Protein Conformations. J. Comput. Chem. 2005, 26, 1565−1578. (16) Beutler, T. C.; van Gunsteren, W. F. The Computation of a Potential of Mean Force: Choice of the Biasing Potential in the Umbrella Sampling Technique. J. Chem. Phys. 1994, 100, 1492−1497. (17) Hamelberg, D.; Mongan, J.; McCammon, J. A. Accelerated Molecular Dynamics: A Promising and Efficient Simulation Method for Biomolecules. J. Chem. Phys. 2004, 120, 11919−11929. (18) Bonomi, M.; Parrinello, M. Enhanced Sampling in the WellTempered Ensemble. Phys. Rev. Lett. 2010, 104, 190601. (19) Caves, L. S. D.; Evanseck, J. D.; Karplus, M. Locally Accessible Conformations of Proteins: Multiple Molecular Dynamics Simulations of Crambin. Protein Sci. 1998, 7, 649−666. (20) Shirts, M. R.; Pande, V. S. Mathematical Analysis of Coupled Parallel Simulations. Phys. Rev. Lett. 2001, 86, 4983−4987. (21) Mitsutake, A.; Sugita, Y.; Okamoto, Y. Generalized-Ensemble Algorithms for Molecular Simulations of Biopolymers. Biopolymers 2001, 60, 96−123. (22) Roy, A.; Hua, D. P.; Ward, J. M.; Post, C. B. Relative Binding Enthalpies from Molecular Dynamics Simulations Using a Direct Method. J. Chem. Theory Comput. 2014, 10, 2759−2768. (23) Shahlaei, M.; Mousavi, A. A Conformational Analysis Study on the Melanocortin 4 Receptor Using Multiple Molecular Dynamics Simulations. Chem. Biol. Drug Des. 2015, 86, 309−321. (24) Wijma, H. J.; Marrink, S. J.; Janssen, D. B. Computationally Efficient and Accurate Enantioselectivity Modeling by Clusters of Molecular Dynamics Simulations. J. Chem. Inf. Model. 2014, 54, 2079− 2092. (25) Venken, T.; Voet, A.; De Maeyer, M.; De Fabritiis, G.; Sadiq, S. K. Rapid Conformational Fluctuations of Disordered HIV-1 Fusion Peptide in Solution. J. Chem. Theory Comput. 2013, 9, 2870−2874. (26) Genheden, S.; Ryde, U. How to Obtain Statistically Converged MM/GBSA Results. J. Comput. Chem. 2010, 31, 837−846. (27) Delgado-Soler, L.; Pinto, M.; Tanaka-Gil, K.; Rubio- Martinez, J. Molecular Determinants of Bim(BH3) Peptide Binding to ProSurvival Proteins. J. Chem. Inf. Model. 2012, 52, 2107−2118. (28) Holinger, E. P.; Chittenden, T.; Lutz, R. J. Bak BH3 Peptides Antagonize Bcl-xL Function and Induce Apoptosis through Cytochrome C-Independent Activation of Caspases. J. Biol. Chem. 1999, 274, 13298−13304. (29) Delgado-Soler, L.; del Mar Orzaez, M.; Rubio-Martinez, J. Structure-Based Approach to the Design of BakBH3Mimetic Peptides with Increased Helical Propensity. J. Mol. Model. 2013, 19, 4305− 4318. (30) Sattler, M.; Liang, H.; Nettesheim, D.; Meadows, R. P.; Harlan, J. E.; Eberstadt, M.; Yoon, H. S.; Shuker, S. B.; Chang, B. S.; Minn, A. J.; Thompson, C. B.; Fesik, S. W. Structure of Bcl-Xl-Bak Peptide Complex: Recognition Between Regulators of Apoptosis. Science 1997, 275, 983−986. (31) Petros, A. M.; Nettesheim, D. G.; Wang, Y.; Olejniczak, E. T.; Meadows, R. P.; Mack, J.; Swift, K.; Matayoshi, E. D.; Zhang, H.; Thompson, C. B.; Fesik, S. W. Rationale for Bcl-xL/Bad Peptide

In conclusion, present results suggest that a few short MD trajectories might provide better sampling than one unique trajectory. Moreover, this could also be convenient because by computing diverse trajectories of limited length the system has more chances to avoid getting trapped in a minimum. However, caution should be exercised since short trajectories need to be long enough to overcome local barriers surrounding the starting point and the required sampling time depends on the number of degrees of freedom of the system under study. An effective way to gain insight into the minimum MD trajectory length involves monitoring the convergence of a structural feature, as described in this work. Finally, it is important to stress that this work describes the results for a particular system and that many different systems must be studied before it will be possible to find a general answer to the problem addressed.



ASSOCIATED CONTENT

S Supporting Information *

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.jcim.6b00347. Histogram representing the evolution of the cumulative secondary structure per residue at specific times during the 16 μs MD trajectory (Figure S1); free energy surfaces of the Bak16BH3 peptide projected on the two first principal components obtained from samplings S2, S3, and S4 (Figures S2−S4); evolution of the number of cumulative new patterns as a function of the number of 0.1 μs MD trajectories (Figure S5); convergence of the free energy surface of the Bak16BH3 peptide obtained for the reference 16 μs MD trajectory projected on the two first principal components (Figure S6); and percentages of common unique patterns found between the diverse samplings (Table S1) (PDF)



AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected]. Fax: 34-93-4021231. Notes

The authors declare no competing financial interest.



REFERENCES

(1) Flory, P. J. Statistical Mechanics of Chain Molecules; Wiley: New York, 1969; pp 30−31. (2) Schweitzer-Stenner, R. Conformational Propensities and Residual Structures in Unfolded Peptides and Proteins. Mol. BioSyst. 2012, 8, 122−133. (3) Dyson, H. J.; Wright, P. E. Insights into The Structure and Dynamics of Unfolded Proteins from Nuclear Magnetic Resonance. Adv. Protein Chem. 2002, 62, 311−340. (4) Daura, X.; Glattli, A.; Gee, P.; Peter, C.; Van Gunsteren, W. F. Unfolded State of Peptides. Adv. Protein Chem. 2002, 62, 341−360. (5) Lyman, E.; Zuckerman, D. M. On the Structural Convergence of Biomolecular Simulations by Determination of the Effective Sample Size. J. Phys. Chem. B 2007, 111, 12876−12882. (6) Perez, J. J.; Villar, H.; Arteca, G. Distribution of Conformational Energy Minima in Molecules with Multiple Torsional Degrees of Freedom. J. Phys. Chem. 1994, 98, 2318−2324. (7) Frauenfelder, H.; Leeson, D. T. The Energy Landscape in NonBiological and Biological Molecules. Nat. Struct. Biol. 1998, 5, 757− 759. (8) Wales, D. J. The Energy Landscape as a Unifying Theme in Molecular Science. Philos. Trans. R. Soc., A 2005, 363, 357−377. L

DOI: 10.1021/acs.jcim.6b00347 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Information and Modeling Complex Formation from Structure, Mutagenesis, and Biophysical Studies. Protein Sci. 2000, 9, 2528−2534. (32) Case, D.; Babin, V.; Berryman, J.; Betz, R.; Cai, Q.; Cerutti, D.; Cheatham, T. E., III; Darden, T.; Duke, R.; Gohlke, H.; Goetz, A.; Gusarov, S.; Homeyer, N.; Janowski, P.; Kaus, J.; Kolossvary, I.; Kovalenko, A.; Lee, T.; LeGrand, S.; Luchko, T.; Luo, R.; Madej, B.; Merz, K.; Paesani, F.; Roe, D.; Roitberg, A.; Sagui, C.; Salomon-Ferrer, R.; Seabra, G.; Simmerling, C.; Smith, W.; Swails, J.; Walker, R.; Wang, J.; Wolf, R.; Wu, X.; Kollman, P. AMBER 14; University of California: San Francisco, 2014. (33) Maier, J. A.; Martinez, C.; Kasavajhala, K.; Wickstrom, L.; Hauser, J. E.; Simmerling, C. f14SB: Improving the Accuracy of Protein Side Chain and Backbone Parameters from ff99SB. J. Chem. Theory Comput. 2015, 11 (8), 3696−3713. (34) Jorgensen, W. L.; Chandrasekhar, J.; Madura, J. D.; Impey, R. W.; Klein. Comparison of Simple Potential Functions for Simulating Liquid Water. J. Chem. Phys. 1983, 79, 926−935. (35) Darden, T.; York, D.; Pedersen, L. Particle Mesh Ewald: An N.Log(N) Method for Ewald Sums in Large Systems. J. Chem. Phys. 1993, 98, 10089−10092. (36) Ryckaert, J.-P.; Ciccotti, G.; Berendsen, H. J. C. Numerical Integration of the Cartesian Equations of Motion of a System with Constraints: Molecular Dynamics of n-alkanes. J. Comput. Phys. 1977, 23, 327−341. (37) Berendsen, H. J. C.; Postma, J. P. M.; van Gunsteren, W. F.; DiNola, A.; Haak, J. R. Molecular-Dynamics with Coupling to an External Bath. J. Chem. Phys. 1984, 81, 3684−3690. (38) Kabsch, W.; Sander, C. Dictionary of Protein Secondary Structure: Pattern Recognition of Hydrogen-Bonded and Geometrical Features. Biopolymers 1983, 22, 2577−2637. (39) Owen, M. C.; Strodel, B.; Csizmadia, I. G.; Viskolcz, B. Radical Formation Initiates Solvent-Dependent Unfolding and β-sheet Formation in a Model Helical Peptide. J. Phys. Chem. B 2016, 120, 4878−4889.

M

DOI: 10.1021/acs.jcim.6b00347 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX