Principal Component Analysis of Lipid Molecule Conformational

Jan 14, 2016 - Molecular dynamics simulations of lipid bilayers are ubiquitous nowadays. Usually, either global properties of the bilayer or some part...
1 downloads 3 Views 3MB Size
Article pubs.acs.org/JCTC

Principal Component Analysis of Lipid Molecule Conformational Changes in Molecular Dynamics Simulations Pavel Buslaev,† Valentin Gordeliy,†,‡,⊥ Sergei Grudinin,*,†,#,∇,○ and Ivan Gushchin*,†,⊥ †

Moscow Institute of Physics and Technology, 141700 Dolgoprudniy, Russia Institut de Biologie Structurale J.-P. Ebel, Université Grenoble Alpes-CEA-CNRS, F-38000 Grenoble, France ⊥ Institute of Complex Systems (ICS), ICS-6: Structural Biochemistry, Research Centre Jülich, 52425 Jülich, Germany # Université Grenoble Alpes, LJK, F-38000 Grenoble, France ∇ CNRS, LJK, F-38000 Grenoble, France ○ Inria, F-38000 Grenoble, France ‡

S Supporting Information *

ABSTRACT: Molecular dynamics simulations of lipid bilayers are ubiquitous nowadays. Usually, either global properties of the bilayer or some particular characteristics of each lipid molecule are evaluated in such simulations, but the structural properties of the molecules as a whole are rarely studied. Here, we show how a comprehensive quantitative description of conformational space and dynamics of a single lipid molecule can be achieved via the principal component analysis (PCA). We illustrate the approach by analyzing and comparing simulations of DOPC bilayers obtained using eight different force fields: all-atom generalized AMBER, CHARMM27, CHARMM36, Lipid14, and Slipids and united-atom Berger, GROMOS43A1-S3, and GROMOS54A7. Similarly to proteins, most of the structural variance of a lipid molecule can be described by only a few principal components. These major components are similar in different simulations, although there are notable distinctions between the older and newer force fields and between the all-atom and united-atom force fields. The DOPC molecules in the simulations generally equilibrate on the time scales of tens to hundreds of nanoseconds. The equilibration is the slowest in the GAFF simulation and the fastest in the Slipids simulation. Somewhat unexpectedly, the equilibration in the unitedatom force fields is generally slower than in the all-atom force fields. Overall, there is a clear separation between the more variable previous generation force fields and significantly more similar new generation force fields (CHARMM36, Lipid14, Slipids). We expect that the presented approaches will be useful for quantitative analysis of conformations and dynamics of individual lipid molecules in other simulations of lipid bilayers.

1. INTRODUCTION Molecular dynamics simulations of lipids are ubiquitous nowadays as they provide the information that is complementary to many experimental approaches.1−3 Simulations are used to model a variety of systems, such as micelles, bilayers, vesicles, cubic and other lipidic phases; to model physiological processes, such as membrane pore formation, membrane fusion and fission, and others; and to model the effects of additives, ranging from small molecules to other lipids and membrane proteins. One of the main factors of a successful simulation is employment of a correct set of simulation parameters, including those describing the interactions between the atoms of the simulated system (force field, FF). The lipid FFs are tested against a number of values that can be measured experimentally: densities, surface tensions, elastic moduli, dipole potentials, diffusion data, X-ray diffraction data, NMR deuterium order parameters, and spin relaxation times.4 Most of these properties are macroscopic and provide little © 2016 American Chemical Society

information on the conformations of individual lipid molecules. Perhaps because of this, the parameters that are usually evaluated in molecular dynamics simulations of lipid molecules are also macroscopic: area per lipid, electron density distribution of various chemical groups, and diffusion of a lipid molecule as a whole. The other evaluated properties, such as order parameters, provide the information that is much more local and describes particular moieties of the molecule rather than the whole lipid. In this work, we demonstrate how the conformations of a lipid molecule can be studied with the principal component analysis5 (PCA). We illustrate the approach by analyzing and comparing eight several hundred nanoseconds trajectories of 1,2-dioleoyl-sn-glycero-3-phosphocholine (DOPC) bilayer simulated using eight different FFs. The approach results in comprehensive quantitative characterization of the lipid Received: November 19, 2015 Published: January 14, 2016 1019

DOI: 10.1021/acs.jctc.5b01106 J. Chem. Theory Comput. 2016, 12, 1019−1028

Article

Journal of Chemical Theory and Computation Table 1. Overview of the Simulations Presented in This Study force field

system

no. of lipids

no. of waters

hydration ratio

A B C27.1 C27.2 C36

72 128 72 288 128

2727 4789 2727 10908 5120

37.9 37.4 37.9 37.9 40

GAFF Berger CHARMM27 CHARMM27 CHARMM36

G43 G54 L S

128 128 128 128

4825 6070 4198 5120

37.7 47.4 32.8 40

GROMOS43A1-S3 GROMOS54A7 Lipid14 Slipids

water model

ensemble

SPC/E SPC TIP3P TIP3P CHARMM TIP3P SPC/E SPC/E TIP3P TIP3P

NPγT NPT NPγT NPγT NPT

duration, ns 500 300 500 125 300

NPT NPT NPT NPT

300 300 2 × 125 300

C36, G43, G54, and S and using the SHAKE algorithm30 in the simulation L. All of the simulations were conducted with the reference temperature of 310 K. Lipid and water molecules were coupled to the temperature bath separately. The simulations A, B, C27.1, and C27.2 were performed using the velocity rescale thermostat31 with the coupling time constant of 0.1 ps−1 and Berendsen barostat32 with the reference pressure of 1 bar and relaxation time of 1 ps. In order to ensure the areas per lipid in agreement with experiment, the systems A, C27.1, and C27.2 were additionally subjected to a surface tension of 22 dyn/cm per interface. The system B was simulated with a semi-isotropic pressure coupling. The compressibility was set to 4.5 × 10−5 bar−1. The simulations C36, G43, G54, and S were performed using the Nosé−Hoover thermostat33,34 with the coupling time constants of 2, 2, 1, and 2 ps−1 and the Parrinello−Rahman barostat35 with the reference pressure of 1 bar and the relaxation times of 1, 1, 5, and 10 ps correspondingly. The simulation L was performed using the Langevin thermostat36 with collision frequency of 1 ps−1 and the Berendsen barostat32 with the reference pressure of 1 bar and relaxation time of 1 ps. In all of the GROMACS simulations, the nonbonded pair list was updated every 10 steps with the cutoff of 1.0 nm. The electrostatic interactions were calculated using the particlemesh Ewald method with a Fourier grid spacing of 0.12 nm in the simulations A, B, C27.1, C27.2, C36, and S, 0.15 nm in the simulations G43 and G54, and 0.1 nm in the simulation L. For the electrostatic and short-range van der Waals interactions, a cutoff distance of 1.0 nm was used. In the simulation G43 the twin-range cutoff at 1.0 and 1.6 nm was used for the van der Waals interactions, and in the simulation G54 the twin-range cutoff at 0.8 and 1.4 nm was used for both the electrostatic and the van der Waals interactions. Analyses. All of the analyses of the DOPC molecule conformations that are presented in this work, unless specified otherwise, are preceded by structure alignment (see Figure 1b for the conformations from the simulation C27.1). To minimize the bias resulting from the choice of a starting structure, the alignment was done in two steps: in the first step the structures were aligned to an arbitrary one, and in the second step the structures were realigned to the average structure obtained in the first step. Dependence of the root-mean-square deviation (RMSD) on time, presented in Figure 2, was calculated as follows

molecule conformations and highlights the similarities and differences in the structures adopted in different simulations.

2. COMPUTATIONAL DETAILS Performed Simulations. Details of the performed simulations are summarized in Table 1. Simulations A, B, and C27.1 were performed using the general AMBER force field,6 a combination of the Berger and GROMOS87 force fields,7,8 and the CHARMM27 force field,9 and the SPC/E,10 SPC,11 and TIP3P12 water models, correspondingly, as described in Siu et al.13 C27.2 was simulated identically to C27.1. Simulations C36, G43, G54, L, and S were performed using the CHARMM36,14 GROMOS 43A1-S3,15 GROMOS 54A7,16 Lipid14,17 and Stockholm lipids18 (Slipids) force fields and the CHARMM TIP3P,19,20 SPC/E,10 SPC/E,10 and TIP3P12 water models, correspondingly. Initial structures and topologies for the simulations A, B, and C27.1 were taken from Lipidbook.21 The structures result from 100 ns simulations by Siu et al.13 of the fully atomistic model prepared by Feller et al.22 and united-atoms model prepared by Böckmann et al.23 Initial structure for the simulation C27.2 was obtained by combining four periodic images of the system C27.1 generated by translating the system in the directions of the axes X and Y parallel to the membrane plane. Initial structures and topologies for the simulations C36, L, and S were prepared using the CHARMM-GUI Membrane Builder.24,25 Initial structures and topologies for the simulations G43 and G54 were obtained from the GROMACS (http://www. gromacs.org/Downloads/User_contributions/Molecule_ topologies) and David Poger (http://compbio.biosci.uq.edu. au/~david/research/topologies/) Web sites correspondingly. These structures result from 60 and 300 ns simulations by Chiu et al.15 and Poger et al.16 Simulation Conditions. All of the simulations except for L were performed using the GROMACS software26 versions 4.5 and 4.6, and the simulation L was performed using sander of the AmberTools14 software suite.27 Parameters in all of the simulations were chosen so as to match those employed in the original reports: by Siu et al.13 for the simulations A, B, C27.1, and C27.2, by Klauda et al.14 for C36, by Chiu et al.15 for G43, by Poger et al.16 for G54, by Dickson et al.17 for L, and by Jämbeck and Lyubartsev18 for S. In all of the simulations the integration was performed using the leapfrog integrator with the step size of 2 fs. The system snapshots were collected every 10 ps. Periodic boundary conditions were applied in all three directions. The centers of mass of the monolayers and water were fixed. Bonds to hydrogen atoms were constrained using the LINCS28 and SETTLE29 algorithms in the simulations A, B, C27.1, C27.2,

1 RMSD(τ ) = NL 1020

NL

∑ ⟨RMSDi(t , t + τ)⟩t i=1 DOI: 10.1021/acs.jctc.5b01106 J. Chem. Theory Comput. 2016, 12, 1019−1028

Article

Journal of Chemical Theory and Computation

was performed for each simulation separately (results presented in Figures 3−6 and S2−S14). Second, PCA was performed on

Figure 3. Principal component analysis of the DOPC molecule conformations. (top) Eigenvalues of the covariance matrix. (bottom) Cumulative sum of the eigenvalues, normalized to 1. Four main components account for ∼50% of the structural variation, and the first 14 components account for more than 90%.

Figure 1. Structure of the DOPC molecule. (a) Chemical structure of DOPC (1,2-dioleoyl-sn-glycero-3-phosphocholine). (b) Sample of DOPC conformations observed in the molecular dynamics simulation. 200 structure snapshots from the simulation C27.1 are shown. The structures are aligned to the averaged structure. (c) Average positions of DOPC heavy atoms in each simulation relative to those in other simulations.

the sample containing the structures observed in all of the simulations A, B, C27.1, C36, G43, G54, L, and S (results presented in Figures 7−9 and S15−S16). Collectivity κ of the principal components37,38 was calculated as follows N

κ=

F 1 exp(−∑ αri2 log(αri2)) NF i=1

where NF is the number of the DOPC heavy atom degrees of freedom (3 degrees per each of the 54 heavy atoms), α is the normalization factor chosen so that ΣNi=1F αr2i = 1, and ri is the i-th component of the principal component displacement vector r = {r1,...,ri,...,rNF}. κ gives the effective number of nonzero displacements and is confined to the interval between 1/NF and 1. When the principal component is maximally collective and all degrees of freedom are equally involved in the conformational change, κ = 1. Conversely, when only one degree of freedom is involved, κ = 1/NF. Other types of motions result in intermediate values of κ. Root mean square inner product39 (RMSIP) was calculated as follows

Figure 2. Average root-mean-square deviations of DOPC heavy atom positions as a function of time lag. The structures were aligned prior to RMSD calculation. The conformational changes follow a power law, with RMSD ∼ time0.2, and then plateau at the RMSD value of ∼4.5 Å. The scales on both axes are logarithmic.

where NL is the number of the lipid molecules present in the system, and ⟨RMSDi(t,t + τ)⟩t is the root-mean-square deviation of the i-th molecule heavy atoms positions at the time moments t and t + τ, averaged over the time t. PCA5 was performed on DOPC heavy atoms positions using the g_covar and g_anaeig programs from the GROMACS software package.26 g_covar is used to calculate the atomic displacement covariance matrices and their eigenvalues and eigenvectors, and g_anaeig is used to project the molecular dynamics trajectories onto the obtained eigenvector bases. Overall, two types of analyses have been performed. First, PCA

RMSIP(n , i , j) =

1 n

n

n

∑ ∑ (r ikr lj)2 k=1 l=1

where i and j are the compared simulations, and rik and rjl are kth and l-th principal components of the simulations i and j correspondingly. Pearson correlation coefficients of covariance matrices Mikl and Mjkl of simulations i and j, correspondingly, were calculated as follows 1021

DOI: 10.1021/acs.jctc.5b01106 J. Chem. Theory Comput. 2016, 12, 1019−1028

Article

Journal of Chemical Theory and Computation rij =

cov(Mkli , Mklj)

The eigenvalue-weighted geometric means of the characteristic time scale ratios in the simulations i and j, measured in common PC basis, were calculated as follows

σMkli ·σMklj

where cov(Mikl, Mjkl) is the covariance of the elements of Mikl and Mjkl, and σMikl and σMjkl are the standard deviations of the elements of Mikl and Mjkl. The angles φ and ψ that characterize the lipid molecule orientation relative to the membrane normal were calculated as follows. First, for each molecule at each time step, the average structure for this simulation was aligned to the molecule. Second, the principal axes of the average structure n1, n2, and n3 at this position were calculated so that n1 pointed roughly in the direction from the hydrophobic chain ends to the phosphatidylcholine moiety, and n2 pointed from the sn-1 chain to the sn-2 chain. Then, the angles φ and ψ were calculated as follows:

⎛ NF τi ⎞ R(i , j) = exp⎜⎜∑ βEk log kj ⎟⎟ τk ⎠ ⎝k=1

where NF is the number of the DOPC heavy atom degrees of freedom (3 degrees per each of the 54 heavy atoms), Ek is the k-th eigenvalue of the covariance matrix, β is the normalization factor chosen so that ΣNk=1F βEk = 1, and τik and τjk are the characteristic time scales of the distribution convergence or autocorrelation decay for the k-th PC in the simulations with the force fields i and j. Defined this way, the ratios possess the desirable properties:

φ = arccos(n1·z) sin(ψ ) = n 2 ·

R ( i , j) =

z × n1 z ·n1

3. RESULTS Simulations of DOPC Bilayer. We began the study with performing several simulations of DOPC bilayers utilizing eight different force fields (all-atom generalized AMBER (GAFF),6 united-atom Berger, 7 , 8 all-atom CHARMM27 9 and CHARMM36, 14 united-atom GROMOS 43A1-S315 and GROMOS 54A7,16 and all-atom Lipid1417 and Slipids18), which are consequently analyzed and compared. Parameters of the simulations were chosen to match the original ones as much as possible. The simulation times were chosen so as to obtain roughly the same amount of statistics: 500 ns for the systems A and C27.1 (72 lipid molecules in the system), 300 ns for the systems B, C36, G43, G54, and S (128 lipid molecules), 2 × 125 ns for the system L (128 lipid molecules), and 125 ns for the system C27.2 (288 lipid molecules). Consequently, the total lengths of the accumulated trajectories range from 32 to 38.4 μs for each of the simulations. Details of the simulations are presented in Table 1. Chemical structure, observed conformations, and averaged atomic coordinates of DOPC in the performed simulations are presented in Figure 1. Observed areas per lipid are presented in Figure S1. Behavior of DOPC in such simulations has already been described using traditional approaches, and the electron density profiles, deuterium order parameters, diffusion coefficients, and electrostatic potentials in the systems have been calculated.4,13,15−18 Consequently, we can focus on analyzing these simulations using the principal component analysis (PCA). While notable differences in the characteristic time scales of the conformational changes, occurring in these simulations, could already be inferred from analyzing the divergence of the structures as a function of time (Figure 2), the analyses presented below produce a more detailed picture. PCA of DOPC Conformational Changes. The principal component analysis is performed on the aligned structures (such as in Figure 1b). Similarly to the case of macromolecules, the analysis reveals that the covariance matrix is dominated by a few large-amplitude motions, with the first 12 to 14 PCs accounting for more than 90% of the structure variation (Figure 3). While the differences in the eigenvalue distributions may seem small, they are not negligible when compared with the differences between the parts of the same trajectory or between a smaller and a bigger system simulated with the same force

⎞ ⎛ z × n1 cos(ψ ) = n 2 ·⎜ × n1⎟ ⎠ ⎝ z ·n1 ⎧ arccos(cos(ψ )), sin(ψ ) ≥ 0 ψ=⎨ ⎩−arccos(cos(ψ )), sin(ψ ) < 0 ⎪ ⎪

As a measure of difference between two distributions Pi(x) and Pj(x) that have corresponding cumulative distribution functions Fi(x) and Fj(x), Kolmogorov−Smirnov statistics40 was used that is defined as follows: KSS(Fi , Fj) = sup|Fi(x) − Fj(x)| x

To test the convergence of the observed distributions, the distributions Pi,t,t+τ(x), calculated from the conformations assumed by lipid i between the time moments t and t + τ, were compared to the distribution obtained from the whole trajectory and all lipids, P0(x). Values KSS(τ), presented in Figures S7 and S12, were calculated as follows KSS(τ ) =

1 NL

NL

∑ ⟨KSS(Pi ,t ,t + τ , P0)⟩t i=1

when P0(x) is Gaussian, and τ → 0, KSS(τ) → 0.75 (see the Supporting Information). Autocorrelation R of the projection on PC is defined as follows R (τ ) =

R (i , m ) 1 and R(i , j) = R (j , m ) R (j , i )

⟨(p(t ) − ⟨p(t )⟩t )(p(t + τ ) − ⟨p(t + τ )⟩t )⟩t ⟨(p(t ) − ⟨p(t )⟩t )2 ⟩t

where p(t) is the projection value at the time moment t. As both KSS and autocorrelation decay could not be adequately fitted with power or exponential function, we calculated characteristic times τ1 and τ2 needed for KSS and autocorrelation to decrease in e and e2 times from the starting value, where e is the natural logarithm base. For simplicity, the starting values were assumed to be 0.75 and 1.0 for KSS and autocorrelation, correspondingly. The time dependence values were linearly extrapolated where needed. 1022

DOI: 10.1021/acs.jctc.5b01106 J. Chem. Theory Comput. 2016, 12, 1019−1028

Article

Journal of Chemical Theory and Computation

Figure 4. Conformational changes, associated with the top 10 principal components in each simulation (side and top views). Structures are colored from red to blue according to the PC projection value.

field (Figure S2). In the latter cases, the eigenvalues are virtually indistinguishable (Figure S2). In all of the simulations the first, largest-amplitude principal component (PC1) corresponds to the scissoring motion of the hydrophobic tails of the lipid molecule (Figure 4). The others, lower-amplitude components, are less collective (Figure S3) and account for various bending and twisting motions of lipid molecules (Figure 4). Consequently, they are affected stronger by choice of force field. Comparison of the principal component vectors reveals less similarity between the analyzed systems (Figures 5 and S4). While the first principal components of the simulations are almost identical, the others are less so. Correspondingly, for most of the pairwise comparisons the root-mean-square inner product (RMSIP), a measure of the conformational subspace overlap, shows a dip reaching 0.9−0.7 for the subspaces defined by the first two to four PCs (Figures S5). And once again, these differences are much more pronounced than the differences between the two halves of the simulation C27.1 or between the simulations C27.1 and C27.2 (Figures S6).

Figure 5. Example of the comparison of PCA eigenvectors obtained from different simulations (Berger and CHARMM36 in this case). Dot product matrices of the corresponding eigenvector sets are shown. The matrices have roughly diagonal structures, highlighting the conservation of the conformational space features.

After the principal components have been determined, it is informative to analyze the distributions of the projections of the simulated trajectories on the PCs. It appears that for all the PCs 1023

DOI: 10.1021/acs.jctc.5b01106 J. Chem. Theory Comput. 2016, 12, 1019−1028

Article

Journal of Chemical Theory and Computation

those presented in Figure 1c) that was aligned to the particular one. φ is defined as the angle between the molecule’s largest principal axis n1 and the normal to the bilayer z. ψ is defined as the angle between the molecule’s second largest principal axis n2 and the plane of n1 and z (see Methods for exact definitions). Solid angle-normalized distribution of φ has a maximum at 0°, very close to normal, and has a half-width of ∼22° (Figure S8a). ψ is distributed unevenly, with the probability density at ψ from −60° to 0° being almost twice the value at ψ above 120° or below −120° (Figure S8b). The average structure of the DOPC molecule is somewhat dependent on the orientation relative to the membrane normal (Figure S9), but this dependency is not very prominent in the distributions of the PC projections (Figure S10). Characteristic Time Scales. While the statistics obtained from 32 to 38.4 μs simulations of DOPC, which consists of 54 heavy atoms, is excessive, the analysis shows that equilibration of each individual molecule is relatively slow compared to what could be expected. We used two approaches to examine the equilibration: first, we characterized the convergence of PC projection distributions, and then we analyzed the characteristic autocorrelation decay times. To quantify the convergence of the PC projection distributions, we compared the distributions, obtained from the conformations of one lipid molecule over a specific time period, to reference distributions, obtained from the whole 32 to 38.4 μs sets of data. The distributions were compared using the Kolmogorov−Smirnov test40 that corresponds to the largest difference between the corresponding cumulative distribution functions (see definition and examples in Methods and Figure S11). Time dependence of the K−S statistics could not be adequately fitted neither with power nor with single exponent functions. Therefore, we calculated characteristic times that it takes for the statistics to decay in e and e2 times (with e being the base of natural logarithms). These characteristic times τ1 and τ2 for all the PCs, spanning several orders of magnitude, are presented in Figure S12. Similarly to the case of convergence of PC projection distributions, the time dependencies of autocorrelation values could not be fitted neither with power nor with exponential functions. The characteristic e and e2 autocorrelation decay times follow the same patterns as the distribution convergence times (Figure S13) and are clearly related to them (Figure S14). Overall, the obtained results are in line with the structure

besides PC1 (and PC4 in the simulation G54) the distributions are close to normal, with the PC1 distributions being constrained on one side (see Figure S7 for the first 5 PCs of each simulation). This anomaly in the distributions of the PC1 projections might be a consequence of the fact that PC1 is directly related to the distance between the two lipid chains, which cannot be smaller than the respective van der Waals radii. This will be explored in more detail below. Finally, the conformational ensembles obtained in different simulations can be compared directly by calculating the Pearson correlations of the covariance matrix elements. Pairwise correlations are presented in Figure 6.

Figure 6. Pearson correlation coefficients between the covariance matrices of the respective simulations. The force fields are grouped by similarity. The correlation coefficients are color-coded to highlight the similarity and dissimilarity between the force fields.

Dependence on the Lipid Molecule Orientation. In the analyses presented throughout this work the conformations of lipid molecules are studied irrespective of the orientations relative to the membrane plane. Therefore, it would be interesting to determine the degree of the dependence of the conformational ensembles on the spatial position of the molecules. We measured the orientation of each molecule at each time step by calculating the angles φ and ψ. For simplicity, both angles were determined for an averaged structure (such as

Figure 7. Comparison of the PC projection distributions in the same PCA eigenvector basis. Distributions for the PCs 1−5 are shown. Kolmogorov−Smirnov statistics for pairwise comparisons of all PCs are presented in Figure S11. 1024

DOI: 10.1021/acs.jctc.5b01106 J. Chem. Theory Comput. 2016, 12, 1019−1028

Article

Journal of Chemical Theory and Computation

Figure 8. Characteristic distribution convergence times (left) and autocorrelation decay times (right) for all the simulations in the joint PC basis.

divergence analysis (Figure 2) and show that the convergence in the system A is much slower than in other systems. Interestingly, analysis of these characteristic time scales revealed anomalous behavior of the projections on the principal component 17 of the simulation C27.1. Initially, the autocorrelation for this PC decays as expected, i.e. faster than the autocorrelations of the larger-amplitude PCs. However, for the larger times of several hundred nanoseconds, the autocorrelation value of PC17 becomes the highest of all (Figure S13). Visual inspection of the corresponding conformational changes revealed that this PC corresponds to slow isomerization of the bonds in the glycerol moiety, which has already been noted for the CHARMM force field. 41 Comparison of the distributions of the dihedral atoms and isomerization frequencies in the glycerol backbone revealed notable differences between the simulations; in the simulation C27.1 the isomerization rates in the glycerol moiety are extremely slow. PCA in a Common PC Basis. While the quantities obtained in the previous analyses are instructive, they cannot be compared directly to each other, as the principal components in each simulation were determined independently. PCA of the conformations in the concatenated trajectory produces PCs that describe the conformations in all of the simulations at once. As in the individual PC analyses, the PC1 distribution is the only one that is clearly non-Gaussian. There are three distinct peaks, whose amplitudes are not conserved in the eight simulations (Figure 7). Comparison of the PC1 projection value with respective distances between the carbons in the middle and at the ends of the DOPC oleoyl chains sheds light on this phenomenon (Figure S15). It appears that, together with PC1 projections, carbon−carbon distances also center around three values: ∼5, ∼10, and ∼15 Å, that are close to multiples of 4.6 Å, the characteristic interchain distance in the fluidic lipid phase.42,43 Thus, the three PC1 projection peaks correspond to the following conformations: with the hydrophobic chains in contact; separated by one other chain; separated by two other chains. The latter peak is more diffuse, as the chain positions deviate more and more from ideal packing. The time scales of the conformational changes from different simulations can also be determined (Figure S16) and are similar to those obtained in individual analyses (Figures S12 and S13). Since now the PCs are common for all of the simulations, the time scales can be compared directly (Figure 8). To determine the ratios of the observed time scales of conformational changes as a single number for each pair of the simulations, we

calculated the weighted geometric means of corresponding time scales (Figure 9, Methods).

4. DISCUSSION In the previous section, application of PCA to study conformations of lipid molecules has been described, and the results of the analysis and comparison of simulations of DOPC with different FFs have been presented. The analysis began with alignment and averaging of the structures. Interestingly, while both hydrophobic chains of DOPC have double bonds (Figure 1a) that have to induce kinks, no such perturbations are visible in the averaged structures (Figure 1c). However, if only the molecules with a specific orientation relative to the bilayer normal are considered, the kinks can be discerned (Figure S9). The averaged structure from the simulation B is the widest of all, with those from G54 and G43 being second and third widest. Interestingly, the observed areas per lipid in these simulations are smaller than in others (Figure S1). PCA of DOPC Conformations. The results of the PCA show that for all of the FFs, 12 to 14 PCs are enough to describe 90% of structural variation of DOPC, despite it having 156 internal heavy-atom degrees of freedom. The principal components are similar between the force fields, although sometimes either the ordering is changed (for example, PC5 of the simulation B is more similar to PC4 of the simulation C36 than to PC5, Figure 5) or the components are mixed (PCs 8 and 9 of the simulation B are similar to PCs 8 and 9 of the simulation C36, Figure 5). The overlap of the conformational subspaces described by the first 20 PCs is more than 95% in all of the pairwise comparisons (Figure S5). Analysis of the distributions of the structure projections on different PCs reveals that there are no clear energetic barrierseparated substates: the distributions are almost-Gaussian for all the PCs except PC1. For PC1, the distributions reveal three diffuse peaks, corresponding to three possibilities of the relative position of the DOPC molecule hydrophobic chains: in contact; with one other chain in between; and with two other chains in between (Figures 7 and S15). Similarly to the case of average structures, distributions of the PC1 projections from the simulations B and G54 clearly stand out (Figure 7), and the conformations where the chains are in direct contact are almost absent there (Figure S15). Perhaps, this is a consequence of higher probability to observe trans orientation of the sn-1 and sn-2 oxygen atoms of the glycerol moiety in these simulations (data not shown). Correlations of the Atomic Displacement Covariance Matrices. The pairwise comparisons of atomic displacement covariance matrix eigenvalues (Figure 3) and eigenvectors 1025

DOI: 10.1021/acs.jctc.5b01106 J. Chem. Theory Comput. 2016, 12, 1019−1028

Article

Journal of Chemical Theory and Computation

Figure 9. Comparison of the characteristic distribution convergence (left) and autocorrelation decay (right) time scales in the simulations with different force fields. Depicted is the geometric mean ratio of the time scales in the simulation with force field 1 to the time scales in the simulation with force field 2. The ratios are color-coded to highlight the similarity and dissimilarity between the force fields.

(Figures S4 and S5) provide a lot of information about the simulations being compared. However, they are difficult to summarize, especially in the present case of eight simulations, since there are 28 different pairs to be analyzed. Moreover, comparison of the eigenvectors has significant limitations: 1) in the case of similar eigenvalues, the ordering of the eigenvectors is not reliable; 2) in the case of nonlinear motions, the choice of eigenvectors is sometimes ambiguous; 3) it is not clear how to compare the distributions of the structure projections on different PC bases (Figure S7); 4) it is not clear what is the significance of similarity or dissimilarity of two eigenvectors from two simulations or similarity or dissimilarity of the projection distributions on these two vectors. All of these limitations can be overcome if one compares not the covariance matrix eigenvectors and projection distributions but directly the covariance matrices, by calculating the Pearson productmoment correlation coefficient (PCC, also known as Pearson’s r). While the PCCs of the atomic displacement covariance matrices are more than 0.9 for each pair of the simulations (the conformations produced by each force field are reasonable and similar), the force fields can still be grouped into clusters (Figure 6). From such analysis, it can be seen that there is a notable convergence in the atomic displacement distributions produced by the newer force fields (CHARMM36, Lipid14, Slipids). Somewhat expectedly, there is a stronger correlation between the related force fields. For example, GAFF is similar to Lipid14 but less similar to CHARMM36, Slipids, or other force fields. CHARMM27 is similar to CHARMM36 but much less similar to Lipid14 and Slipids. Finally, the united-atom force fields Berger, GROMOS 43A1-S3, and GROMOS 54A7 are in a cluster of their own (Figure 6). Dependence on the Lipid Molecule Orientation. Most of the analyses presented in this work were performed irrespective of the DOPC molecules’ orientations. To check the dependence of the DOPC molecule conformation on its orientation in the membrane, we calculated the distributions of various orientations in the simulation C36. The distributions in other recent force fields (Lipid14 and Slipids) are expected to be very similar, since the force fields themselves are similar to CHARMM36 (Figures 6 and 8). Interestingly, the analysis shows that the average structures (Figure S9) and PC

projection distributions (Figure S10) change slightly in the ensembles of molecules with very different orientations. However, it should be noted that the PC projection distributions obtained for this 38.4 μs sample are still somewhat noisy (Figure S10), and consequently an additional simulation and analysis would be required to obtain definitive results. Equilibration Time Scales of Lipid Molecules. Numerical description of DOPC conformations allows direct measurement of the equilibration of various collective motions. Two approaches to this can be envisioned: convergence of the distributions of projections on PCs can be calculated, and the autocorrelation decay rates can be measured. While the characteristic autocorrelation decay times are simpler to calculate, we believe that the characteristic distribution convergence times provide a better guidance for choosing the simulation length, as correct determination of thermodynamic values depends on thorough sampling of conformational space. Also, our results show that while the time scales determined through these two approaches are clearly related (Figure S14), the autocorrelation decay times underestimate the distribution convergence times by a factor of ∼3.5. Using the weighted geometrics means, we calculated the ratios of the characteristic relaxation time scales for all of the analyzed simulations (Figures 8 and 9). As in the case of the structure divergence analysis (Figure 2), Slipids appears to be the fastest-equilibrating force field, while in GAFF the equilibration is an order of magnitude slower. The reasons for this are not entirely clear. The united-atom force fields Berger, GROMOS43A1-S3, and GROMOS54A7 are generally ∼1.5 to ∼4 times slower than the recent all-atom force fields. Overall, the relations between the relaxation time scales in different simulations (Figures 8 and 9) follow the relations between the conformational ensembles (Figure 2) and show that while there were significant differences between the older force fields, the newer ones (CHARMM36, Lipid14, Slipids) produce the results that are much more similar. Even in the simulation S, where relaxation is the fastest (Figures 8 and 9), the major PCs equilibrate on the time scale of hundreds of nanoseconds; even after 100 ns the distributions do not completely converge (Figure S12). This fact has an important consequence. If one studies interactions of a 1026

DOI: 10.1021/acs.jctc.5b01106 J. Chem. Theory Comput. 2016, 12, 1019−1028

Article

Journal of Chemical Theory and Computation

Foundation research project 14-14-00995 and the 5top100program of the Ministry for science and education of Russia. The work was also supported by FRISBI (ANR-10-INSB-0502) and GRAL (ANR-10-LABX-49-01) grants within the Grenoble Partnership for Structural Biology (PSB).

molecule of interest with a bilayer, then most probably the molecule interacts with specific lipids and not with the whole bilayer. Therefore, convergence of the simulations cannot be assumed before these specific lipids, the individual constituents of the system, have equilibrated, and as the lipid molecule in the pure bilayer takes more than 100 ns for relaxation, so can the lipid molecule in a complex system. This time scale is not unexpected, as already in 2006 Grossfield et al.44 noted that in 26 trajectories of 100 ns, the sampling for a membrane protein embedded in a lipid bilayer has not converged even for parts of the system. The later studies advise at least 1 μs-long simulations for sufficient sampling in modeling of membrane proteins.1



ABBREVIATIONS DOPC, 1,2-dioleoyl-sn-glycero-3-phosphocholine; FF, force field; GAFF, general AMBER force field; K−S, Kolmogorov− Smirnov; KSS, Kolmogorov−Smirnov statistics; PC, principal component; PCA, principal component analysis; RMSD, rootmean-square deviation; RMSIP, root-mean-square inner product



5. CONCLUSIONS In this work, we presented application of PCA to study the conformations of an individual lipid molecule. The approach results in comprehensive quantitative characterization of the molecule’s conformational space, elucidation of major degrees of freedom, and characteristic time scales of the motions in molecular dynamics simulations. Also, it provides a simple framework for comparison of different simulations of systems with lipids, which we illustrated on simulations with eight popular lipid force fields. The analysis revealed that the results obtained with newer force fields are similar. At the moment, it is not clear how the smaller differences that remain can be resolved with the help of experiment, but perhaps more extensive ab initio simulations could provide the consensus parameters. We expect that the approach will be useful for studying the effects of additives (ranging from small molecules to other lipids and membrane proteins) on the structure and dynamics of lipids and will help with further development of lipid FFs.



(1) Lindahl, E.; Sansom, M. S. P. Membrane Proteins: Molecular Dynamics Simulations. Curr. Opin. Struct. Biol. 2008, 18, 425−431. (2) Marrink, S. J.; de Vries, A. H.; Tieleman, D. P. Lipids on the Move: Simulations of Membrane Pores, Domains, Stalks and Curves. Biochim. Biophys. Acta, Biomembr. 2009, 1788, 149−168. (3) Bennett, W. F. D.; Tieleman, D. P. Computer Simulations of Lipid Membrane Domains. Biochim. Biophys. Acta, Biomembr. 2013, 1828, 1765−1776. (4) Klauda, J. B.; Venable, R. M.; MacKerell, A. D., Jr.; Pastor, R. W. Chapter 1 Considerations for Lipid Force Field Development. In Current Topics in Membranes; Feller, S. E., Ed.; Computational Modeling of Membrane Bilayers; Academic Press: 2008; Vol. 60, pp 1−48. (5) David, C. C.; Jacobs, D. J. Principal Component Analysis: A Method for Determining the Essential Dynamics of Proteins. Methods Mol. Biol. 2014, 1084, 193−226. (6) Wang, J.; Wolf, R. M.; Caldwell, J. W.; Kollman, P. A.; Case, D. A. Development and Testing of a General Amber Force Field. J. Comput. Chem. 2004, 25, 1157−1174. (7) Berger, O.; Edholm, O.; Jähnig, F. Molecular Dynamics Simulations of a Fluid Bilayer of Dipalmitoylphosphatidylcholine at Full Hydration, Constant Pressure, and Constant Temperature. Biophys. J. 1997, 72, 2002−2013. (8) van Gunsteren, W. F.; Berendsen, H. J. C. Groningen Molecular Simulation (GROMOS) Library Manual; University of Groningen: Groningen, 1987. (9) Feller, S. E.; MacKerell, A. D. An Improved Empirical Potential Energy Function for Molecular Simulations of Phospholipids. J. Phys. Chem. B 2000, 104, 7510−7515. (10) Berendsen, H. J. C.; Grigera, J. R.; Straatsma, T. P. The Missing Term in Effective Pair Potentials. J. Phys. Chem. 1987, 91, 6269−6271. (11) Berendsen, H. J. C.; Postma, J. P. M.; van Gunsteren, W. F.; Hermans, J. Interaction Models for Water in Relation to Protein Hydration. In Intermolecular Forces; Pullman, B., Ed.; The Jerusalem Symposia on Quantum Chemistry and Biochemistry; Springer: Netherlands, 1981; pp 331−342. (12) Jorgensen, W. L.; Chandrasekhar, J.; Madura, J. D.; Impey, R. W.; Klein, M. L. Comparison of Simple Potential Functions for Simulating Liquid Water. J. Chem. Phys. 1983, 79, 926−935. (13) Siu, S. W. I.; Vácha, R.; Jungwirth, P.; Böckmann, R. A. Biomolecular Simulations of Membranes: Physical Properties from Different Force Fields. J. Chem. Phys. 2008, 128, 125103. (14) Klauda, J. B.; Venable, R. M.; Freites, J. A.; O’Connor, J. W.; Tobias, D. J.; Mondragon-Ramirez, C.; Vorobyov, I.; MacKerell, A. D.; Pastor, R. W. Update of the CHARMM All-Atom Additive Force Field for Lipids: Validation on Six Lipid Types. J. Phys. Chem. B 2010, 114, 7830−7843. (15) Chiu, S.-W.; Pandit, S. A.; Scott, H. L.; Jakobsson, E. An Improved United Atom Force Field for Simulation of Mixed Lipid Bilayers. J. Phys. Chem. B 2009, 113, 2748−2763.

ASSOCIATED CONTENT

S Supporting Information *

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.jctc.5b01106. Derivation of KSS(τ) when P0(x) is Gaussian and τ approaches 0, table with the properties available for analysis and comparison when using the PCA approach, and Figures S1−S16 (PDF)



REFERENCES

AUTHOR INFORMATION

Corresponding Authors

*E-mail: [email protected]. *E-mail: [email protected]. Author Contributions

P.B. performed the research, analyzed the results, and helped with manuscript preparation, V.G. and S.G. helped with manuscript preparation, and I.G. designed and supervised the research, analyzed the results, and prepared the manuscript. Notes

The authors declare no competing financial interest.



ACKNOWLEDGMENTS The molecular dynamics simulations were performed using the JUROPA supercomputer at the Research Center Jülich. We are grateful to Liudmila Khakimova and Charles H. Robert for fruitful discussions. The work was supported by the CEA(IBS) − HGF(FZJ) STC 5.1 specific agreement, Russian Science 1027

DOI: 10.1021/acs.jctc.5b01106 J. Chem. Theory Comput. 2016, 12, 1019−1028

Article

Journal of Chemical Theory and Computation (16) Poger, D.; Van Gunsteren, W. F.; Mark, A. E. A New Force Field for Simulating Phosphatidylcholine Bilayers. J. Comput. Chem. 2010, 31, 1117−1125. (17) Dickson, C. J.; Madej, B. D.; Skjevik, Å. A.; Betz, R. M.; Teigen, K.; Gould, I. R.; Walker, R. C. Lipid14: The Amber Lipid Force Field. J. Chem. Theory Comput. 2014, 10, 865−879. (18) Jämbeck, J. P. M.; Lyubartsev, A. P. Derivation and Systematic Validation of a Refined All-Atom Force Field for Phosphatidylcholine Lipids. J. Phys. Chem. B 2012, 116, 3164−3179. (19) Durell, S. R.; Brooks, B. R.; Ben-Naim, A. Solvent-Induced Forces between Two Hydrophilic Groups. J. Phys. Chem. 1994, 98, 2198−2202. (20) Neria, E.; Fischer, S.; Karplus, M. Simulation of Activation Free Energies in Molecular Systems. J. Chem. Phys. 1996, 105, 1902−1921. (21) Domański, J.; Stansfeld, P. J.; Sansom, M. S. P.; Beckstein, O. Lipidbook: A Public Repository for Force-Field Parameters Used in Membrane Simulations. J. Membr. Biol. 2010, 236, 255−258. (22) Feller, S. E.; Yin, D.; Pastor, R. W.; MacKerell, A. D. Molecular Dynamics Simulation of Unsaturated Lipid Bilayers at Low Hydration: Parameterization and Comparison with Diffraction Studies. Biophys. J. 1997, 73, 2269−2279. (23) Böckmann, R. A.; Hac, A.; Heimburg, T.; Grubmüller, H. Effect of Sodium Chloride on a Lipid Bilayer. Biophys. J. 2003, 85, 1647− 1655. (24) Jo, S.; Kim, T.; Iyer, V. G.; Im, W. CHARMM-GUI: A WebBased Graphical User Interface for CHARMM. J. Comput. Chem. 2008, 29, 1859−1865. (25) Wu, E. L.; Cheng, X.; Jo, S.; Rui, H.; Song, K. C.; DávilaContreras, E. M.; Qi, Y.; Lee, J.; Monje-Galvan, V.; Venable, R. M.; et al. CHARMM-GUI Membrane Builder toward Realistic Biological Membrane Simulations. J. Comput. Chem. 2014, 35, 1997−2004. (26) Hess, B.; Kutzner, C.; van der Spoel, D.; Lindahl, E. GROMACS 4: Algorithms for Highly Efficient, Load-Balanced, and Scalable Molecular Simulation. J. Chem. Theory Comput. 2008, 4, 435−447. (27) Case, D. A.; Babin, V.; Berryman, J.; Betz, R. M.; Cai, Q.; Cerutti, D. S.; Cheatham, T. E., III; Darden, T. A.; Duke, R. E.; et al. Amber 14; University of California: San Francisco, 2014. (28) Hess, B.; Bekker, H.; Berendsen, H.; Fraaije, J. LINCS: A Linear Constraint Solver for Molecular Simulations. J. Comput. Chem. 1997, 18, 1463−1472. (29) Miyamoto, S.; Kollman, P. Settle: An Analytical Version of the SHAKE and RATTLE Algorithm for Rigid Water Models. J. Comput. Chem. 1992, 13, 952−962. (30) Ryckaert, J.-P.; Ciccotti, G.; Berendsen, H. J. C. Numerical Integration of the Cartesian Equations of Motion of a System with Constraints: Molecular Dynamics of N-Alkanes. J. Comput. Phys. 1977, 23, 327−341. (31) Bussi, G.; Donadio, D.; Parrinello, M. Canonical Sampling through Velocity Rescaling. J. Chem. Phys. 2007, 126, 014101. (32) Berendsen, H. J. C.; Postma, J. P. M.; van Gunsteren, W. F.; DiNola, A.; Haak, J. R. Molecular Dynamics with Coupling to an External Bath. J. Chem. Phys. 1984, 81, 3684. (33) Hoover, W. G. Canonical Dynamics: Equilibrium Phase-Space Distributions. Phys. Rev. A: At., Mol., Opt. Phys. 1985, 31, 1695−1697. (34) Nosé, S. A Unified Formulation of the Constant Temperature Molecular Dynamics Methods. J. Chem. Phys. 1984, 81, 511−519. (35) Parrinello, M.; Rahman, A. Polymorphic Transitions in Single Crystals: A New Molecular Dynamics Method. J. Appl. Phys. 1981, 52, 7182−7190. (36) Pastor, R. W.; Brooks, B. R.; Szabo, A. An Analysis of the Accuracy of Langevin and Molecular Dynamics Algorithms. Mol. Phys. 1988, 65, 1409−1419. (37) Brüschweiler, R. Collective Protein Dynamics and Nuclear Spin Relaxation. J. Chem. Phys. 1995, 102, 3396−3403. (38) Tama, F.; Sanejouand, Y.-H. Conformational Change of Proteins Arising from Normal Mode Calculations. Protein Eng., Des. Sel. 2001, 14, 1−6. (39) Amadei, A.; Ceruso, M. A.; Di Nola, A. On the Convergence of the Conformational Coordinates Basis Set Obtained by the Essential

Dynamics Analysis of Proteins’ Molecular Dynamics Simulations. Proteins: Struct., Funct., Genet. 1999, 36, 419−424. (40) Massey, F. J. The Kolmogorov-Smirnov Test for Goodness of Fit. J. Am. Stat. Assoc. 1951, 46, 68−78. (41) Vogel, A.; Feller, S. E. Headgroup Conformations of Phospholipids from Molecular Dynamics Simulation: Sampling Challenges and Comparison to Experiment. J. Membr. Biol. 2012, 245, 23−28. (42) Levine, Y. K.; Wilkins, M. H. F. Structure of Oriented Lipid Bilayers. Nature 1971, 230, 69−72. (43) Wilkins, M. H. F.; Blaurock, A. E.; Engelman, D. M. Bilayer Structure in Membranes. Nature 1971, 230, 72−76. (44) Grossfield, A.; Feller, S. E.; Pitman, M. C. Convergence of Molecular Dynamics Simulations of Membrane Proteins. Proteins: Struct., Funct., Genet. 2007, 67, 31−40.

1028

DOI: 10.1021/acs.jctc.5b01106 J. Chem. Theory Comput. 2016, 12, 1019−1028