Comparison of Implicit and Explicit Solvation Models for Iota


Article

A Comparison of Implicit and Explicit Solvation Models for Iota-Cyclodextrin Conformation Analysis from Replica Exchange Molecular Dynamics

Wasinee Khuntawee, Manaschai Kunaseth, Chompoonut Rungnim, Suradej Intagorn, Peter Wolschann, Nawee Kungwan, Thanyada Rungrotmongkol, and Supot Hannongbua

J. Chem. Inf. Model., Just Accepted Manuscript • DOI: 10.1021/acs.jcim.6b00595 • Publication Date (Web): 08 Mar 2017

Downloaded from http://pubs.acs.org on March 10, 2017


Journal of Chemical Information and Modeling is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Journal of Chemical Information and Modeling

A Comparison of Implicit and Explicit Solvation Models for Iota-Cyclodextrin Conformation Analysis from Replica Exchange Molecular Dynamics

Wasinee Khuntawee1†, Manaschai Kunaseth2*, Chompoonut Rungnim2, Suradej Intagorn3, Peter Wolschann4,5,6, Nawee Kungwan7, Thanyada Rungrotmongkol8,9,10*, Supot Hannongbua6

1 Nanoscience and Technology Program, Graduate School, Chulalongkorn University, 254 Phayathai Road, Bangkok 10330, Thailand

2 National Nanotechnology Center (NANOTEC), National Science and Technology Development Agency (NSTDA), Pathum Thani 12120, Thailand

3 Department of Mathematics, Statistics and Computer Science, Kasetsart University, Kamphaeng Saen Campus, Nakhon Pathom 73140, Thailand

4 Department of Pharmaceutical Technology and Biopharmaceutics, University of Vienna, Vienna 1090, Austria

5 Institute of Theoretical Chemistry, University of Vienna, Vienna 1090, Austria

ACS Paragon Plus Environment


6 Computational Chemistry Unit Cell, Department of Chemistry, Faculty of Science, Chulalongkorn University, 254 Phayathai Road, Bangkok 10330, Thailand

7 Department of Chemistry, Faculty of Science, Chiang Mai University, 239 Huay Kaew Road, Muang District, Chiang Mai 50200, Thailand

8 Structural and Computational Biology Unit, Department of Biochemistry, Faculty of Science, Chulalongkorn University, 254 Phayathai Road, Bangkok 10330, Thailand

9 Ph.D. Program in Bioinformatics and Computational Biology, Faculty of Science, Chulalongkorn University, 254 Phayathai Road, Patumwan, Bangkok 10330, Thailand

10 Molecular Sensory Science Center, Faculty of Science, Chulalongkorn University, 254 Phayathai Road, Patumwan, Bangkok 10330, Thailand

ABSTRACT. Large ring cyclodextrins have become increasingly important for drug delivery applications. In this work, we performed replica-exchange molecular dynamics simulations using both implicit and explicit water solvation models to study the conformational diversity of iota-cyclodextrin, which contains 14 α-1,4 glycosidic linked D-glucopyranose units (CD14). New quantitative methods are proposed to analyze the openness, bending, and twisting of CD14 in terms of circularity, biplanar angle, and one-directional conformation (ODC). CD14 in the GB implicit water model (Igb5) was found mostly in an open conformation, with an average circularity of 0.39 ± 0.16, and slightly bent, with an average biplanar angle of 145.5 ± 16.0°. In contrast, CD14 in TIP3P explicit water is significantly twisted, with an average circularity of 0.16 ± 0.10, and 29.1% of the structures are ODCs. In addition, classification of CD14 conformations using a Gaussian mixture model (GMM) shows that 85.0% of all CD14 structures in implicit water at 300 K correspond to the elliptical conformation, in contrast to the 82.3% in twisted form in explicit water. GMM clustering also reveals minority conformations of CD14 such as the 8-shape, boat-form, and twisted conformations. This work provides fundamental insights into CD14 conformation and the influence of solvation models, and proposes new quantitative analysis techniques for future molecular conformation studies.

1. Introduction

Large ring cyclodextrins (LR-CDs) are cyclic oligosaccharides of 1-4 linked α-D-glucopyranoses consisting of more than eight glucose subunits. LR-CDs of several sizes (up to hundreds of subunits) have been synthesized enzymatically using amylomaltase, a 4-α-glucanotransferase. The unique loop structure of amylomaltase does not exist in small ring cyclodextrins (SR-CDs) produced by cyclodextrin glycosyltransferase (CGTase). LR-CDs are expected to be suitable host molecules for inclusion complexation with guest ligands, owing to the high water solubility of LR-CDs1 and their multiple cavities for ligand inclusion.2 However, only four crystal structures, those of δ-CD (CD9),3 ε-CD (CD10),4-6 ι-CD (CD14),5-7 and ϕ-CD (CD26),8-9 have been reported. Based on crystallographic studies, these LR-CDs can form various conformers, such as the bent form of CD10 and CD14, while the twisted form has been observed in CD26. In contrast, SR-CDs such as CD6, CD7 and CD8 are found only in the cyclic shape.10-12 Mixed sizes of CDs can be synthesized using polymerization by CGTase. Varying the synthesis conditions, including temperature and the concentration of ethanol in aqueous solution, can increase the chain length of LR-CDs.13-14 To investigate the structural properties of an individual LR-CD, single-size LR-CDs must be purified and their crystal structures determined, which is a costly process requiring sophisticated laboratory analysis.

Over the last decade, the classical molecular dynamics (MD) method has been applied to study LR-CD conformations in both gas and solution phases.2, 15 In those studies, the structural stability of LR-CDs was discussed in terms of their shape, hydrogen bond formation, and energy. Larger LR-CD rings have a more flexible structure, in which multiple cavities could frequently be observed. Classical MD simulations of CD14 in solution at room temperature revealed conformations deformed from the crystal structure, such as the open form, a twisted shape containing two loops, and the dumbbell shape.16-17 Moreover, the geometry of CD26 was studied by DFT in the gas phase and was found to be in agreement with its crystal structure, which is solvated in water.18 Although classical MD is a useful method for studying the conformation of LR-CDs, it requires long-time dynamics to overcome the energy barriers between local minima.

Replica exchange molecular dynamics (REMD) simulation is a potentially powerful technique for conformational exploration.19-22 REMD methods can efficiently address the multiple-minima problem using independent MD simulations (i.e., replicas). In temperature REMD, each replica runs at a different temperature, and temperatures can be exchanged between replicas. In this work, REMD always refers to temperature REMD. Nevertheless, a significant challenge for conformational searching with REMD is the vast amount of conformation data generated from the multiple REMD trajectories. One way to deal with this issue is to visualize each MD trajectory and manually choose "important" snapshots based on visual data and the researcher's experience. This method may be feasible and effective for small sets of trajectories; however, capturing all of the key snapshots from large sets of trajectories is highly challenging and may introduce sampling bias.
In addition, trajectories generated from REMD are subject to spontaneous and erratic motion due to the temperature/trajectory exchanges between different replicas, which might obscure visual information about the conformations. Therefore, designing systematic routines to quantify structural properties is crucial for analyzing the large amount of data generated from REMD simulations.

Recently, machine learning and data analytics techniques have become immensely useful for computational studies.23-25 Unsupervised learning techniques such as clustering can be applied to large amounts of data to discover hidden classes.26-27 Clustering based on the principal component analysis (PCA) technique is commonly used to reduce high-dimensional data to lower-dimensional but highly significant data, while methods such as k-means clustering, DBSCAN, and Gaussian mixture models (GMMs) are more suitable for low-dimensional data. GMM clustering is a method of interest because of its fast computation, its suitability for density estimation, and its lack of bias toward zero means and cluster sizes.

In this paper, REMD simulations have been performed to explore the conformational space of the iota-cyclodextrin structure containing 14 α-1,4 glycosidic linked D-glucopyranose units (CD14). To elucidate the solvation effects on LR-CD conformation, we systematically compare and analyze CD14 trajectories obtained with implicit and explicit water solvation systems. To efficiently analyze CD14 conformations from the REMD results, we have combined two approaches: (1) development of new LR-CD structural characterizations; and (2) application of a machine-learning-based clustering algorithm for classification of CD14 structures. In the first approach, we developed three new LR-CD characterizations, namely circularity, biplanar angle, and one-dimensional conformation, to quantitatively analyze CD14 conformations from REMD simulations in both implicit and explicit solvation systems. The computational steps, detailed analysis, and performance evaluations of the new techniques are discussed in depth.
In the second approach, the GMM clustering algorithm is applied to systematically identify groups of CD14 structures based on their conformations. Finally, the new characterization techniques are applied to the GMM clustering results, revealing the underlying groups in CD14 conformations. Our new bifurcated approach provides a systematic way to analyze complex conformations of LR-CDs, which is beneficial for a broad range of research areas. In addition, the direct comparison of CD14 conformations from REMD using implicit and explicit water solvation models should provide insights into the significance of the solvation effect for LR-CD conformation analysis.

2. Methodology

2.1 Replica exchange molecular dynamics (REMD)

In the present work, we performed REMD simulations to study the conformation of CD14 using implicit and explicit water solvation models. All system preparations and REMD simulations were performed using Amber10.28 The starting structure of CD14 was obtained from the Cambridge Crystallographic Data Centre entry code CCDC124917,5 which is a macroring consisting of 14 α-1,4 glycosidic linked D-glucopyranose units in an elliptical and bent shape. To remove unfavorable steric interactions, the initial structure of CD14 was minimized by steepest descent for 2000 steps, followed by 1000 steps of the conjugate gradient method. The glycam06j force field was used for the CD14 molecule, and the REMD simulations were performed using a time step of 1 fs.

For REMD simulations of CD14 with implicit water solvation, the GB implicit model Igb5 was used (extended force field tests and the solvation model effect have been thoroughly discussed in our previous study).29 REMD simulations were performed using 16 replicas with temperature exchange within the range 300-600 K at a 20 K interval (see Table S1), yielding an acceptance ratio of temperature exchanges of more than 40%. Prior to the REMD runs, a short MD simulation of 5 ns was performed to equilibrate the systems at the assigned temperatures. Afterwards, the REMD simulation was conducted for 100 ns. CD14 snapshots were taken every 1 ps from the REMD trajectory at 300 K for conformation analysis, resulting in 100,000 snapshots of CD14 in total. Note that these REMD simulation conditions have been validated in our previous study.29

In the case of REMD simulations using explicit water solvation, the CD14 molecule was solvated by 2,617 TIP3P water molecules (a total of 8,145 atoms in the system) in a 12.0 Å octahedral simulation box using periodic boundary conditions. Since the explicit water system contains many more atoms than its implicit counterpart, many more replicas are required to cover a temperature range similar to that used in the implicit water REMD. To optimize the number of REMD replicas, a varying REMD temperature interval scheme was applied,30 resulting in 64 REMD replicas covering the temperature range of 300-604 K (see the exact replica temperatures for the explicit water REMD simulation in Table S2). With these conditions, a temperature exchange acceptance ratio of 35-50% was achieved. As in the implicit solvation simulation, each replica was first equilibrated by an MD simulation of 1 ns at its assigned temperature prior to the REMD simulation. Then, the REMD simulation was performed for 20 ns, and the CD14 structure was saved every 1 ps (for a total of 20,000 snapshots) from the 300 K trajectory for conformation analysis. Note that all explicit TIP3P water molecules were excluded from the conformation analysis.
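As an illustration of the temperature-REMD setup described above, the following minimal sketch builds the evenly spaced implicit-solvent ladder (300-600 K in 20 K steps) and evaluates the standard Metropolis swap criterion for temperature REMD. The criterion is the textbook form and is an assumption here, since the text does not spell out the exchange rule; the function names and the energies in the example are hypothetical. The varying-interval ladder used for explicit water is not reproduced.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def linear_ladder(t_min, t_max, step):
    """Evenly spaced replica temperatures, inclusive of both ends."""
    n = int(round((t_max - t_min) / step)) + 1
    return [t_min + i * step for i in range(n)]


def exchange_probability(e_i, e_j, t_i, t_j):
    """Textbook Metropolis probability of swapping two replicas.

    e_i, e_j: potential energies (kcal/mol) of the replicas currently
    running at temperatures t_i and t_j (K). Assumed form, not taken
    from the paper: min(1, exp[(beta_i - beta_j) * (e_i - e_j)]).
    """
    beta_i = 1.0 / (K_B * t_i)
    beta_j = 1.0 / (K_B * t_j)
    delta = (beta_i - beta_j) * (e_i - e_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)


# Implicit-solvent ladder from the text: 300-600 K at a 20 K interval.
temps = linear_ladder(300.0, 600.0, 20.0)
print(len(temps), temps[0], temps[-1])

# Hypothetical energies for two neighbouring replicas.
print(exchange_probability(-110.0, -100.0, temps[0], temps[1]))
```

The ladder length reproduces the 16 replicas quoted for the implicit model; the acceptance ratios reported in the text depend on the actual energy distributions and cannot be derived from this sketch.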

2.2 Structural analysis

We propose three molecular-structure analysis methods: circularity, biplanar angle, and one-dimensional conformation. The analysis codes were implemented in Python 2.7 with the scikit-learn machine-learning toolkit 0.15.0,31 scipy version 0.14.0, and numpy 1.8.1 for numerical and mathematical supporting subroutines.

2.2.1. Circularity (γc2)

The extended length of the CD14 oligosaccharide chain allows its configuration to become twisted and deformed. To quantify the degree of twist of CD14 configurations, circularity (γc) is defined as the ratio between the minor and major diameters in a deformed elliptical configuration of CD14 (see Figure 1). Circularity is computed in the following steps. First, the diameters of CD14 are calculated as the distances between two glycosidic-bond oxygen atoms (O4) on opposite sides of the CD14 ring (i.e., for 14 saccharide units, the O4 atom of the i-th saccharide and the O4 atom of the (i+7)-th saccharide). The circularity of CD14 is then defined as the maximum ratio between the shortest and the longest CD14 diameters:

$$\gamma_c = \max\left( \frac{d_i}{d_{\left(i + \frac{L}{2}\right) \bmod L}} \right) \qquad (1)$$

where L denotes the number of sugar molecules present in the LR-CD ring (i.e., L = 14 for CD14), di is the diameter between opposite O4 atoms, and mod denotes the modulo operation. Note that the value of γc lies in the range 0 < γc ≤ 1; when γc = 1, the LR-CD has a perfectly circular shape. Circularity decreases as the structure deforms into an elliptical shape.
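The circularity measure can be sketched in a few lines. The version below follows the prose definition (shortest opposite-O4 diameter divided by the longest) and should be read as one plausible reading rather than the authors' exact implementation of Eq. 1. The function names and the elliptical test ring are hypothetical, and the sketch assumes Python 3.8 or later for `math.dist`.

```python
import math


def opposite_diameters(o4_coords):
    """Distances between each O4 atom and the O4 atom half a ring away."""
    half = len(o4_coords) // 2
    return [math.dist(o4_coords[i], o4_coords[i + half]) for i in range(half)]


def circularity(o4_coords):
    """Shortest-to-longest diameter ratio (assumed reading of the text)."""
    d = opposite_diameters(o4_coords)
    return min(d) / max(d)


def ring(a, b, n=14):
    """Hypothetical planar test ring: n points on an ellipse with semi-axes a, b."""
    return [
        (a * math.cos(2 * math.pi * k / n), b * math.sin(2 * math.pi * k / n), 0.0)
        for k in range(n)
    ]


print(round(circularity(ring(1.0, 1.0)), 3))  # circular ring
print(round(circularity(ring(2.0, 1.0)), 3))  # elongated ring
```

For a 14-membered ring this yields the seven diameters between units i and i+7, so a perfectly circular arrangement gives a value of 1 and elongation lowers it.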


Figure 1. Schematic diagram depicting the ratio between the maximum and minimum diameters of an ellipse on a 2D plane.

2.2.2. Biplanar angle (θP)

Previous studies reported that CD14 conformations adopt a bent form,5-7 resulting in a folded biplanar configuration. In this type of structure, the degree of bending between the two internal planes significantly affects the ability of the LR-CD to include a drug molecule. Here, the biplanar angle, θP, is proposed to characterize the bending of internal molecular planes within LR-CD structures (see Figure 2). The calculation method can be summarized in three steps:

(I) Cluster the atoms in CD14 into two groups (X1 and X2) based on their coordinates.

(II) Find the internal planes P1 and P2, which are the most spatially correlated 2-dimensional planes of X1 and X2 embedded in 3-dimensional space.

(III) Calculate θP from the angle between the normal vectors of planes P1 and P2.

In step (I), the atoms of the CD14 conformation are clustered into two groups using the k-means clustering algorithm.27 Let ri be the spatial coordinate of the i-th atom in CD14. The k-means clustering algorithm decides whether ri belongs to X1 or X2 by minimizing the following objective function Z:

$$Z^{*} = \underset{X_1, X_2}{\operatorname{argmin}} \left( \sum_{\forall r' \in X_1} \mathrm{dist}(c_1, r') + \sum_{\forall r' \in X_2} \mathrm{dist}(c_2, r') \right) \qquad (2)$$

where dist(a, b) is the Euclidean distance between two points, and c1 and c2 denote the centroids of the atoms in X1 and X2, calculated from $c_i = \frac{1}{|X_i|} \sum_{\forall r' \in X_i} r'$. The mini-batch k-means clustering algorithm is used because of its highly efficient computation time.31 The parameters of the mini-batch k-means procedure are as follows: number of clusters (k) = 2, number of mini batches = 100, number of randomized initialization trials = 10, maximum clustering iterations = 300, and a stopping criterion that is met when the smoothed variance-normalized mean-center squared position changes by less than 10^-4. After the mini-batch k-means process, two sets of atomic coordinates, X1 and X2, are obtained.

In step (II), the most spatially correlated 2-dimensional plane Pi for the atoms in Xi is computed. Let $P_i = (\hat{p}_i^1, \hat{p}_i^2)$, where $\hat{p}_i^1 \perp \hat{p}_i^2$ are orthonormal basis vectors spanning plane Pi. To find Pi, we perform PCA on the atomic coordinates of all atoms in Xi; the two most correlated orthonormal basis vectors from PCA (i.e., principal components 1 and 2) are taken as $\hat{p}_i^1$ and $\hat{p}_i^2$ and span Pi. Note that the third orthonormal basis vector is the normal vector ($\hat{n}_i$) of plane Pi.

Finally, step (III) is achieved by calculating the angle between planes P1 and P2 from the cosine of the angle between their normal vectors:

$$\theta_P = \cos^{-1}(\hat{n}_1 \cdot \hat{n}_2) \qquad (3)$$
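The three steps of the biplanar-angle calculation can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' code: a plain Lloyd k-means with a deterministic farthest-point start stands in for the mini-batch k-means of scikit-learn, the plane normal is taken as the third right-singular vector of the centered coordinates (equivalent to the third principal axis), and the obtuse-angle convention max(θP, π − θP) adopted in the text is applied at the end. All function names and the hinged-sheet test geometry are hypothetical.

```python
import numpy as np


def two_means(coords, n_iter=100):
    """Plain Lloyd k-means with k = 2 (stand-in for mini-batch k-means)."""
    # Deterministic start: the point farthest from the centroid, then the
    # point farthest from that one.
    first = coords[np.linalg.norm(coords - coords.mean(axis=0), axis=1).argmax()]
    second = coords[np.linalg.norm(coords - first, axis=1).argmax()]
    centers = np.array([first, second])
    for _ in range(n_iter):
        dists = np.linalg.norm(coords[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        updated = np.array([coords[labels == k].mean(axis=0) for k in (0, 1)])
        if np.allclose(updated, centers):
            break
        centers = updated
    return labels


def plane_normal(coords):
    """Unit normal of the best-fit plane: the third principal axis."""
    centered = coords - coords.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[2]


def biplanar_angle(coords):
    """Angle in degrees between the two cluster planes, reported as >= 90."""
    labels = two_means(coords)
    n1 = plane_normal(coords[labels == 0])
    n2 = plane_normal(coords[labels == 1])
    theta = np.degrees(np.arccos(np.clip(np.dot(n1, n2), -1.0, 1.0)))
    return max(theta, 180.0 - theta)


def hinged_sheets(phi_deg):
    """Hypothetical test geometry: two flat 5 x 3 grids hinged along the y-axis."""
    phi = np.radians(phi_deg)
    flat = [(-x, y, 0.0) for x in range(1, 6) for y in (-1, 0, 1)]
    tilted = [(x * np.cos(phi), y, x * np.sin(phi))
              for x in range(1, 6) for y in (-1, 0, 1)]
    return np.array(flat + tilted, dtype=float)


print(biplanar_angle(hinged_sheets(40.0)))
```

The sketch assumes each cluster contains at least three non-collinear atoms, which holds for any realistic CD14 snapshot but is not checked here.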

Because the sign of a normal vector is arbitrary, two angles exist for θP < 180°, namely θP and π − θP. Here, we assume that θP > 90°, because steric repulsion would prevent P1 and P2 from coming too close to each other. Thus, we take the biplanar angle as θP = max(θP, π − θP) hereinafter.

2.2.3. Principal component analysis (PCA) on molecular coordinate and one-dimensional conformation (ODC)

The twisted and folded conformations of CD14 lead to a stretched and directional distribution of CD14 atoms. Therefore, we employed PCA on all atom positions in each CD14 molecule to analyze the degree of directional distribution of the CD14 molecule. Here, the variance fractions of the three principal components from PCA (fPC1, fPC2, and fPC3, where fPC1 > fPC2 > fPC3 and fPC1 + fPC2 + fPC3 = 1.0) characterize the degree of atom distribution in molecular coordinate space. In particular, a CD14 configuration in which fPC1 >>> (fPC2 + fPC3) has its atoms distributed predominantly in one direction. Thus, we define a CD14 structure as a one-dimensional conformation (ODC) if its fPC1 > 0.7. See the example of an ODC CD14 in Figure 2(b).
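The ODC criterion can be illustrated with a short NumPy sketch. It assumes that fPC1, fPC2, and fPC3 are the fractions of positional variance along the three principal axes, which is consistent with their summing to 1.0 but is not stated explicitly in the text. The function names and the two test geometries are hypothetical.

```python
import numpy as np


def pc_fractions(coords):
    """Fractions of variance along the three principal axes, largest first."""
    centered = coords - coords.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals = np.linalg.eigvalsh(cov)[::-1]  # eigvalsh returns ascending order
    return eigvals / eigvals.sum()


def is_odc(coords, threshold=0.7):
    """True when the leading fraction exceeds the ODC threshold from the text."""
    return bool(pc_fractions(coords)[0] > threshold)


# Hypothetical geometries: a stretched, rod-like set and a flat circular ring.
rod = np.array([[i, 0.1 * (i % 2), 0.0] for i in range(20)], dtype=float)
angles = 2 * np.pi * np.arange(14) / 14
circle = np.column_stack([np.cos(angles), np.sin(angles), np.zeros(14)])

print(is_odc(rod), is_odc(circle))
```

A flat circular ring splits its variance evenly between two axes, so it stays well below the 0.7 threshold, while a strongly elongated structure exceeds it.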

Figure 2. (a) Schematic diagram illustrating the internal plane Pi for each cluster of atoms. The biplanar angle is denoted as the angle between the two normal vectors ni. (b) Schematic diagram showing the principal component analysis (PCA) in molecular coordinate space of a one-dimensional conformation (ODC) of CD14.

2.3 Gaussian mixture models (GMMs) clustering algorithm

A clustering algorithm is an unsupervised learning algorithm that classifies a dataset into multiple classes based on the input parameters. Gaussian mixture model (GMM) clustering is an N-class clustering algorithm based on a set of GMMs {λ1, λ2, …, λN}, where each GMM is a D-dimensional weighted sum of M-component normal densities. The mixture density of the n-th GMM is defined as:

$$p(\vec{x} \mid \lambda_n) = \sum_{i=1}^{M} w_i^n \, p_i^n(\vec{x}) \qquad (4)$$

where M is the number of GMM components and $w_i^n$ denotes the mixture weight of the i-th unimodal normal density function $p_i^n(\vec{x})$ for the n-th GMM λn. Here, $p_i^n(\vec{x})$ is defined as a D-dimensional Gaussian function centered at the D-dimensional mean vector $\vec{\mu}_i^n$ with a D × D covariance matrix $C_i^n$:

$$p_i^n(\vec{x}) = \frac{1}{\sqrt{(2\pi)^D \left| C_i^n \right|}} \exp\left( -\frac{1}{2} (\vec{x} - \vec{\mu}_i^n)^T (C_i^n)^{-1} (\vec{x} - \vec{\mu}_i^n) \right) \qquad (5)$$

Thus, a complete description of each GMM is defined as follows:

$$\lambda_n = \{ w_i^n, \vec{\mu}_i^n, C_i^n \mid i = 1, \ldots, M \} \qquad (6)$$
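Equations 4 and 5 can be evaluated directly, as in the following minimal NumPy sketch. It only computes the density of a mixture whose parameters are already known; fitting the parameters by expectation-maximization is not shown. The function names and the example parameters are hypothetical.

```python
import numpy as np


def gaussian_density(x, mean, cov):
    """D-dimensional normal density of Eq. 5 evaluated at point x."""
    d = len(mean)
    diff = x - mean
    norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(cov))
    # solve(cov, diff) applies the inverse covariance without forming it.
    return float(np.exp(-0.5 * diff @ np.linalg.solve(cov, diff)) / norm)


def mixture_density(x, weights, means, covs):
    """Weighted sum of component densities, as in Eq. 4."""
    return sum(w * gaussian_density(x, m, c)
               for w, m, c in zip(weights, means, covs))


# Hypothetical two-component mixture in two dimensions.
weights = [0.3, 0.7]
means = [np.array([0.0, 0.0]), np.array([3.0, 0.0])]
covs = [np.eye(2), np.eye(2)]
print(mixture_density(np.array([0.0, 0.0]), weights, means, covs))
```

In practice the mixture weights are expected to sum to one so that the mixture remains a proper probability density.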

Note that GMM clustering utilizes an expectation–maximization algorithm32 to determine the most suitable N classes for the dataset.

In this work, the GMM clustering algorithm is applied to classify CD14 conformations in both the implicit and explicit water solvation models. Here, GMM is applied to a set of 21-dimensional feature vectors Y = [d1,…,d7, α1,…,α14], one for each CD14 structure. The first seven elements, di, are the diameters. Since CD14 is cyclic, a rotation-invariant ordering of the diameters must be used for every snapshot. This is addressed by defining d1 as the longest diameter and cycling the order from d2 to d7 in either the clockwise or counter-clockwise direction from d1, following the shortest-diameter-last direction (i.e. the shortest diameter is d4). To characterize the opening angle of CD14 with a simple attribute, the last fourteen elements of Y are the O4-centroid-O4 angles (αj), starting from the O4 oxygen of the longest diameter and rotating clockwise/counter-clockwise via the shortest-diameter-last algorithm. For brevity, let CkN denote a resulting class of CD14 structures from GMM clustering, where N is the number of clusters specified in the GMM and k denotes the k-th class in descending order of population size.
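The ordering of the diameter features can be illustrated with a small sketch. It rests on an assumed reading of the "shortest-diameter-last" rule: the cyclic list is rotated so that the longest diameter comes first, and of the two traversal directions the one in which the shortest diameter appears later is kept. The text does not define the rule precisely, so this is a hypothetical interpretation; the function name and example values are also hypothetical, and only the diameter part of Y is covered.

```python
def canonical_order(diameters):
    """Rotation- and direction-independent ordering of a cyclic list.

    Assumed rule: start at the longest value, then walk in the direction
    that reaches the shortest value later.
    """
    n = len(diameters)
    start = max(range(n), key=lambda i: diameters[i])
    forward = [diameters[(start + k) % n] for k in range(n)]
    backward = [diameters[(start - k) % n] for k in range(n)]
    shortest = min(diameters)
    if forward.index(shortest) >= backward.index(shortest):
        return forward
    return backward


# Hypothetical diameters (arbitrary units) for one snapshot.
d = [3.0, 9.0, 2.0, 5.0, 6.0, 4.0, 7.0]
print(canonical_order(d))
```

With distinct values, any cyclic shift or reversal of the input yields the same output, which is the property needed before snapshots are compared or clustered.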

3. Results and discussion

3.1 Conformation analysis results

To evaluate the effectiveness of the proposed measurement methods, the results of the circularity, biplanar angle, and ODC analyses applied to CD14 conformations from both implicit and explicit solvation simulations are presented in Sections 3.1.1, 3.1.2, and 3.1.3, respectively.

3.1.1 Circularity

In this section, CD14 conformational diversity is analyzed in terms of circularity, γc2. Figure 3 shows superimpositions of CD14 snapshots grouped by their measured γc2. We found three major groups of conformations based on the circularity: (1) 8-shape/twisted shape; (2) elliptical; and (3) circular. For γc2 < 0.2 (see Figures 3(a)-(b)), CD14 forms an 8-shaped conformation, in which two sides of the ring interact with each other near the center of the ring. In this conformation, two smaller cavities are present instead of one large cavity. These cavities are observed aligned in planes either parallel or perpendicular to each other. For 0.2 < γc2 < 0.7 (Figures 3(c)-(g)), CD14 forms an elliptical shape. Within this elliptical conformation, more band flips are observed for γc2 = 0.2 - 0.4 than for γc2 = 0.4 - 0.7. The last major CD14 conformation is almost circular and occurs when γc2 > 0.7 (see Figures 3(h)-(j)).
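The three circularity ranges reported above translate into a simple lookup, sketched below. The text leaves the boundary values 0.2 and 0.7 unassigned; placing them in the elliptical group is an assumption made here for illustration, and the function name and labels are hypothetical.

```python
def circularity_class(gamma):
    """Map a circularity value to the three groups described in the text."""
    if not 0.0 < gamma <= 1.0:
        raise ValueError("circularity must lie in (0, 1]")
    if gamma < 0.2:
        return "8-shape/twisted"
    if gamma <= 0.7:  # boundary values assigned here by assumption
        return "elliptical"
    return "circular"


# The average circularities quoted in the abstract fall into different groups.
print(circularity_class(0.16), circularity_class(0.39))
```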


Conformation


Sample snapshots (a)

(b)

0.0< γ c2