Binding Site Preorganization and Ligand Discrimination in the Purine

Jul 11, 2014 - Binding Site Preorganization and Ligand Discrimination in the Purine .... of the CRISPR/Cas9 gene-editing technology, convened a meetin...
0 downloads 0 Views 1MB Size
Subscriber access provided by DALHOUSIE UNIV

Article

Binding Site Preorganization and Ligand Discrimination in the Purine Riboswitch Johan Sund, Christoffer Lind, and Johan Aqvist J. Phys. Chem. B, Just Accepted Manuscript • DOI: 10.1021/jp5052358 • Publication Date (Web): 11 Jul 2014 Downloaded from http://pubs.acs.org on July 11, 2014

Just Accepted “Just Accepted” manuscripts have been peer-reviewed and accepted for publication. They are posted online prior to technical editing, formatting for publication and author proofing. The American Chemical Society provides “Just Accepted” as a free service to the research community to expedite the dissemination of scientific material as soon as possible after acceptance. “Just Accepted” manuscripts appear in full in PDF format accompanied by an HTML abstract. “Just Accepted” manuscripts have been fully peer reviewed, but should not be considered the official version of record. They are accessible to all readers and citable by the Digital Object Identifier (DOI®). “Just Accepted” is an optional service offered to authors. Therefore, the “Just Accepted” Web site may not include all articles that will be published in the journal. After a manuscript is technically edited and formatted, it will be removed from the “Just Accepted” Web site and published as an ASAP article. Note that technical editing may introduce minor changes to the manuscript text and/or graphics which could affect content, and all legal disclaimers and ethical guidelines that apply to the journal pertain. ACS cannot be held responsible for errors or consequences arising from the use of information contained in these “Just Accepted” manuscripts.

The Journal of Physical Chemistry B is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.

Page 1 of 34

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

Binding Site Preorganization and Ligand Discrimination in the Purine Riboswitch

Johan Sund†, Christoffer Lind† and Johan Åqvist* Department of Cell and Molecular Biology, Uppsala University, Biomedical Center, Box 596, SE–751 24 Uppsala, Sweden

†These authors contributed equally to this work.

* Corresponding author: E-mail: [email protected]. Phone: +46 18 471 4109 Fax: +46 18 530396

1

ACS Paragon Plus Environment

The Journal of Physical Chemistry

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 2 of 34

Abstract The progress of RNA research has suggested a wide variety of RNA molecules as possible targets for pharmaceutical drug molecules. Structure-based computational methods for predicting binding modes and affinities are now important tools in drug discovery, but these methods have mainly been focused on protein targets. Here we employ molecular dynamics free energy perturbation calculations and the linear interaction energy method to analyze the energetics of ligand binding to purine riboswitches. Calculations are carried out for 14 different purine complexes with the guanine and adenine riboswitches in order to examine their ligand recognition principles. The simulations yield binding affinities in good agreement with experimental data and rationalize the selectivity of the riboswitches for different ligands. In particular, it is found that these receptors have an unusually high degree of electrostatic preorganization for their cognate ligands and this effect is further quantified by explicit free energy calculations, which show that the standard electrostatic linear interaction energy parametrization is suboptimal in this case. The adenine riboswitch specifically uses the electrostatic preorganization to discriminate against guanine by preventing the formation of a G-U wobble base pair.

Keywords: molecular dynamics simulation; free energy perturbation; linear response approximation; linear interaction energy

2

ACS Paragon Plus Environment

Page 3 of 34

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

Introduction In recent years it has become evident that RNA is much more versatile than posited by the traditional view, where its main functions were associated with mRNAs, tRNAs and rRNAs. A particularly interesting type of RNA molecules has been named riboswitches as they have the ability to switch gene expression at special physiological conditions, for example at different concentrations of small molecules. The purine riboswitches were first identified in 20031 when conserved sequences in the 5’ untranslated region upstream of genes participating in purine metabolism were investigated. A closer look at a sequence in the 5’ untranslated region of the xpt-pbuX operon in Bacillus subtilis, which encodes for xanthine phosphoribosyltransferase and a xanthine-specific permease, showed that it had the ability to bind to guanine with high affinity but not to adenine and several other purines. It was concluded that the sequence is a guanine dependent riboswitch. In a subsequent study the same authors reported one sequence found in Bacillus subtilis, associated with the ydhL gene that encodes for a putative purine efflux pump, which instead showed high affinity for adenine and discriminated against guanine.2 The changed specificity was concluded to originate from one single mutation where cytosine 74 (C74) in the guanine riboswitch (GR) corresponded to a uracil (U74).

The purine riboswitches act by changing the fold of mRNA upon ligand binding in ways that change the transcription of the genes they control, but whereas the xpt guanine riboswitch (GR) turns off transcription upon guanine binding, the ydhL adenine riboswitch turns it on in complex with adenine. Mechanisms that regulate translation initiation rather than transcription have also been found for an adenine riboswitch associated with the add gene encoding for adenine deaminase, where adenine binding changes the fold of the mRNA to a structure that enables the Shine-Dalgarno sequence to bind to the ribosome.3 The structural explanation for the transcription interference given by Mandal et al.2 was a rearrangement of the mRNA upon ligand binding that created a terminator stem for GR and 3

ACS Paragon Plus Environment

The Journal of Physical Chemistry

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 4 of 34

removed the terminator stem for the adenine riboswitch. The mRNA rearrangement was proposed to create a pocket that was in contact with all functional groups of the purine and able to form a WatsonCrick base pair to C74 for guanine and the corresponding U74 for the adenine riboswitch. This behaviour was proven be correct when the first crystal structures of the ligand binding domain of the purine riboswitches were published.3,4 The ligand was almost completely sequestered by the mRNA contacts (Figure 1), with all hydrogen bonds satisfied, which suggested an induced fit mechanism of binding. The binding pocket is thus characterized by extensive hydrogen bonding between the riboswitch and the ligand (Figure 2). Here, besides the key bases C74 and U74 in the guanine and adenine riboswitches, respectively, the 2’-OH group of U22 interacts with the Hoogsteen edge of the ligand and while U51 interacts with the sugar edge (Figure 2).

Several crystal structures of the purine binding region of the xpt-pbuX operon GR in Bacillus subtilis and the C74U mutant (GRA) in complex with different ligands have been solved and published together with binding data.5,6 This makes it an ideal test system for molecular modelling and structure based RNA-ligand binding prediction tools. The purine riboswitch has previously been investigated by molecular dynamics (MD) simulations7-12 and molecular docking13-15. The program DOCK was able to separate binders from non-binders and identify new ligands by virtual screening, even though the correlation between predicted and observed binding affinities was not high.15 However, no quantitative free energy simulations of binding energetics have yet been reported. In computational ligand design, in general, the focus has mainly been on predicting the affinity of small organic molecules to the active sites of proteins. The development of tools for prediction of ligand binding affinities for RNA systems has thus been lagging behind (reviewed in Refs. 16,17) although attempts with docking, scoring and free energy calculations have been made.18-20 The progress of RNA research opens up a new field of targets for drug-like molecules and suitable tools for ligand binding prediction are therefore needed.

4

ACS Paragon Plus Environment

Page 5 of 34

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

Here, we analyze the binding energetics of 14 different purine complexes with the GRA and GR riboswitches with molecular dynamics (MD) free energy simulations, utilizing both the rigorous free energy perturbation (FEP) and approximate linear interaction energy (LIE) techniques. The complexes considered are purine (PUR) and its analogs adenine (ADE), 6-chloroguanine (ClG), 6-bromoguanine (BrG), 2,6-diaminopurine (DAP), 2-aminopurine (2AP), 2-fluoroadenine (2FA), guanine (GUA) and hypoxanthine (HYP) binding to GRA and the analogs GUA, HYP, DAP and 2AP binding to GR (Figure 2). The results show that the riboswitches have an unusually high degree of electrostatic preorganization for their ligands which is essential to their functional recognition and discrimination of different purine analogs.

Methods Molecular dynamics simulations were performed with the program Q21 using the Amber99_bsc0 force field22-24 for RNA residues. Generalized Amber force field parameters23 for the ligands were extracted from the Antechamber program in the Ambertools 1.0 package,25 but angles and torsions for planar amines were taken from Amber 99. RESP charges26 for all ligands were generated using Gaussian 09 optimizations27 at the HF/6-31G* level and Antechamber. The initial RNA coordinates for all simulations were taken from the high resolution crystal structure of GRA in complex with a triazolotriazole-diamine ligand15 (PDB code 2XNW). The GR simulations were started from the same structure with residue 74 changed from U to C. The starting structures of the ligands were based on the crystal structures 1U8D,4 1Y26,3 and 3FO4.6

Simulations were performed for all ligands with SCAAS spherical boundary conditions21,28 with a 24 Å radius water sphere centered on the C5 atom of DAP (Figure 1). The outer 9 Å of the sphere was neutralized by scaling down the partial charges of the phosphate groups of the nucleotides. This part of 5

ACS Paragon Plus Environment

The Journal of Physical Chemistry

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 6 of 34

the RNA was also restrained to its initial positions with harmonic restraints during the simulations. Two Mg2+ ions close to the ligand were added at sites corresponding to one Co(NH3)63+ and one K+ ion in the crystal structure. All water molecules in the crystal structure, except one that clashed with many of the ligands, were kept and additional water molecules were added from a grid to give a sphere solvated with TIP3P water molecules.29 Non-bonded interactions except those involving the ligands were truncated using cutoffs. A cutoff radius of 12 Å was used for RNA-water and RNA-RNA interactions while 10 Å was used for water-water interactions. Electrostatic interactions beyond these cutoffs were treated with the local reaction field method.30 The systems were equilibrated for 0.5 ns after stepwise heating from 1 to 303.15 K with restraints on the solute that were gradually decreased.

The FEP scheme is based on Zwanzig’s formula31 which can be used to calculate the free energy change associated with the transformation of a molecule A into another molecule B in n discrete steps.32 Disregarding the completely negligible p∆V difference between Helmholtz and Gibbs free energies the total free energy change for such a transformation is given by

n −1

∆GA→ B = − RT ∑ ln e − (Ui+1 −Ui ) / RT i =1

i

(1)

where R is the gas constant, T is the temperature and Ui represents the potential energy of step i and the brackets denote configurational averaging by MD. The steps are defined in terms of different linear combinations between the initial (A) and final (B) states

Ui = (1 − λi )U A + λiU B

where the coupling parameter λ is successively varied from 0 to 1 in discrete increments. By 6

ACS Paragon Plus Environment

(2)

Page 7 of 34

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

performing one such transformation in water and one in complex with the receptor, the relative binding free energy (∆∆Gbind) between molecule A and B be calculated via a standard thermodynamic cycle as

B A ∆Gbind − ∆Gbind = ∆∆Gbind = ∆GAcomplex − ∆GAwater →B →B

(3)

The free energy calculations were performed with 57 windows (λ-steps) each with 40 ps of sampling. The simulation temperature at data collection was 303.15 K and the MD time step 1 fs. SHAKE constraints33 were only used for bonds and angles of the water molecules and softcore potentials34 were employed during the transformations to prevent singularities. Corresponding simulations of the free ligands were performed in a 19 Å sphere of water, but with 30 ps of simulation at each λ-step. The ligands were then restrained to their center of mass during all simulations in solution. The following transformations were performed in GRA: DAP → ADE, DAP → ClG, DAP → 2AP, DAP → 2FA, DAP → GUA, ADE → PUR, ADE → HYP and ClG → BrG. In GR the transformations GUA → DAP, GUA → HYP, GUA → 2AP and HYP → ADE were carried out. These transformations were selected in order to give as small differences as possible between the end states. The free energy calculations were repeated seven times with different initial velocities. In addition, 2 ns MD simulations for LIE data collection were also started from the end states of the FEP calculations, for each of the seven replicas. To further investigate electrostatic contributions to binding, FEP calculations where all electrostatic interactions between the ligand and its surrounding were gradually turned off were also performed for all ligands both in water and in the complex. These free energy perturbation runs were performed in four replicas with 51 λ-steps of 40 ps of sampling at each window.

Besides rigorous FEP and thermodynamic integration (TI) calculations, a useful tool for prediction of the free energy of binding for a ligand to an active site is the linear interaction energy method.35 It uses

7

ACS Paragon Plus Environment

The Journal of Physical Chemistry

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 8 of 34

two MD simulations, one of the solvated ligand-receptor complex and one of the free ligand in water, el to calculated the difference (∆) in van der Waals ( U lvdW − s ) and electrostatic ( U l − s ) interaction energies

between the ligand and the surrounding environment (subscript l-s). These energy differences are then scaled with appropriate coefficients, α and b, for the non-polar and polar contributions, respectively, and a system dependent constant term, γ, is also added to set the absolute free energy scale

∆Gbind = α∆ U lvdW + b∆ U lel− s + γ −s

(4)

The standard parameterization36 with α = 0.18 and b = 0.33-0.5 depending on the character of the ligand has successfully been used for predicting the affinity of drug like molecules to protein targets32 and also for the prediction of the affinity between tRNA and mRNA codons on the ribosome.37

The b parameter was originally set to ½ in accordance with the linear response approximation (LRA)30,38 which represents the electrostatic part of the solvation free energy associated with turning on the electrostatic interactions of a molecule in a given environment. Such a calculation requires molecular dynamics simulations of two states, one with the electrostatic interactions between the (solute) molecule and its surroundings turned on and one with these interactions turned off. LRA then predicts the corresponding free energy difference to be given by

LRA ∆G = el

where U lel− s

off

{

1 U lel− s 2

off

+ U lel− s

on

}

(5)

denotes the average electrostatic energy that would have been obtained for a trajectory

sampled on the Off-state, had the interactions instead been turned on. LRA is thus a two-state 8

ACS Paragon Plus Environment

Page 9 of 34

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

approximation to rigorous multistate free energy calculations like TI or FEP.32 In LIE the Off-term is approximated to zero. This approximation is generally valid for a liquid environment where dipoles will be randomly distributed around a solute molecule simulated with charges turned off. In a protein active site where residues may be preorganized for the ligand this approximation is less obvious and, although it has been found to be reasonable for a number of test cases with drug like molecules,39 exceptions have also been observed.39,40 The value of the LIE parameter b has been brought to attention in a number of studies. In a work examining the validity of LRA in polar solvents38 it was found that

b = 0.5 is valid for ions but gives an overestimation of the free energy for neutral molecules and especially for compounds with hydroxyl groups that can form hydrogen bonded networks with the solvent. This led to the development of ligand dependent schemes for selecting b-values in better agreement with rigorous FEP calculations.36,41

9

ACS Paragon Plus Environment

The Journal of Physical Chemistry

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 10 of 34

Results and Discussion The binding between the GR and GRA riboswitches and different purine analogs have been extensively studied by X-ray crystallography and isothermal titration calorimetry.5,6 All ligands included in this study are very similar in size and differ only by one to two heavy atoms, which are small enough changes for relative binding free energies to be obtained by MD/FEP simulations with high precision. From our twelve FEP calculations for selected ligand pairs in the two riboswitches we can obtain 46 relative binding free energies (∆∆Gbind), which are given in Figure 3A together with the corresponding experimental values, also obtained at 30° C. The calculations yield a reasonable overall mean unsigned error (MUE) of 1.46 kcal/mol for the 27 ligand pairs for which experimental data exists. Here, the twelve original underlying calculations, eight for GRA and four for GR, yielded MUEs of 1.08 kcal/mol and 1.93 kcal/mol, respectively. The results are also illustrated in Figure 3B where the calculated vs. experimental values are plotted. As can be seen, a distinct feature of these free energy calculations is that there appears to be a systematic underestimation of ∆∆Gbind by about 2 kcal/mol for about half of the compound pairs examined.

Interestingly, when examining which pairs fall above the perfect correlation line in Figure 3B, it turns out that these precisely correspond to pairs where one ligand was measured in Ref. 5 and the other in Ref. 6. In contrast, the calculated relative binding free energies between ligands measured in the same experiment (either Ref. 5 or Ref. 6) generally agree very well with the experimental data. It can also be noted here that no reference compound was measured in both of the isothermal calorimetry experiments. The only exception to the above pattern are the ∆∆Gbind values associated with purine (PUR) in GRA (Figure 3B). This is both the smallest and the least hydrophilic compound in the dataset and no crystal structure of PUR or any relevant analog in complex with GRA exists, which makes its binding mode rather uncertain. It can further be noted that PUR showed a ~1 kcal/mol higher affinity to 10

ACS Paragon Plus Environment

Page 11 of 34

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

the very similar adenine riboswitch binding site in earlier studies,2 which would reduce the discrepancy with our calculated values. In order to examine the possibility that our simulation protocol might be the cause of the problem, we carried out a number of different checks. These included running supplementary MD/FEP simulations of length 20 ns (Supporting Information Figure S1), performing additional calculations with the Charmm22 force field42,43 and neutralizing the system by scaling down phosphate charges for selected pairs of ligand complexes. However, neither of these changes had any significant effect on the overall results in Figure 3. Therefore we conclude that the most likely source of the systematic underestimation of part of the ∆∆Gbind values is to be found in experimental uncertainties as discussed above.

Three compounds that according to Gilbert et al. experimentally show no binding5,6 have also been analyzed, namely ADE binding to GR and GUA and HYP binding to GRA. These compounds are of biological interest as they all are natural metabolites that the riboswitch has to be able to discriminate against. From Figure 3A it is evident that GUA and HYP are predicted to bind worse than the highest affinity ligand (DAP) to GRA by as much as 5.4 and 7.4 kcal/mol, respectively. Likewise, the discrimination between GUA and ADE in GR is predicted to be 6.4 kcal/mol (Figure 3A). That is, the discrimination against the experimentally undetectable non-binders is of the order of 10000 or more. Hence, we can conclude that the MD/FEP calculations are indeed able to reliably sort binders from non-binders with closely related structures.

The overall binding modes and interaction patterns are very conserved between the ligands and the average MD structures are in excellent agreement with the known crystal structures of GUA, HYP and 2AP in GR (PDB codes 1Y27,3 1U8D4 and 3G4M)6, DAP, ClG and 2FA in GRA (PDB codes 2B57,5 3FO46 and 3GOT)6 and ADE in the add A riboswitch (PDB code 1Y26)3. Comparing the average MD structures to the crystal structures yields an average ligand RMSD of only 0.4 Å for these complexes. 11

ACS Paragon Plus Environment

The Journal of Physical Chemistry

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 12 of 34

As noted earlier,6 hydrogen bonds play a completely dominant role in defining the binding modes. The ligands are stabilized mainly by interactions with three nucleotides, namely U74 in GRA , corresponding to C74 in GR, and the conserved U22 and U51 nucleotides (Figure 2). Here, U51 provides hydrogen bonding with its base, while U22 donates a hydrogen bond from its 2’-hydroxyl group. As noted above, the single base change between C74 (GR) and U74 (GRA) determines the riboswitch specificity through Watson-Crick base-pairing with its cognate ligand.

A general feature of all ligand complexes lacking the 2-amino substituent is that a conserved water molecule, seen in the highest resolution crystal structures,4 moves in to bridge the (otherwise repulsive) interaction between the carbonyl groups of U/C74 and U51 (Figure 2). In complexes that retain the 2amino substituent, it is this group that bridges the above interaction and both the calculations and experimental data show that this amino group contributes 2-3 kcal/mol to the binding affinity. Similarly, in complexes lacking a substituent at C6 (PUR and 2AP) two other conserved water molecules4 also able to bridge the interaction between O4/N4 in U/C74 and the 2’-oxygen of U22 (Figure 2). The calculations indicate that the correct C6 substituent (6-amino in GRA and O6 in GR ligands) is worth 1-2 kcal/mol in terms of binding affinity. The largest disruption of the GRA riboswitch structure is, as expected, seen for the complexes with the GUA and HYP non-binders. In these cases U74 has to shift its position (upwards in Figure 2) to avoid the repulsion between protonated ring nitrogens on both the base and ligand. Conversely in GR, the C74 base has to shift its position (downwards in Figure 2) in the complexes with the non-cognate DAP and 2AP in order to only lose one hydrogen bond (and similarly for ADE).

To further analyze the energetics of ligand binding to the GR and GRA riboswitches we carried out additional MD simulations from the end-points of the FEP calculations and employed the linear interaction energy (LIE) method35,44 to estimate absolute binding free energies for all complexes. The 12

ACS Paragon Plus Environment

Page 13 of 34

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

standard parametrization of this method36 has proven reliable for a number of different systems including ligand binding to numerous enzymes,45 ion channels46 and even codon reading on the ribosome.37 A recent study also demonstrated the validity of this model for a large and very diverse set of drugs and inhibitors of the COX-1 enzyme.47 In the present case, the standard parametrization of eq. 4 corresponds to α = 0.18 and b = 0.43 for all riboswitch ligands, while the parameter γ is optimized (separately for GR and GRA) to fix the absolute binding free energy scale. The results of these LIE calculations in terms of absolute binding free energies are shown in Figure 4A with the detailed energetics given in Tables 1 and 2. As is immediately evident from Figure 4A the standard LIE model gives rather uniform binding affinities with poor predictivity, even though the MUEs are small (1.06 kcal/mol and 1.39 kcal/mol for GRA and GR, respectively). From Tables 1 and 2 it can also be seen that while GUA and ADE are indeed predicted to be the worst binders to GRA and GR, respectively, the discrimination against these compounds is too small, and even more so for HYP in GRA. Furthermore, reoptimization of α and b in eq. 4 does not alleviate this problem, which clearly indicates that something is fundamentally different with the riboswitch binding sites compared to receptor-ligand systems that have been analyzed earlier with this method.

The standard LIE parameterization assumes that the electrostatic response to the ligand is similar in water and the receptor complex. The value b = 0.43 for the compounds studied here, derived from FEP calculations in water,38 reflects a minor deviation from the linear response approximation which strictly predicts b = ½ . However, if the electrostatic response cannot be represented by a single b value, that is if it differs significantly between the bound and free states of the ligand, then eq. 4 is not valid. This possibility has been examined before by generalizing the LIE equation so that the receptor and solvent electrostatic response are treated by two separate parameters36

13

ACS Paragon Plus Environment

The Journal of Physical Chemistry

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

∆Gbind = + b complex U lel− s a∆ U lvdW −s

bound

− b water U lel− s

free

Page 14 of 34



(6)

However, the conclusion from earlier studies where this type of model has been analyzed was that

bcomplex usually is very close to bwater and that the use of a single b-value is indeed valid.36,48,49

In view of the seemingly peculiar behaviour of the riboswitches we decided to explicitly calculate the values of bcomplex and bwater for all compounds. This was done as reported earlier38,41 by MD/FEP simulations where the electrostatic intermolecular interactions between the ligand and its surroundings are successively turned on in the two different environments (i.e., akin to charging the compounds except that the intramolecular interactions are always present). These FEP simulations are thus carried out such that λ=1 in eq. 2 corresponds to a simulation on the fully charged potential (UB) and λ=0 corresponds to the potential without ligand-surrounding electrostatic interactions (UA). The corresponding electrostatic free energy terms then define the b parameters in eq. 6 for each ligand through

∆GelFEP = β U lel− s

l =1

(7)

where the resulting b values may be different in the complex (bcomplex) and in water (bwater). The results from these calculations are given in Tables 1 and 2 in terms of ∆GelFEP , bcomplex and bwater for all compounds studied. It is immediately evident that while the bwater values are as expected uniform and around 0.40, the bcomplex values are considerably higher and more variable with an average for the measured binders of about 0.63. This behavior thus explains the poor predictivity of the standard LIE model. Furthermore, since the bcomplex values are larger than ½ for all binders this indicates that the 14

ACS Paragon Plus Environment

Page 15 of 34

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

U lel− s

The Journal of Physical Chemistry

off

term in eq. 5 contributes significantly to ∆GelFEP , which is in fact the definition of a highly

preorganized binding site.50,51

In order to further illustrate to what extent electrostatic linear response is actually valid in the different environments (receptor binding site and water) it is useful to plot the derivative of the free energy versus the electrostatic charging parameter λ. This derivative depends implicitly on λ and is for every value of λ equal to the average energy gap between the end-point potentials, evaluated on the particular intermediate potential defined by eq. 2.32 It is also the quantity used in the thermodynamic integration formulation of the free energy change (again neglecting the difference between Helmholtz and Gibbs free energies)

1



∆GelTI =

0

where U lel− s

l

1

U B − U A l d ll = ∫ U lel− s d

(8)

l

0

denotes the average of the full intermolecular electrostatic potential (corresponding to

UB), but evaluated for a particular value of λ where only a fraction of it is turned on. The FEP data can thus also be used for TI calculations and the free energy change will then be the area above the U lel− s curve (since it is negative). According to the two-state LRA model (eq. 5) the area above the curve is approximated using a straight line as

1

∫ 0

U lel− s d l ≈ l

(

1 U lel− s 2

0

) 12 ( U

+ U lel− s = 1

el l − s off

while the “one-state” LIE model approximates the above equation by 15

ACS Paragon Plus Environment

+ U lel− s

on

)

(9)

l

The Journal of Physical Chemistry

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

1



U lel− s d lβ ≈ U lel− s l

Page 16 of 34

(10)

1

0

It should again be noted that the value of b has been shown to depend on the chemical nature of the molecule and this dependency can both reflect the failure of LRA (i.e., the true scaling factor differs from ½ in eq. 8) and the fact that U lel− s

0

may not be exactly zero even in water.38 For LIE binding

calculations, it may further be noted that even if U lel− s

0

is not zero in the receptor binding site, but a

constant term or trivially size-dependent for a series of ligands, it can still be taken care of by the α and γ parameters which represent such terms.52

In Figure 5A,B two examples of such TI plots for the charging of a K+-ion and the hydroxyl rich galactose molecule in water are shown together with straight lines corresponding to the LRA and LIE approximations to illustrate the behavior of different compounds. It can clearly be seen that a straight line approximation is valid for the ion, but that a lower slope (smaller β) is needed to be able to approximate the galactose curve with a linear regression. Hence, the LRA approximation is not accurate for this type of compound in water.38 Both plots in Figure 5A,B show that U lel− s

0

is close to

zero and the Off-term does thus not contribute to the free energy. Figure 5C,D show the corresponding plots for DAP and GUA in water. The behavior here is somewhere between K+ and galactose with no Off-term and a curvature that requires a b lower than ½, in agreement with the standard parameterization.36 Again, it can be seen that LRA is not particularly accurate for these compounds in water. When the charging of DAP and GUA is examined in complex with GRA (Figure 6A,B) the plots show a distinctly different behavior. Both curves are well approximated by straight lines, indicating the

16

ACS Paragon Plus Environment

Page 17 of 34

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

validity of linear response. However, whereas the GUA curve crosses the y-axis close to the origin,

U lel− s

0

(the Off-term in eq. 5) for DAP is clearly negative indicating a significant electrostatic

preorganization effect. That is, the residual negative value of U lel− s when the electrostatic interactions 0

are turned off (λ=0) reflects a persistent non-randomized orientation of the ligand dipoles with respect to those of the surroundings, i.e., the riboswitch and water molecules. This clearly shows, not only that the binding site is highly structured (preorganized) for the ligand, but also that the ligand cannot relax and sample randomized orientations in the electrostatically decoupled state. Furthermore, it explicitly demonstrates the origin of the high bcomplex = 0.61 for DAP and also shows that the discrimination against GUA, for which bcomplex = 0.53, is mainly caused by the preorganization effect. This is also evidenced by the average structures shown in Figure 6, where it can be seen that the binding modes corresponding to λ=0 (electrostatic interactions turned off) for DAP and GUA (Figure 6E,F) are essentially identical to that of the “normal” interacting DAP (Figure 6C). In the case of GUA, however, the riboswitch has to relax significantly when its interactions are turned on (Figure 6D).

With the explicitly calculated values of bcomplex and bwater for all compounds it is, of course, possible to insert these into a more accurate version of the LIE equation, corresponding to eq. 6 but with each ligand having its specific (and exact) bcomplex and bwater parameters. Figure 4B shows the results of using this approach with α = 0.18 as usual, so that the only free parameter is the constant term γ used to set the absolute free energy scale. The performance of this LIE model (denoted LIE-FEP in Tables 1 and 2) is thus rather good and the MUE for the absolute binding free energies is as low as 0.80 kcal/mol for GRA and 1.13 for GR. However, it can be noted that PUR is an outlier just as in the FEP calculations, again indicating that there is some problem with the binding data for this compound. It may also be noted that the set of 27 relative binding free energies, which can be compared to the experimental data, obtained with this model (Figure S2) shows a distinct correlation with the corresponding values from 17

ACS Paragon Plus Environment

The Journal of Physical Chemistry

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 18 of 34

the FEP calculations (Figure 3B).

Conclusions The calculations performed here show a large electrostatic preorganization contribution to ligand binding due to the rather unusual composition of the riboswitch binding sites which almost completely encapsulate the ligands. For cognate ligands the possible hydrogen bonding capacity is entirely fulfilled with up to seven hydrogen bonds to polar ligand atoms, as well as large favorable van der Waals interactions. The snug fit to the common scaffold, together with the fact that the compounds do not have any internal conformational degrees of freedom, limits the possibility of relaxation50 which would otherwise reduce the preorganization term ( U lel− s

Off

). It is clear from the calculated energetics that in

the GRA riboswitch it precisely this preorganization that provides the main discriminatory power against the non-binders GUA and HYP, for which the calculated values of bcomplex are significantly smaller than for compounds with measurable binding affinity. In the GR riboswitch, on the other hand, the discrimination against ADE is of more classical nature and mainly due to unfavorable electrostatic interactions due to loss of hydrogen bonding. The present free energy calculations thus show that GR and GRA actually discriminate between purines by different mechanisms. Whereas the A-C mismatch is enough for GR to discriminate against ADE, GRA needs a fold with a preorganization that prevents standard G-U wobbling for the discrimination against GUA.

The fact that the riboswitch preorganization terms are variable and non-negligible makes the standard linear interaction energy model rather unpredictive in this case. However, we have shown here that the more elaborate LIE equation (eq. 6) with explicit evaluation of bcomplex and bwater yields a very reasonable model that correctly captures the energetics of ligand binding. This is actually further 18

ACS Paragon Plus Environment

Page 19 of 34

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

corroborated by the fact that purine, as discussed above, is an outlier with overestimated binding affinity in all our calculations. Moreover, the predicted relative binding free energies for the 27 ligand pairs shown in Figure 3B indeed show a very similar pattern with this LIE-FEP model (Figure S2). As discussed above our data suggest there may be problems with the compatibility of the two sets of experimental measurements, which appears to be the most likely source of the systematic deviation with respect to the free energy calculations.

Since many protein-ligand systems have been shown to obey our standard parametrization of the LIE method this suggests that these systems do not display a high degree of preorganization, at least not for the compound series studied. It is therefore interesting to ask what is so special about the present riboswitches. First, it should be noted that the electrostatic preorganization concept53 does not only involve the receptor site, but refers to both receptor and ligand. That is, a site is preorganized for a particular type of ligand but not for others (see, e.g., GUA and HYP in Table 1). Hence, it is perhaps not so strange that preorganization is observed for complexes upon which evolution has acted, but not necessarily for man-made drug-like compounds. A case in point here are the non-nucleoside HIV-1 reverse transcriptase inhibitors which clearly obeyed eq. 4,48 but had never been seen by evolution before their discovery. In contrast, it can be noted that large preorganization terms have also been observed for DNA mismatches in human DNA Polymerase b.40 Another key aspect is that efficient preorganization is clearly more difficult to attain towards larger and more flexible ligand molecules, as well as with flexible binding sites. Also with more nonpolar interactions, which are intrinsically less specific than hydrogen bonds, the effect would seem more difficult to attain. This may suggest that natural RNA receptors in general, and particularly in the context of binding small rigid molecules, are likely to show a high degree of preorganization towards their cognate ligands.

19

ACS Paragon Plus Environment

The Journal of Physical Chemistry

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 20 of 34

Supporting Information Figures showing additional calculations of relative binding free energies from FEP and LIE simulations. This material is available free of charge via the Internet at http://pubs.acs.org.

Acknowledgements Support from the Swedish Research Council (VR), the eSSENCE e-science initiative and the Swedish Infrastructure for Computing (SNIC) is gratefully acknowledged. It is a pleasure to contribute this paper on the occasion of Prof. Bill Jorgensen’s 65th birthday. Bill has played an instrumental role in bringing biomolecular computer simulations into the area of drug design and has convincingly shown the power of these techniques in the design of novel compounds.

References (1)

Mandal, M.; Boese, B.; Barrick, J. E.; Winkler, W. C.; Breaker, R. R. Riboswitches Control Fundamental Biochemical Pathways in Bacillus Subtilis and Other Bacteria. Cell 2003, 113, 577–586.

(2)

Mandal, M.; Breaker, R. R. Adenine Riboswitches and Gene Activation by Disruption of a Transcription Terminator. Nat. Struct. Mol. Biol. 2004, 11, 29–35.

(3)

Serganov, A.; Yuan, Y.; Pikovskaya, O.; Polonskaia, A.; Malinina, L.; Phan, A.; Hobartner, C.; Micura, R.; Breaker, R.; Patel, D. Structural Basis for Discriminative Regulation of Gene Expression by Adenine- and Guanine-Sensing mRNAs. Chem. Biol. 2004, 11, 1729–1741.

(4)

Batey, R. T.; Gilbert, S. D.; Montange, R. K. Structure of a Natural Guanine-Responsive Riboswitch Complexed with the Metabolite Hypoxanthine. Nature 2004, 432, 411–415.

(5)

Gilbert, S. D.; Stoddard, C. D.; Wise, S. J.; Batey, R. T. Thermodynamic and Kinetic Characterization of Ligand Binding to the Purine Riboswitch Aptamer Domain. J. Mol. Biol. 2006, 359, 754–768.

(6)

Gilbert, S. D.; Reyes, F. E.; Edwards, A. L.; Batey, R. T. Adaptive Ligand Binding by the Purine Riboswitch in the Recognition of Guanine and Adenine Analogs. Structure 2009, 17, 857–868. 20

ACS Paragon Plus Environment

Page 21 of 34

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

(7)

The Journal of Physical Chemistry

Villa, A.; Wöhnert, J.; Stock, G. Molecular Dynamics Simulation Study of the Binding of Purine Bases to the Aptamer Domain of the Guanine Sensing Riboswitch. Nucleic Acids Res. 2009, 37, 4774–4786.

(8)

Sharma, M.; Bulusu, G.; Mitra, A. MD Simulations of Ligand-Bound and Ligand-Free Aptamer: Molecular Level Insights Into the Binding and Switching Mechanism of the Add AvRiboswitch. RNA 2009, 15, 1673–1692.

(9)

Priyakumar, U. D.; MacKerell, A. D. Role of the Adenine Ligand on the Stabilization of the Secondary and Tertiary Interactions in the Adenine Riboswitch. J. Mol. Biol. 2010, 396, 1422– 1438.

(10)

Gong, Z.; Zhao, Y.; Chen, C.; Xiao, Y. Role of Ligand Binding in Structural Organization of Add a-Riboswitch Aptamer: a Molecular Dynamics Simulation. J. Biomol. Struct. Dyn. 2011, 29, 403–416.

(11)

Allnér, O.; Nilsson, L.; Villa, A. Magnesium Ion–Water Coordination and Exchange in Biomolecular Simulations. J. Chem. Theory Comput. 2012, 8, 1493–1502.

(12)

Allnér, O.; Nilsson, L.; Villa, A. Loop-Loop Interaction in an Adenine-Sensing Riboswitch: a Molecular Dynamics Study. RNA 2013, 19, 916–926.

(13)

Ling, B.; Wang, Z.; Zhang, R.; Meng, X.; Liu, Y.; Zhang, C.; Liu, C. Theoretical Studies on the Interaction of Modified Pyrimidines and Purines with Purine Riboswitch. J. Mol. Graph. Model. 2009, 28, 37–45.

(14)

Ling, B.; Zhang, R.; Wang, Z.; Dong, L.; Liu, Y.; Zhang, C.; Liu, C. Theoretical Studies on the Interaction of Guanine Riboswitch with Guanine and Its Closest Analogues. Mol. Simul. 2010, 36, 929–938.

(15)

Daldrop, P.; Reyes, F. E.; Robinson, D. A.; Hammond, C. M.; Lilley, D. M.; Batey, R. T.; Brenk, R. Novel Ligands for a Purine Riboswitch Discovered by RNA-Ligand Docking. Chem. Biol. 2011, 18, 324–335.

(16)

Foloppe, N.; Matassova, N.; Aboul-ela, F. Towards the Discovery of Drug-Like RNA Ligands? Drug Discov. Today 2006, 11, 1019–1027.

(17)

Fulle, S.; Gohlke, H. Molecular Recognition of RNA: Challenges for Modelling Interactions and Plasticity. J. Mol. Recognit. 2010, 23, 220–231.

(18)

Lang, P. T.; Brozell, S. R.; Mukherjee, S.; Pettersen, E. F.; Meng, E. C.; Thomas, V.; Rizzo, R. C.; Case, D. A.; James, T. L.; Kuntz, I. D. DOCK 6: Combining Techniques to Model RNASmall Molecule Complexes. RNA 2009, 15, 1219–1230.

(19)

Pfeffer, P.; Gohlke, H. DrugScore(RNA) Knowledge-Based Scoring Function to Predict 21

ACS Paragon Plus Environment

The Journal of Physical Chemistry

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 22 of 34

RNA−Ligand Interactions. J. Chem. Inf. Model. 2007, 47, 1868–1876. (20)

Gouda, H.; Kuntz, I. D.; Case, D. A.; Kollman, P. A. Free Energy Calculations for Theophylline Binding to an RNA Aptamer: Comparison of MM-PBSA and Thermodynamic Integration Methods. Biopolymers 2002, 68, 16–34.

(21)

Marelius, J.; Kolmodin, K.; Feierberg, I.; Aqvist, J. Q: A Molecular Dynamics Program for Free Energy Calculations and Empirical Valence Bond Simulations in Biomolecular Systems. J. Mol. Graph. Model. 1998, 16, 214–225.

(22)

Cornell, W. D.; Cieplak, P.; Bayly, C. I.; Gould, I. R.; Merz, K. M.; Ferguson, D. M.; Spellmeyer, D. C.; Fox, T.; Caldwell, J. W.; Kollman, P. A. A Second Generation Force Field for the Simulation of Proteins, Nucleic Acids, and Organic Molecules. J. Am. Chem. Soc. 1995, 117, 5179–5197.

(23)

Wang, J.; Wolf, R. M.; Caldwell, J. W.; Kollman, P. A.; Case, D. A. Development and Testing of a General Amber Force Field. J. Comput. Chem. 2004, 25, 1157–1174.

(24)

Pérez, A.; Marchán, I.; Svozil, D.; Sponer, J.; Cheatham, T. E., III; Laughton, C. A.; Orozco, M. Refinement of the AMBER Force Field for Nucleic Acids: Improving the Description of α/γ Conformers. Biophys. J. 2007, 92, 3817–3829.

(25)

Case, D. A.; Darden, T. A.; Iii, T. C.; Simmerling, C. L. Amber 10. University of California, San Francisco, USA, 2008.

(26)

Bayly, C. I.; Cieplak, P.; Cornell, W.; Kollman, P. A. A Well-Behaved Electrostatic Potential Based Method Using Charge Restraints for Deriving Atomic Charges: the RESP Model. J. Phys. Chem. 1993, 97, 10269–10280.

(27)

Frisch, M. J. et al. Gaussian 09, Revision A. 1; Gaussian: Wallingford, CT, 2009.

(28)

King, G.; Warshel, A. A Surface Constrained All-Atom Solvent Model for Effective Simulations of Polar Solutions. J. Chem. Phys. 1989, 91, 3647.

(29)

Jorgensen, W. L.; Chandrasekhar, J.; Madura, J. D.; Impey, R. W.; Klein, M. L. Comparison of Simple Potential Functions for Simulating Liquid Water. J. Chem. Phys. 1983, 79, 926–935.

(30)

Lee, F. S.; Warshel, A. A Local Reaction Field Method for Fast Evaluation of Long-Range Electrostatic Interactions in Molecular Simulations. J. Chem. Phys. 1992, 97, 3100.

(31)

Zwanzig, R. W. High-Temperature Equation of State by a Perturbation Method. I. Nonpolar Gases. J. Chem. Phys. 1954, 22, 2099.

(32)

Brandsdal, B. O.; Österberg, F.; Almlöf, M.; Feierberg, I.; Luzhkov, V. B.; Aqvist, J. Free Energy Calculations and Ligand Binding. Adv. Protein Chem. 2003, 66, 123–158.

(33)

Ryckaert, J. P.; Ciccotti, G.; Berendsen, H. J. C. Numerical Integration of the Cartesian 22

ACS Paragon Plus Environment

Page 23 of 34

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

Equations of Motion of a System with Constraints: Molecular Dynamics of N-Alkanes. J. Comput. Phys. 1977, 23, 327–341. (34)

Zacharias, M.; Straatsma, T. P.; McCammon, J. A. Separation-Shifted Scaling, a New Scaling Method for Lennard-Jones Interactions in Thermodynamic Integration. J. Chem. Phys. 1994, 100, 9025.

(35)

Aqvist, J.; Medina, C.; Samuelsson, J.-E. A New Method for Predicting Binding Affinity in Computer-Aided Drug Design. Protein Eng. 1994, 7, 385–391.

(36)

Hansson, T.; Marelius, J.; Aqvist, J. Ligand Binding Affinity Prediction by Linear Interaction Energy Methods. J. Comp. Aid. Mol. Des. 1998, 12, 27–35.

(37)

Almlöf, M.; Andér, M.; Aqvist, J. Energetics of Codon−Anticodon Recognition on the Small Ribosomal Subunit. Biochemistry 2007, 46, 200–209.

(38)

Aqvist, J.; Hansson, T. On the Validity of Electrostatic Linear Response in Polar Solvents. J. Phys. Chem. 1996, 100, 9512–9521.

(39)

Almlöf, M.; Aqvist, J.; Smalås, A. O.; Brandsdal, B. O. Probing the Effect of Point Mutations at Protein-Protein Interfaces with Free Energy Calculations. Biophys. J. 2006, 90, 433–442.

(40)

Florián, J.; Goodman, M. F.; Warshel, A. Theoretical Investigation of the Binding Free Energies and Key Substrate-Recognition Components of the Replication Fidelity of Human DNA Polymerase Β. J. Phys. Chem. B 2002, 106, 5739–5753.

(41)

Almlöf, M.; Carlsson, J.; Aqvist, J. Improving the Accuracy of the Linear Interaction Energy Method for Solvation Free Energies. J. Chem. Theory Comput. 2007, 3, 2162–2175.

(42)

Mackerell, A. D., Jr.; Wiorkiewicz-Kuczera, J.; Karplus, M. An All-Atom Empirical Energy Function for the Simulation of Nucleic Acids. J. Am. Chem. Soc. 1995, 117, 11946–11975.

(43)

MacKerell, A. D.; Bashford, D.; Bellott, M.; Dunbrack, R. L.; Evanseck, J. D.; Field, M. J.; Fischer, S.; Gao, J.; Guo, H.; Ha, S. A. All-Atom Empirical Potential for Molecular Modeling and Dynamics Studies of Proteins. J. Phys. Chem. B 1998, 102, 3586–3616.

(44)

Marelius, J.; Hansson, T.; Aqvist, J. Calculation of Ligand Binding Free Energies From Molecular Dynamics Simulations. Int. J. Quant. Chem. 1998, 69, 77–88.

(45)

Bjelic, S.; Aqvist, J. Catalysis and Linear Free Energy Relationships in Aspartic Proteases. Biochemistry 2006, 45, 7709–7723.

(46)

Andér, M.; Luzhkov, V. B.; Aqvist, J. Ligand Binding to the Voltage-Gated Kv1.5 Potassium Channel in the Open States--Docking and Computer Simulations of a Homology Model. Biophys. J. 2008, 94, 820–831.

(47)

Shamsudin Khan, Y.; Gutiérrez-de-Terán, H.; Boukharta, L.; Aqvist, J. Toward an Optimal 23

ACS Paragon Plus Environment

The Journal of Physical Chemistry

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 24 of 34

Docking and Free Energy Calculation Scheme in Ligand Design with Application to COX-1 Inhibitors. J. Chem. Inf. Model. 2014, 54, 1488–1499. (48)

Carlsson, J.; Boukharta, L.; Aqvist, J. Combining Docking, Molecular Dynamics and the Linear Interaction Energy Method to Predict Binding Modes and Affinities for NonNucleoside Inhibitors to HIV-1 Reverse Transcriptase. J. Med. Chem. 2008, 51, 2648–2656.

(49)

Mishra, S. K.; Sund, J.; Aqvist, J.; Koča, J. Computational Prediction of Monosaccharide Binding Free Energies to Lectins with Linear Interaction Energy Models. J. Comput. Chem. 2012, 33, 2340–2350.

(50)

Warshel, A.; Sharma, P. K.; Chu, Z. T.; Aqvist, J. Electrostatic Contributions to Binding of Transition State Analogues Can Be Very Different From the Corresponding Contributions to Catalysis: Phenolates Binding to the Oxyanion Hole of Ketosteroid Isomerase. Biochemistry 2007, 46, 1466–1476.

(51)

Simonson, T. Protein: Ligand Recognition: Simple Models for Electrostatic Effects. Curr. Pharm. Des. 2013, 19, 4241–4256.

(52)

Carlsson, J.; Aqvist, J. Calculations of Solute and Solvent Entropies From Molecular Dynamics Simulations. Phys. Chem. Chem. Phys. 2006, 8, 5385–5395.

(53)

Warshel, A. Energetics of Enzyme Catalysis. Proc. Natl. Acad. Sci. U.S.A. 1978, 75, 5250– 5254.

24

ACS Paragon Plus Environment

Page 25 of 34

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

Table of Contents Image

25

ACS Paragon Plus Environment

The Journal of Physical Chemistry

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 26 of 34

Figure captions Figure 1. Overview of the aptamer region of the purine riboswitch showing the crystal structure of DAP bound to GRA (PDB code 2B57)5 and the solvated simulation sphere (light blue) used here for the free energy calculations.

Figure 2. Average structures from MD simulations of GRA binding to (A) DAP, (B) 2FA, (C) PUR, (D) ADE, (E) BrG, (F) ClG, (G) 2AP, (H) GUA and (I) HYP and GR binding to (J) GUA, (K) DAP, (L) HYP, (M) 2AP and (N) ADE. (O) Average MD structure of GRA binding to DAP superimposed on the initial RNA structure (PDB code 2XNW) with GRA residues are coloured green and GR residues magenta.

Figure 3. (A) Calculated relative binding free energies (in kcal/mol) for GRA and GR complexes versus experimental values, where each entry (i,j) denotes ∆Gbind (j) - ∆Gbind (i). The color coding is such that green corresponds to ligand pairs measured in Ref. 5, while pairs in blue were measured in Ref. 6 and red entries have one ligand measured in each of the experimental references (see text). Entries involving GUA and HYP in GRA and ADE in GR (no detectable experimental binding affinities) are colored in white. (B) Plot of the calculated versus experimental relative binding free energies for the 27 ligand pairs with measured affinities, with color coding as in (A) but with data points for PUR in black. The average s.e.m. for calculated binding free energies is 0.4 kcal/mol and the maximum s.e.m. is 0.8 kcal/mol.

Figure 4. Calculated versus experimental absolute binding free energies (in kcal/mol) obtained with (A) the standard LIE model and (B) the LIE-FEP model discussed in the text. Ligands binding to GRA are shown as blue diamonds and GR complexes as red squares (PUR is the uppermost point in both plots).

Figure 5. Plots of the U lel− s

l

interaction energies (blue diamonds) as a function of λ from TI

calculations of the electrostatic part of the hydration free energies for (A) a K+ ion, (B) galactose, (C) DAP and (D) GUA. The TI free energy is the area above the curve. Red dashed lines show the LIE approximation and green dashed lines show the LRA-approximation with b=0.5.

26

ACS Paragon Plus Environment

Page 27 of 34

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Figure 6. Plots of the U lel− s

The Journal of Physical Chemistry

l

interaction energies (blue diamonds) as a function of λ from TI

calculations of the electrostatic part of solvation free energies in GRA for (A) DAP and (B) GUA. Red dashed lines show the LIE approximation with optimal values of bcomplex. Average structures from MD simulations are shown for GRA binding to (C) DAP and (D) GUA together with the corresponding uncharged states of (E) DAP and (F) GUA.

27

ACS Paragon Plus Environment

The Journal of Physical Chemistry

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48

Page 28 of 34

Table 1. Average ligand-surrounding interaction energies (kcal/mol) and FEP calculations for turning on the electrostatic ligand-surrounding interactions for GRA ligands. GRA

Complex

Water

Complex

Water

Complex

∆Gel ∆Gel l −s l −s l −s l −s bcomplex Ligand DAP -32.08 ± 0.04 -40.04 ± 0.06 -11.27 ± 0.02 -38.73 ± 0.05 -24.34 -14.87 0.61 2FA -29.60 ± 0.41 -33.45 ± 0.30 -11.51 ± 0.04 -32.88 ± 0.08 -20.64 -13.12 0.62 PUR -26.61 ± 1.05 -31.49 ± 0.94 -9.93 ± 0.02 -29.78 ± 0.12 -20.33 -11.71 0.65 ADE -28.90 ± 0.38 -34.88 ± 0.24 -10.37 ± 0.03 -34.32 ± 0.06 -22.37 -13.33 0.64 ClG -34.06 ± 0.17 -36.02 ± 0.35 -12.80 ± 0.02 -34.35 ± 0.08 -22.59 -13.67 0.63 BrG -35.12 ± 0.17 -35.45 ± 0.36 -13.44 ± 0.02 -34.82 ± 0.06 -22.16 -13.68 0.63 2AP -28.23 ± 1.89 -36.28 ± 1.03 -10.62 ± 0.03 -34.03 ± 0.15 -23.20 -13.47 0.64 GUA -30.61 ± 0.27 -54.48 ± 0.81 -10.59 ± 0.04 -56.95 ± 0.16 -28.67 -23.98 0.53 HYP -27.38 ± 1.52 -46.36 ± 0.84 -10.63 ± 0.04 -45.86 ± 0.10 -22.56 -19.48 0.49 a c Standard LIE model, optimized γ=−4.32. LIE model of eq. 6, optimized γ=3.97. Experimental values from Refs. 5 and 6. U vdw

U el

U el

U vdw

FEP

FEP

Water

∆Gcalc

∆Gcalc

bwater

LIEa

LIE-FEPb

∆Gexpc

0.38 0.40 0.39 0.39 0.40 0.39 0.40 0.42 0.42

-8.63 -7.82 -8.06 -7.90 -8.86 -8.49 -8.46 -6.86 -7.12

-9.25 -6.80 -7.65 -8.40 -8.77 -8.41 -8.92 -4.32 -2.12

-10.77 -7.48 -5.28 -8.77 -8.51 -8.25 -9.15 n.d. n.d.

Table 2 Average ligand-surrounding interaction energies (kcal/mol) and FEP calculations for turning on the electrostatic ligand-surrounding interactions for GR ligands. GR

Complex

Water

∆Gcalc

∆Gcalc

∆Gel ∆Gel l −s l −s l −s l −s bcomplex Ligand GUA -29.92 ± 0.32 -59.09 ± 0.34 -10.59 ± 0.04 -56.95 ± 0.16 -36.35 -23.98 0.62 DAP -31.16 ± 0.30 -39.16 ± 1.49 -11.27 ± 0.02 -38.73 ± 0.05 -24.62 -14.87 0.63 HYP -28.25 ± 0.43 -47.42 ± 0.75 -10.63 ± 0.04 -45.86 ± 0.10 -28.52 -19.48 0.60 2AP -28.22 ± 0.38 -37.81 ± 1.17 -10.62 ± 0.03 -34.03 ± 0.15 -24.25 -13.47 0.64 ADE -27.36 ± 1.21 -31.12 ± 0.71 -10.37 ± 0.03 -34.32 ± 0.06 -20.25 -13.33 0.65 a c Standard LIE model, optimized γ=−4.56. LIE model of eq. 6, optimized γ=5.06. Experimental values from Refs. 5 and 6.

bwater

LIEa

LIE-FEPb

∆Gexpc

0.42 0.38 0.42 0.40 0.39

-8.93 -8.32 -8.40 -9.35 -6.24

-10.72 -8.27 -7.14 -8.88 -4.91

-11.64 -7.47 -8.48 -7.42 n.d.

Complex U vdw

Water U el

U vdw

Complex U el

Water

FEP

28

ACS Paragon Plus Environment

FEP

Page 29 of 34

Figure 1

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58

The Journal of Physical Chemistry

ACS Paragon Plus Environment

The Journal of Physical Chemistry

Figure 2 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58

A

B DAP

Page 30 of 34

C 2FA

U22

PUR

U74

U51

D

F

E ADE

G

ClG

BrG

H

GUA

I

HYP

2AP

J

K

L DAP

GUA

HYP

C74

M

N

ADE

2AP

ACS Paragon Plus Environment

O

Page 31 of 34

The Journal of Physical Chemistry

Figure 3

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58

A

GRA - calc

GRA - exp

GR - calc

GR - exp

B

ACS Paragon Plus Environment

The Journal of Physical Chemistry

Figure 4

G(exp)

A PUR

-6 2AP

-8

2FA

DAP

BrG ClG

HYP ADE 2AP

-10 DAP GUA

-12 -12

-10

-8

-6

G(calc)

B G(exp)

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58

Page 32 of 34

PUR

-6 2AP

-8

DAP

BrG ClG

2FA HYP

ADE

2AP

-10 DAP GUA

-12 -12

-10

-8

ACS Paragon Plus Environment

-6

G(calc)

Page 33 of 34

The Journal of Physical Chemistry

Figure 5

A

B

K+ in water 0.2

0.4

0.6

0.8

0 -50 -100 LIE

-150

0.0

1.0

l (kcal/mol)

l (kcal/mol)

0.0

Galactose in water

LRA

-200

0.2

0.4

C

1.0

0.8

1.0

-40 -60 LIE

-80

LRA

l

D

DAP in water 0.0

l (kcal/mol)

0.8

-20

l

0.2

0.4

0.6

0.8

0 -20 -40

0.6

0

-100

LIE β = 0.38 LRA

-60

GUA in water 0.0

1.0

l (kcal/mol)

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58

0.2

0.4

0.6

0 -20 -40

LIE β = 0.42 LRA

-60

l

l

ACS Paragon Plus Environment

The Journal of Physical Chemistry

DAP in GRA

A 0.0

l (kcal/mol)

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58

0.2

0.4

0.6

0.8

0

-20 -40 -60

Page 34 of 34

GUA in GRA

B

LIE b=0.61

-80

0.0

1.0

l (kcal/mol)

Figure 6

0.2

0.4

-20 -40 -60

LIE b=0.53

-80

l

C

0.8

0

l

D DAP

GUA U22

U74

U51

E

0.6

F

ACS Paragon Plus Environment

1.0