pH-Dependent Transient Conformational States Control Optical

Feb 3, 2015 - Elena N. Laricheva†, Garrett B. Goh†, Alex Dickson†, and Charles L. ... Jinfeng Liu , Jason Swails , John Z. H. Zhang , Xiao He , ...
0 downloads 0 Views 55MB Size
Subscriber access provided by Yale University Library

Article

pH-dependent Transient Conformational States Control Optical Properties of Cyan Fluorescent Protein Elena N Laricheva, Garrett B. Goh, Alex Dickson, and Charles L. Brooks J. Am. Chem. Soc., Just Accepted Manuscript • Publication Date (Web): 03 Feb 2015 Downloaded from http://pubs.acs.org on February 4, 2015

Just Accepted “Just Accepted” manuscripts have been peer-reviewed and accepted for publication. They are posted online prior to technical editing, formatting for publication and author proofing. The American Chemical Society provides “Just Accepted” as a free service to the research community to expedite the dissemination of scientific material as soon as possible after acceptance. “Just Accepted” manuscripts appear in full in PDF format accompanied by an HTML abstract. “Just Accepted” manuscripts have been fully peer reviewed, but should not be considered the official version of record. They are accessible to all readers and citable by the Digital Object Identifier (DOI®). “Just Accepted” is an optional service offered to authors. Therefore, the “Just Accepted” Web site may not include all articles that will be published in the journal. After a manuscript is technically edited and formatted, it will be removed from the “Just Accepted” Web site and published as an ASAP article. Note that technical editing may introduce minor changes to the manuscript text and/or graphics which could affect content, and all legal disclaimers and ethical guidelines that apply to the journal pertain. ACS cannot be held responsible for errors or consequences arising from the use of information contained in these “Just Accepted” manuscripts.

Journal of the American Chemical Society is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.

Page 1 of 17

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of the American Chemical Society

pH-Dependent Transient Conformational States Control Optical Properties in Cyan Fluorescent Protein Elena N. Laricheva,† Garrett B. Goh,† Alex Dickson,† and Charles L. Brooks, III*,†,‡ †



Department of Chemistry and Biophysics Program, University of Michigan, Ann Arbor, Michigan 48109, United States

ABSTRACT: A recently engineered mutant of cyan fluorescent protein (WasCFP) that exhibits pH-dependent absorption suggests that its tryptophan-based chromophore switches between neutral (protonated) and charged (deprotonated) states depending on external pH. At pH 8.1 the latter gives rise to green fluorescence as opposed to the cyan color of emission that is characteristic for the neutral form at low pH. Given the high energy cost of deprotonating the tryptophan at the indole nitrogen, this behavior is puzzling, even if the stabilizing effect of the V61K mutation in the proximity of the protonation/deprotonation site is considered. Due to its potential to open new avenues for the development of optical sensors and photoconvertible fluorescent proteins, a mechanistic understanding of how the charged state in WasCFP can possibly be stabilized is, thus, important. Attributed to the dynamic nature of proteins, such understanding often requires knowledge of the various conformations adopted, including transiently populated conformational states. Transient conformational states triggered by pH are of emerging interest, and have been shown to be important whenever ionizable groups interact with hydrophobic environments. Using a combination of the weighted-ensemble sampling method and explicit solvent constant pH molecular dynamics (CPHMDMSλD) simulations, we have identified a solvated transient state, characterized by a partially open β-barrel where the chromophore pKa of 6.8 is shifted by over 20 units from that in the closed form (6.8 and 31.7, respectively). This state contributes a small population at low pH (12% at pH 6.1), but becomes dominant at mildly basic conditions, contributing as much as 53% at pH 8.1. This pH-dependent population shift between neutral (at pH 6.1) and charged (at pH 8.1) forms is thus responsible for the observed absorption behavior of WasCFP. Our findings demonstrate the conditions necessary to stabilize the charged state of the WasCFP chromophore (namely, local solvation at the deprotonation site and a partial flexibility of the protein β-barrel structure) and provide the first evidence that transient conformational states can control optical properties of fluorescent proteins.

22 has

1 INTRODUCTION 2 Expanding

the palette of genetically encodable fluoresproteins (FPs) with spectral properties that can be 4 modulated by pH is a well-appreciated challenge due to 1–5 5 their wide applicability as non-invasive pH sensors and 6 optical highlighters for super-resolution imaging of living 6–9 7 cells. The majority of such proteins developed to date 8 belong to the green fluorescent protein (GFP) family and 9 owe their pH-sensitive optical behavior to a tyrosine10 based chromophore that can interconvert between the 11 neutral (protonated) and deprotonated (charged) states 12 depending on the hydrogen-bonding environment sur7 13 rounding its phenolic group. Rational design of new pH14 sensitive variants requires both (i) a fundamental under15 standing of how the proteins with tyrosine-based chro16 mophores function at the atomic level, as well as (ii) go17 ing beyond and looking at the FPs with chromophores 18 other than tyrosine as potential candidates (e.g. trypto19 phan or phenylalanine/histidine-based chromophores, as 20 in the case of cyan and blue fluorescent proteins). While a 21 second approach has long been overlooked, the first one 3 cent

been quite successful resulting in a number of useful sensors (e.g. pHluorins,3,5 phRed2) and optical high8,9 24 lighters (e.g. Kaede ). The efforts in this direction, how25 ever, have mostly been focused on targeting the residues 26 in the vicinity of the chromophore that affect its spectral 27 characteristics through electronic effects, and largely ne28 glected the importance of characterizing the conforma7 29 tional ensemble of the protein. 23 pH

30 In

recent years, a large body of evidence has emerged that understanding the mechanisms underly32 ing protein functions depends on our ability to character10–12 33 ize its dynamic ensemble. Due to the nature of con34 ventional biophysical techniques that primarily probe the 35 most stable protein conformers, our understanding has 36 long been limited to the information regarding highly 37 populated ground conformational states. However, such 38 states often represent only one of the functional forms, 39 and higher-energy physiologically-relevant conformers 40 can be transiently populated (~10% or less) when initiated 41 by external stimuli, such as substrate binding, pH chang12,13 42 es, or thermal fluctuations. While low-energy ground31 suggesting

1 ACS Paragon Plus Environment

Journal of the American Chemical Society

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

conformers residing at the bottom of the conformaenergy landscape are normally separated by very 3 small kinetic barriers and interconvert between one an4 other within pico- to nanoseconds, the barriers between 5 them and higher energy structures are larger and associ6 ated with micro- to millisecond timescale or longer. Re7 cent advances in relaxation dispersion NMR 11,14 8 spectroscopy and room temperature X-ray crystallog15 9 raphy have made the detection of such transient con10 formational states possible, demonstrating their ubiqui10 11 tous role in enzyme catalysis, protein folding,13,14 and 16 12 ligand binding. Transiently populated conformational 13 states triggered by pH are of particular importance since 14 pH regulates the biological activity of many proteins, and 15 the role of pH-dependent transient states is an emerging 16 discovery that has been recently reported to influence 17 18 17 membrane fusion, folding pathways, and, more gener19 18 ally, the dynamics of buried ionizable groups.

Page 2 of 17

1 state

2 tional

19 In

this paper, we demonstrate that pH-dependent transiconformational states can tune the absorption profile 20 21 of cyan fluorescent protein (CFP) – a blue-shifted vari22 ant of the green fluorescent protein (GFP) family used for 23 multicolor labeling and fluorescence resonance energy 24 transfer (FRET) applications – which to our knowledge is 25 the first precedent of such a mechanism. In particular, we 26 provide a theoretical explanation for the non-monotonic pH27 dependent absorption of a recently engineered CFP mutant 28 (WasCFP; see Figure 1A). While the vast majority of the 2,3,5 29 pH-sensitive fluorescent proteins reported to date ex30 hibit monotonic changes in optical signals (e.g. absorp31 tion, emission, excitation), the WasCFP mutant does not 32 conform to such a monotonic behavior. It reversibly in33 terconverts between cyan-emitting and green-emitting 34 forms, with the latter form dominant at pH 8.1 (and 21 35 25°C), above which the green signal drops. Part of this 36 behavior has been attributed to the deprotonation of the 37 tryptophan-based chromophore (Figure 1B) at mildly 38 basic pH, which is accompanied by a 60 nm batho39 chromic shift in absorption. 20 ent

40 Even

though the observed effect is a highly unusual sceas the pKa of an analogous indole is high (16.2 in 21 42 H2O; 21.0 in DMSO), Sarkisyan et al. have shown that a 43 synthetic CFP chromophore (hereon referred to as CRF), 44 which is a truncated version of that in the protein, under45 goes a similar pH-dependent absorption red-shift in both 46 protic and aprotic solvents, and its pKa is depressed to 47 12.4 due to the more efficient delocalization of the nega48 tive charge over the extended π-system (see Figure 1C). 41 nario

49

1. A: Structure of WasCFP showing Cα positions of residues (V61K, D148G, Y151N, L207Q). B: Structure 52 of Trp66-based WasCFP chromophore covalently bound to 53 β-barrel at positions showed with dashed lines (Leu64 and – 54 Val68). C: Deprotonation of CRF. CRF-H and CRF are neu55 tral (protonated) and charged (deprotonated) forms of the 56 synthetic chromophore, respectively. 50 Figure

51 mutated

57 The

same shift has been observed in a wild-type CFP varimCerulean denatured in 5M NaOH. In addition, anal59 ogous pH-dependent bathochromic shifts, attributed to 60 the deprotonation at phenolic oxygen, have been previ61 ously detected in various members of the GFP family with 62 tyrosine-based chromophores (e.g. yellow, red and green 22 63 fluorescent proteins). In WasCFP, a key V61K substitu64 tion positions a Lys residue in close proximity to the in65 dole nitrogen of the chromophore (see Figure 1B), and 66 this was proposed to stabilize its deprotonated state. 58 ant

67 The

pKa of the WasCFP chromophore, however, could not directly measured and the atomic level details of its 69 pH-dependent absorption remained unknown. Moreover, 70 no explanation for the non-monotonic optical properties 71 of WasCFP, specifically the signal drop in the green fluo72 rescent form above pH 8.1 has been proposed. . 68 be

To elucidate the mechanism of the unusual pHspectral behavior of WasCFP, we probed it 75 with a combination of two computational techniques: the 23 76 weighted-ensemble sampling method using a novel hy24 77 dration order parameter; and explicit solvent constant MSλD 25 78 pH molecular dynamics simulations (CPHMD ), 79 which has been used successfully to interrogate the pH19,26–28 80 dependent dynamics of several biological systems. 81 Among our key findings is the existence of a low pKa (6.8) 82 solvated state characterized by a partially open β-barrel, 83 which, in combination with a high pKa (31.7) closed con84 formation, governs the pH-dependent properties of 85 WasCFP. In particular, we have shown that even in the 86 presence of the stabilizing V61K mutation, the free energy 87 cost of deprotonating the chromophore is still high and 88 does not allow for the existence of the charged state of 89 WasCFP even at high pH. The protein has to adopt a par90 tially open conformation allowing a few water molecules 91 to access to the chromophore. This conformation is tran92 siently populated at low pH but becomes dominant under 93 mildly basic conditions. The pH-dependent shift between 94 the closed and open form is thus responsible for the ob95 served bathochromic shift in WasCFP. 73

74 dependent

2 ACS Paragon Plus Environment

Page 3 of 17 1 RESULTS

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of the American Chemical Society AND DISCUSSION

2 First,

to determine the plausibility for WasCFP (hereon to as, simply, WAS) to exist in the charged depro4 tonated state, we modeled the absorption properties of 5 CRF at the TDDFT//B3LYP/6-31+G* level, both in gas 6 phase and in implicit solvent. As shown in Table 1, the 7 vertical excitation energy of the lowest-energy transition 8 (ground to first excited electronic state; S0S1) for the 21 9 neutral CRF is in good agreement with both experiment 29,30 10 and previous computations, and the direction of the 11 bathochromic shift between the neutral and anionic form 12 is qualitatively reproduced. We note that the latter is 13 quantitatively underestimated due to well-known system29,31 14 atic errors produced by TDDFT methods for anions. 3 referred

1. TDDFT//B3LYP/6-31+G* vertical excitation enerand oscillator strengths for the lowest-energy (S0S1) – a b 17 transition of the CRF-H and CRF : – in implicit solvent, 18 – in the gas phase 15 Table 16 gies

λamax (comp.), nm

λamax (exp.), nm

fS0S1

Neutral

391a (375 b)

400

0.7a (0.6b)

Charge d

415a (400b)

460

0.9a, b

the computed oscillator strengths support qualitative picture and demonstrate that the S0S1 22 transition for the charged form is stronger than that for 23 the neutral one, making plausible the existence of a 24 charged state in the CRF. 21 the

we constructed a model compound (RES; denoting model compound “residue”), which is an extended 27 CFP chromophore consisting of the CRF moiety covalent28 ly bound to Leu64 and Val68 (see Figure 1B and Figure 2) MSλD 29 to serve as a reference for our explicit solvent CPHMD 25 30 simulations. Using a model pKa of 12.7 (calculated using 31 thermodynamic integration based on the CRF), we initial32 ly computed the pKa values of the model wild type and 33 mutant peptides (WTP and V61KP, respectively) that con34 sist of 10 residues including а key position 61, before pro35 ceeding to the wild type and mutant proteins (WT and 36 WAS, respectively) – all using a thermodynamic cycle 37 depicted in Figure 2. 38

41 formations

shown in Figure 2, while an alchemical transformation CRF to RES barely alters the pKa of the titrating moiety, 49 perturbation by the peptide and protein environment 50 (WTP and WT) shifts its pKa by 1.1 and 39.4 units, respec51 tively. Such large pKa shifts have been reported when ion52 izable groups are transferred into a hydrophobic envi32 53 ronment, and demonstrates the sensitivity of the chro54 mophore pKa to the extent of local solvation at the proto55 nation site. In our case, the solvation is high in the pep56 tide and low inside the hydrophobic β-barrel of WAS, 57 which is not surprising considering that nature selected 58 the cylindrical β-barrel for fluorescent proteins in order to 59 prevent the fluorescence quenching by either water or 7 60 oxygen. Using the thermodynamic cycle in Figure 2 and MSλD 61 the pKa values computed using the CPHMD method, 62 we calculated that the positive charge of Lys61 introduced 63 in a close proximity to the indole group of RES stabilizes 64 V61KP by 5.4 kcal/mol with respect to WTP, and WAS – 65 by 27.9 kcal/mol relative to WT. This cost is also solva66 tion-dependent and leads to the downshift of the RES pKa 67 in a peptide by 4 units, while depressing the pKa in the 68 protein by 20.4. Even though our simulations clearly show 69 that positive charge in the vicinity of the protonation site 70 stabilizes the anionic form of RES, both neutral WT and 71 charged WAS species are extremely stable, with pKas of 72 52.1 and 31.7 that are comparable to those of the so-called 33 73 “super-bases” in a low dielectric environment. Therefore, 74 the deprotonation at the indole nitrogen does not happen 75 in such “closed” state conformations in the experimental 76 pH range of 6–10, and the computed pKa values do not 77 account for the observed pH-dependent absorption prop78 erties of WAS. 48 of

20 Nevertheless,

26 the

2. Thermodynamic cycle that shows alchemical transconsidered in this study. Cyan and yellow-colored 42 side chains correspond to V61 and K61 in WTP and V61KP 43 peptides, respectively. pKa values computed from MSλD 44 CPHMD simulations are highly elevated in both WT and 45 WAS and, thus, are not responsible the observed pH46 dependent absorption of WasCFP. 40 Figure

47 As

19

25 Next,

39

79 Numerous 80 teins,

studies of conformational plasticity of prohowever, have demonstrated the importance of

3 ACS Paragon Plus Environment

Journal of the American Chemical Society

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

1 characterizing

45 the

2 cluding

the dynamic ensemble of their states, inthose that are only transiently populated.10–13,16 3 Moreover, partially open, solvated pH-dependent transi4 ent states have been hypothesized to be of general im19 5 portance in systems with buried ionizable groups. How6 ever, capturing the transition between ground and transi7 ent conformational states often requires simulation time8 scales currently not accessible to the version of the 9 CPHMD used in our work. Therefore, guided by the ob10 servation of a significantly diminished pKa in a well11 solvated peptide vs. “dry” high-pKa closed conformation 12 of the protein, we chose a hydration of the chromophore 13 as our reaction coordinate and performed a search for an 14 alternative WAS conformation using the weighted23 15 ensemble sampling method that allows the escape from 16 deep local minima (high probability regions) and provides 17 enhanced sampling of low probability, transient states. 18 Guided by the observation of a significantly diminished 19 pKa in a well-solvated peptide vs. “dry” high-pKa closed 20 conformation of the protein, we chose a hydration of the 21 chromophore as our reaction coordinate Figure 3A shows 22 a sampling of WAS configuration space along a hydration 23 parameter (φ), which posits the existence of transiently 24 populated states, characterized by the partially open β25 barrel and local solvation at the protonation site. The pKa 26 values of the RES chromophore were subsequently com27 puted for four representative WAS structures, extracted 28 from the four regions in the histogram (φ=1.5-2.5; 5.5-8.0; 29 12.0-13.0 and 14.5-15.0), and structural changes influencing 30 its pKa were analyzed.

46 cent

Page 4 of 17

study of the oxygen diffusion pathway in red fluoresprotein, mCherry,35 facilitates local solvation (i.e. an 47 increase in the number of water molecules within a 7 Å radius) 48 at the indole nitrogen. Our calculations suggest that the 49 pKa of the chromophore can be as low as 6.8 for a struc50 ture with a high hydration parameter φ =15 that corre51 sponds to as many as 17 water molecules within 7Å of the 52 indole nitrogen of the chromophore (see Figure 3). Based 53 on the structural data provided from our simulations, the 54 residues on strands β7 and β10 undergo local unfolding 55 into an unstructured loop. Notably, as illustrated in Fig56 ure 4, residues 145–149 of β7 and 204-206 of β10 lose their 57 secondary structure (also refer to Figure S4 for Cα-Cα dis58 tances between β7 and β10 for all four structures in Figure 59 3). We note that these structural changes can be moni60 tored by examining the carbon (Cα) chemical shifts in 61 these regions as one titrates WasCFP from pH 8 to pH 6. 62 We predict that these shifts would report a change from a 63 mixed fraction of open and closed conformational states 64 at pH 8 to a predominantly closed conformational state at 65 pH 6. It is also worth noting that the chromophore in the 66 open state is still significantly more rigid than it is in solu67 tion (see Figure S6 ).

68

4. Distances between Cα-Cα atoms of residues in β7 β10 strands in the open state (dominant at pH=8.1) vs. 71 the closed state (dominant at pH=6.1). 69 Figure 70 and

72 As

WasCFP itself is a relatively recent construct, there is a of experimental data that measures its fluorescent 74 properties as a function of its dynamics. However, paral75 lels can be drawn between CFP and the related GFP, 76 which, despite their differences in the chromophore, do 77 share significant sequence similarity and may have similar 78 conformational properties. Interestingly, prior studies of 79 the unfolding of green fluorescent protein GFP have re80 vealed a stable fluorescent intermediate that retained 81 considerable secondary and tertiary geometry with dis82 placed β7 and β10 strands and access of the water mole36 83 cules to the chromophore, It has also been noted that 84 chromophore formation in fluorescent proteins occurs in 85 a partially structured intermediate state, although this 86 structure had reduced fluorescence. These experimental 87 observations for GFP suggest that partially open confor88 mations of the protein, similar to the transient state we 89 observed in our simulations, can exist. In addition, in two 90 companion papers, where the effect of pressure on the 91 quenching of fluorescence was examined, Weber and co37 92 workers and Krylov and co-workers38 reported that 93 mCherry and mStrawberry are both highly fluorescent at 73 lack

31

3. A: Probability distribution of the hydration paramof RES in the WAS protein shows transiently populated 34 states with large hydration parameters. B: Snapshots of the 35 chromophore environment in four different conformations of 36 WAS corresponding to hydration parameters φ=2, 7, 12, and 37 15. Number of water molecules within 7Å of nitrogen of the 38 chromophore (3, 8, 15, and 17, respectively) and the correMSλD 39 sponding pKa values computed using CPHMD simula40 tions are shown for each conformation. 32 Figure 33 eter

41 We

discovered that opening of the β7-β10 channel (up to backbone RMSD with respect to the crystal struc43 ture of WT; see Figures 3B), which has been previously 34 44 shown to be rather flexible in both wild type CFP and in 42 1.58Å

4 ACS Paragon Plus Environment

Page 5 of 17

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of the American Chemical Society

1 standard

56 pH,

2 phore

pressure (0.1MPa) even though their chromoare partially solvated. In fact, the extent of local 3 solvation in our open state is similar to what the authors’ 4 computations predict for ambient conditions. In the con5 text of our findings, it raises the possibility that non6 ground state protein conformations may play a role in 7 modulating spectral properties of fluorescent proteins.

57 494

8 Lastly,

to explain the unusual pH-dependent absorption of WAS, we constructed a model based on the 10 assumption that WAS interconverts between the hydrat11 ed transient (open) state that we identified and its origi12 nal closed configuration. Previously, we developed a two13 state model to explain the protonation equilibrium of 19 14 SNase mutants with buried ionizable groups, which alopen 15 lows one to compute the fraction of open state (F ) as a 16 function of pH based on the ratio of the open and closed 17 state populations (Roc) deduced from the simulations. 9 behavior

F open = Roc / (Roc +1)

(1)

(

(

− pK

) (10 ) (10

− pK closed − pK open

−10 −10

− pK app

− pK app

) )

ratio (please see the derivation of (2) in Section S4 of SI) is calculated using the computed microscopic pKa 21 values of both forms (pKopen=6.8 and pKclosed=31.7) and the 22 apparent pKa value estimated based on available experi23 mental data (pKapp=7.8) (see Section S4 of the SI). 20 the

test the validity of the model, we compared the comFopen values with those that can be found directly 26 from experiment reported in ref. 21. As a basis for our 27 analysis, we used the pH-dependent absorption data for 21 28 WasCFP recorded by Sarkisyan et al. at 25°C (see Figures 29 S2a and S2b of the Supplemental material in the original 30 reference). The spectrum presented in their work differs 31 from a typical pH-dependent absorption profile where 32 one would expect for a mixture of conjugated acid and 33 base, where one form is largely dominant at low pH, while 34 the other one dominates under high pH conditions. In the 35 case of WasCFP, the signal at 494 nm, corresponding to 36 the charged (deprotonated) chromophore, grows with pH 37 up to pH=8.1, but then its intensity decreases – due to 38 titration of a nearby Lys61, as suggested by the authors of 39 the paper. In addition, the signal of the protonated form 40 consists of two peaks – a bonafide spectral feature of all 41 cyan fluorescent proteins, the origin of which is still a 20,29 42 subject of a great controversy. While the authors do 43 not mention any open states or fraction of the open states 44 in their work and discuss the signals at 436 and 494 nm as 45 arising from the protonated and deprotonated forms of 46 the chromophore in the same protein conformation, our 47 simulations suggest that chromophore can only exist in 48 its deprotonated form when protein conformation is par49 tially open allowing a few water molecules access to the 50 deprotonation site. Therefore, for the remainder of this 51 study, we will use the terms deprotonated and open states 52 interchangeably and express the fraction of open state, open 53 F , using the following information only: (1) extinction 54 coefficients at 436 and 494 nm (ε436, HA and ε494, A-); (2) ab55 sorbance of the protonated (closed) form at the lowest 25 puted

66

Abs436 = ε 436, HA × b × C HA

(3)

Abs494 = ε 494, A– × b ×C A–

(4)

67 The

sum of those concentrations represents the total conof all species, which remains constant at any

68 centration 69 pH:

C HA +C A– = CT

(5)

71 At

(2)

19 This

24 To

of expression for Fopen from experiment: The – 62 concentrations of protonated (HA) and deprotonated (A ) 63 forms can be written in terms of the absorbances of the 64 protein chromophore at the appropriate wavelengths (436 65 and 494 nm, respectively) using the Beer’s-Lambert law: 61 Derivation

70

10 open +10− pH [Open] where Roc = = − − pK [Closed] 10 closed +10− pH 18

Abs436, low pH (neglecting a small absorption signal at nm); and (3) absorbance of the deprotonated (open) 58 form, Abs494 – which is the only variable in our final equaopen 59 tion for F , presented below, that changes with pH (all 60 taken from the data in ref.21).

low pH, the protonated form dominates, so that CHA = Similarly, at high pH there is predominantly deproto– 73 nated form, and CA = CT. Knowing that, we can express 74 the absorbances at low pH and high pH as follows: 72 CT.

75

Abs436, low pH = ε 436, HA × b × CT

(6)

Abs494, high pH = ε 494, A – × b × CT

(7)

76 From

equation (6) we obtain CT and by substitution of (8) (7), the absorbance at high pH can be expressed in 78 the following way: ε – 79 Abs494, high pH = 494, A × Abs436, low pH (8) ε 436, HA 77 into

80 By

dividing Eq. (4) by Eq. (7), we obtain the fraction of deprotonated state, which varies with pH: C – Abs494 × ε 436, HA Abs494 F open = A = = (10) CT Abs494, high pH Abs436, low pH × ε 494, A –

81 the 82

83 Both 84 ment:

extinction coefficients are available from experiε 436, HA = 28000 M −1cm−1 and ε 494, A– = 51000 M −1cm−1

85 The

value of absorbance of pure HA at low pH, Abs436, low remains the same across the entire pH range, and the – 87 only parameter that varies is the absorbance of the A 88 form at 494 nm (Abs494), which can be calculated directly 89 from the intensity of the peak. Table 2 provides all the open 90 information necessary to estimate F at different pH 91 using the information from the pH-dependent absorption 92 spectrum at 25°C recorded by Sarkisyan et al.21 86 pH,

93 Table

2. Fopen derived from experimental data in ref. 21

ε 436, HA = 28000 M −1cm −1 −1

ε 494, A– = 51000 M cm Abs436, low pH =1.00

−1

open

pH

Abs494

F

6.1

0.16

6.5

0.28

0.15

7.0

0.54

0.30

0.09

8.1

1.00

0.55

8.5

0.91

0.50

9.5

0.75

0.41

9.9

0.45

0.25

5 ACS Paragon Plus Environment

Journal of the American Chemical Society 1

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

2 Two-state

pKRES

[RES-H/Lys]open

Page 6 of 17 [RES/Lys]open

model: experiment vs. simulations

shown in Figure 5B, our two-state model provides a open 4 moderate correlation with experimental F values up to 5 pH 8.1. However, it does not fully describe the system at 6 higher pH (see Figure 5A), where Lys61 presumably 7 deprotonates, as mentioned above. Therefore, we intro8 duced a third state that accounts for a neutral Lys61 and 9 partially re-protonated chromophore at high pH, whose 10 pKa is approximated by that in the V61KP peptide found 11 from our simulations (which is also in agreement with the 12 experimentally suggested value of 9.8).

pKLys

3 As

[RES/Lys+]open

Kprot 40

Kdeprot pKclosed

[RES-H/Lys+]closed

[RES/Lys+]closed

41 Figure

6. Schematic illustration of the three-state model. and Kdeprot are equilibrium constants corresponding to 43 the pH-independent conformational transitions between 44 open and closed states. 42 Kprot

13 Three-state

45 Using

14 in

46 (ii),

model: experiment vs. simulations: The states our three-state model are defined as follows: (i) [RES+ 15 H/Lys ]closed – a closed state with neutral RES chromo16 phore and protonated Lys61 (pKclosed=31.7); (ii) + 17 [RES/Lys ]open – an open state with deprotonated chromo18 phore and protonated Lys61 (pKopen=6.8); and (iii) [RES19 H/Lys]open – an open state with re-protonated chromo20 phore (pKRES=9.8), and deprotonated Lys, whose pKa MSλD 21 (pKLys) was calculated from additional CPHMD simu22 lations where both the chromophore and Lys61 were sim23 ultaneously titrating.

pKopen

[RES-H/Lys+]open

the microscopic pKa calculated for states (i) and as well as the experimental apparent pKapp (7.8), we 47 calculated that if the pKa of the second state was shifted 48 to 9.8 (which is the value representing state (iii) in the 49 three-state model), it would result in a shift of pKapp to 50 10.8. Thus, the pKapp changes from 7.8 to 10.8 depending 51 on the identity of the second state, which is primarily 52 determined by the protonation state of Lys61. In other 53 words, we assumed that the protonation state of Lys61 is 54 coupled to the protonation state of the chromophore, 55 which is justified because that is the only titrating residue 56 in its vicinity. Hence, we interpolated the shift in the 57 pKapp and pKopen by accounting for the fraction of the un58 protonated Lys61. Thus, the corresponding pKapp and pKo59 pen are expected to vary as a function of pH and we can 60 substitute these pH-dependent terms to Eq. 2 to derive a 61 ratio of the open to closed states for the three-state mod62 el. This ratio is then approximated using a modified Eq. 63 (11) for Roc: 64

Roc = −

(10 (10

pH − pK open

− pK closed

+10 − pH )(10 − pKclosed −10 +10

− pH

)(10

pH − pK open

−10

pH − pK app pH − pK app

)

(11)

)

both pKpHopen and pKpHclosed are pH-dependent and 66 differ from their corresponding values in the two-state 67 model (pKopen=6.8 and pKclosed=31.7) by the delta term, 68 used to interpolate the pKa of the open conformation 69 from state (ii) to state (iii) by accounting for the coupling 70 between protonation states of the chromophore and 71 Lys61. 65 Where

24 open

5. pH-dependent F computed using two (A) and open 26 three (C) state model. Correlation with F estimated from 27 the pH-dependent absorption data is in (B) and (D), respec28 tively. 25 Figure

to the complexity of thermodynamic treatment of 30 three different conformations, two of which have residues 31 with strongly coupled protonation states (that of the 32 chromophore and that of the nearby Lys61), we simplified 33 the treatment of the three-state system into an effective 34 two-state system. In our analysis, this effective two-state 35 system corresponds to the bottom two arms of the ther36 modynamic cycle depicted in Figure 6, where the open 37 state conformations are collectively represented by states 38 (ii) and (iii) and where the protonation states of the 39 chromophore and Lys61 are tightly coupled.

72

pH pK open = pK open + ∆

pK

29 Due

pH app

= pK app + ∆

∆ = ( pK RES − pK open ) ×

(12) (13)

1

(14) 1+10 74 As shown in Figure 5D, our three-state model not only 75 improves the correlation with experimental observables 2 76 (with R =0.86), but also captures, both qualitatively and 77 quantitatively, the unusual bell-shaped pH-dependent 78 absorption profile of WAS (see Figure 5C). Thus, our re79 sults show that the open state, collectively represented by 80 the two locally solvated configurations and a partially 81 open β-barrel, is transiently populated and contributes a 82 small fraction at low pH (up to 12% at pH 6.1). This state 83 becomes dominant (as much as 53%) at mildly basic con84 ditions (pH=8.1), and gives rise to a strong absorption at 73

−( pH− pK Lys )

6 ACS Paragon Plus Environment

Page 7 of 17

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

1 494

Journal of the American Chemical Society nm (and, thus, green fluorescence), which then tiwith a pKa of 9.8 at higher pH values.

2 trates

3 Lastly,

our three-state model provides some useful ininto engineering pH-sensitive cyan fluorescent pro5 tein based on WasCFP. For example, to engineer a mutant 6 that does not possess the residual cyan fluorescence at 7 high pH, one may target the electrostatic environment in 8 the vicinity of Lys61 in a manner to prevent its re9 protonation at mildly basic conditions. Such a design 10 would suppress the population of the third state in our 11 three-state model by reducing the mechanism of the en12 gineered mutant to two interconverting states, and this 13 will increase the fraction of the open state with pKopen=6.8 14 (that only reaches a maximum of 53% at pH=8.1 in the 15 current WasCFP design). 4 sights

16 CONCLUSIONS 17 In

conclusion, we have applied a combination of the sampling method23 with a novel hy19 dration parameter and explicit solvent constant pH moMSλD 25 20 lecular dynamics (CPHMD ) to elucidate the origin 21 of the unusual non-monotonic pH-dependent absorption 22 behavior of a recently engineered CFP mutant that fea23 tures a pH-dependent shift between cyan and green fluo24 rescent forms. In earlier experimental observations, this 25 optical property was proposed to be controlled by the 26 charged anionic state of its tryptophan-containing chro21 27 mophore. Our calculations demonstrate that even in the 28 presence of the stabilizing V61K mutation, the free energy 29 cost of deprotonating the chromophore is still high and 30 does not allow for the existence of the charged state of 31 WasCFP even at basic pH. Instead, we propose the follow32 ing explanation: The distribution between two transiently 33 populated conformational states characterized by a par34 tially open β-barrel with local solvation around the chro35 mophore (that have pKa values of 6.8 and 9.8), relative to 36 the ground state crystallographic structure (that has pKa 37 of 31.7), is able to fully recapitulate both qualitatively and 38 quantitatively the unusual non-monotonic pH-dependent 39 properties of WasCFP. In this model, the open state is 40 transiently populated at low pH, but reaches a population 41 of 53% under mildly basic conditions, before losing its 42 dominance and reverting to a transient state under highly 43 basic conditions, and such mechanistic understanding 44 may be used to further engineer the pH-sensitive fluores45 cent properties of WasCFP. Therefore, our work not only 46 validates that tryptophan can be deprotonated in a bio47 logical system at mildly basic pH, but, more importantly, 48 shows that pH-dependent transient conformational states 49 are functionally relevant, and that they can tune the opti50 cal properties of fluorescent proteins. Such an outlook 51 will have implications in the rational design of fluorescent 52 proteins with pH-dependent optical properties. 18 weighted-ensemble

53 MATERIALS

AND METHODS

absorption properties. The geometries of neutral (protonated) and charged (deprotonated) syn56 thetic chromophore, referred to in this study as CRF, were 57 first optimized both in gas-phase and in implicit solvent 54 Calculating 55 the

58 of

high polarity (water) using a polarizable continuum (PCM)39 at the B3LYP/6-31+G* level, which has 60 previously been widely applied to model excitation prop61 erties of chromophores in various fluorescent 29,30,40 62 proteins. The calculations of vertical excitation en63 ergies and oscillator strengths for the five lowest energy 64 transitions were then performed on the optimized geome65 tries using the TDDFT method. Several scenarios consid66 ered during the optimization are reported in the Support67 ing Information (SI), Table S1. 59 model

input structures. The structures of a synthetic chromophore, along with its extended version RES 41 70 (see Figure 1B and Figure S1), were built using Avogadro. 71 Both residues were parameterized, as described in the SI 72 (see Section S2a). The RES structure, which is comprised 73 of the CRF covalently bound to Leu64 and Val68 of the 74 protein, was further selected as a model compound for MSλD 75 CPHMD simulations.25 The wild type peptide (WTP) 76 was built based on the mCerulean crystallographic struc20 77 ture (PDBID: 2WSO) using Leu60–Val61–Thr62–Thr63– 78 RES–Gln69–Cys70–Phe71 fragment. To generate a mutant 79 peptide (V61KP), Val in position 61 was mutated to Lys 42 80 using Pymol. Protein models were built in the same 81 fashion, with three additional substitutions (D148G, 82 Y151N, L207Q) incorporated in the mutant protein (WAS) 83 relative to the wild type (WT) form. Hydrogen atoms 43 84 were added using the HBUILD facility in CHARMM and 85 the structures were then solvated in a cubic box of explicit 44 86 TIP3P water molecules using convpdb.pl of the MMTSB + 87 toolset with a 12Å cutoff. The appropriate number of Na – 88 and Cl ions was added to maintain neutrality. The termi89 nal ends of both peptides and proteins were capped using 90 ACE and CT3 patches. 68 Building 69 CRF

91 An

additional patch was constructed to represent the (deprotonated) form of CRF. The same patch was MSλD 93 used in all subsequent TI and CPHMD calculations. 94 Atoms included in the patch were those that differed in 95 partial charges between protonated and deprotonated 96 forms of CRF (see Section S2, Figure S1). Those atoms 97 constitute the so-called titratable fragment. All corre98 sponding bond, angle and dihedral terms were explicitly 99 specified in the patch. A titratable residue was simulated 100 as a hybrid ligand with dual topology, built using the 101 BLOCK facility in CHARMM, and included all compo102 nents of the neutral and charged forms. Atoms not in103 cluded in the patch were considered environment atoms 104 and were the same for both neutral and charged forms. 92 charged

pKa of the model compound RES. In order obtain the pKa of RES, which was used as a model MSλD 107 compound in all CPHMD simulations, we performed 45 108 thermodynamic integration (TI) calculations, for the – 109 two deprotonation reactions (CRF-H/CRF and RES-H – 110 and RES ), as schematically shown in Figure 2 and S2. A 111 hybrid ligand containing patched atoms of the titratable 112 fragment was constructed for each case using the proce113 dure described above, and the simulations were run for 3 114 ns using nine discrete λ-windows (λ = 0.1, 0.2, 0.3, 0.4, 0.5, 115 0.6, 0.7, 0.8, 0.9) following a procedure similar to the one 46 116 previously used by Knight and Brooks. Distance re105 Calculating 106 to

7 ACS Paragon Plus Environment

Journal of the American Chemical Society

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

1 straints

57 rameterized

2 fragment

were applied to the pairs of atoms of the titratable using a force constant of 100 kcal/mol/Å2 with a 3 zero target distance between atoms in order to keep the 4 two (dual topology) fragments occupying the same vol5 ume. Each system was first minimized using two rounds 6 of 200 steps using Steepest Descent (SD) method, fol7 lowed by 2000 steps of Adopted Basis Newton-Raphson, 8 with weak harmonic restraints on heavy atoms (force 2 9 constants of 2.0 and 0.1 kcal/mol/Å for each run, respec10 tively) that were released prior to the heating stage. The 11 minimized system was then gradually heated from 100K 12 to 298K for 100 ps. During both the 1 ns-long equilibration 13 and 2 ns-long production runs the temperature was main14 tained at 298 K by coupling to a Langevin heat bath using –1 15 a friction coefficient of 10 ps . In all simulations hydrogen 16 heavy-atom bond lengths were constrained using the 47 17 SHAKE algorithm. The equations of motion were inte47 18 grated using the Leapfrog-Verlet algorithm with a time 19 step of 2 fs. A non-bonded cutoff of 12 Å was used with an 20 electrostatic force switching (“fswitch”) and a van der 21 Waals switching (“vswitch”) functions between 10 Å and 22 12 Å.

58 ly,

23 All

simulations were done within the BLOCK facility in All energy terms were scaled by λ, except for 25 the bonds, angles and dihedrals, which remained at full 26 strength regardless of the λ-value to retain physically rea27 sonable geometries. The post-processing of the trajecto28 ries was done in CHARMM and the free energies of RES 29 deprotonation for each system (ΔG and ΔGCRF) were 30 calculated by combining the dU/dλ values from all λ31 windows of the corresponding deprotonation reaction. RES 32 The pKa was then calculated based on the computed 33 ΔΔG as follows: 48 24 CHARMM.

34

∆∆G = ∆G RES − ∆G CRF (15) pK aRES = pK aCRF +

1 ∆∆G ln10kBT

using the same method as reported previousand were targeted to maintain both (i) good sampling 59 efficiency throughout the simulations (> 50 transitions 60 per ns; see Figure S2) and (ii) a high fraction of physical 61 ligand (~0.8). 62 To

facilitate the convergence of sampling of protonation we have also employed the pH-based replica ex64 change protocol, which allows the exchange between rep65 licas representing different pH conditions. The pKa values 66 for each relevant structure were computed by combining unprot 67 the populations of the unprotonated (N ) and protoprot 68 nated (N ) states in multiple runs and by fitting their unprot 69 ratio (S ) to the modified Henderson-Hasselbalch 70 equation: 63 states,

1 (17) 1+10−( pH − pK a ) 72 The protocol described above was applied to compute the 73 pKa values of the wild type and mutant peptides (WTP 74 and V61KP, respectively), as well as wild type and mutant 75 proteins (WT and WAS, respectively). The pH range var76 ied depending on the system: while the wild-type and 77 mutant peptides (WTP and V61KP, respectively), were 78 titrated in the pH range 1 to 16 (with an increment of 1), 79 the high-pKa closed wild-type and mutant protein con80 formations (WT and WAS) were sampled across the ex81 tended pH range (1 to 56 and 1 to 40, respectively). The 82 pKa of the open state was calculated using 16 pH replicas 83 (1 to 16). In all cases the calibrated RES with a pKa=12.7 84 was used as a model compound, and peptide/protein en85 vironments were assumed to perturb the pKa via non86 bonded interactions only. The full thermodynamic cycle 87 considered in this study can be found in Figure 2. 71

S unprot ( pH ) =

transiently populated states. The weighted algorithm (WE)23 enhances sampling of transi90 ently populated states by defining a set of regions, and 91 encouraging equal sampling in each region. This is 92 achieved by evolving a group of trajectories (replicas) 93 forward in time (in parallel) and periodically redistrib94 uting trajectories between the regions using cloning and 95 merging steps. To ensure physical sampling, each trajec96 tory has a statistical probability (or weight, wi) associated 97 with it such that the sum of all weights equals to one. 98 During cloning, weights of the parent trajectory are divid99 ed equally amongst the clones. When two trajectories A 100 and B are merged, the resulting replica has a weight of 101 wA+wB, and it takes on the structure of replica A with 102 probability wA/(wA+wB), and otherwise takes on the struc103 ture of replica B. 88 Sampling 89 ensemble

(16)

of the CPHMDMSλD method. In our explicit36 solvent constant pH molecular dynamics approach MSλD 25 37 (CPHMD ), the titration coordinate is represented by 38 λ, which serves as a scaling factor for each titratable resi39 due α, and is propagated continuously between the physi40 cally relevant protonated (λα,1 = 1) and unprotonated (λα,1 = MSλD 41 0) states. In CPHMD the dynamics of the λ-variable is 42 described using the multi-site λ-dynamics approach 49 43 (MSλD) that enables a coupling between λ and the MSλD 44 atomic coordinates. As such, in CPHMD , titratable 45 residues are model compounds perturbed by the protein 46 environment via non-bonded interactions. For more deMSλD 47 tails about the theory behind CPHMD simulation we 25,50 48 refer our readers to the following references. 35 Details

simulation setup for CPHMDMSλD simulations is simi45 50 lar to the one used in the TI (see above) . Only those 51 aspects that are different are reported below. In particuNexp 52 lar, the BLOCK facility was used with the λ functional 53 form of λ (FNEX) and a coefficient of 5.5. A fictitious mass 2 54 of 12 amuÅ was assigned to each θα value. The threshold 55 value for assigning λα,i = 1 was λα,i ≥ 0.8. The values of the fixed 56 F and Fvar biasing potentials (see Table S3) were pa49 The

Page 8 of 17

104 Here

the number of replicas is constant at 48, and regions defined using a novel hydration parameter, φ, defined 106 as follows: 105 are

107

ϕ = ∑ pi

(18)

i

108 Where 109 fined

the sum is over all water molecules i, and pi is deas:

8 ACS Paragon Plus Environment

Page 9 of 17

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

1

Journal of the American Chemical Society

 if di < 5Å  1; pi =  1− (di − 5) / 2; if 5Å< di < 7Å (19)  if di > 7Å  0;

50

Ryazantsev for useful discussions of related spectroscopic issues.

51

REFERENCES

49

Wang, T.-M.; Holzhausen, L. C.; Kramer, R. H. Nat. 2014, 17, 262–268. 54 (2) Tantama, M.; Hung, Y. P.; Yellen, G. J. Am. Chem. Soc. 55 2011, 133, 10034–10037. 56 (3) Ashby, M. C.; Ibaraki, K.; Henley, J. M. Trends 57 Neurosci. 2004, 27, 257–261. 58 (4) Llopis, J.; McCaffery, J. M.; Miyawaki, A.; Farquhar, M. 59 G.; Tsien, R. Y. Proc. Natl. Acad. Sci. USA 1998, 95, 6803–6808. 60 (5) Miesenbock, G.; De Angelis, D. A.; Rothman, J. E. 61 Nature 1998, 394, 192–195. 62 (6) Shcherbakova, D. M.; Sengupta, P.; Lippincott63 Schwartz, J.; Verkhusha, V. V. Annu. Rev. Biophys. 2014, 43, 303– 64 329. 65 (7) Chudakov, D. M.; Matz, M. V; Lukyanov, S.; Lukyanov, 66 K. A. Physiol. Rev. 2010, 90, 1103–1163. 67 (8) Hatta, K.; Tsujii, H.; Omura, T. Nat. Protoc. 2006, 1, 68 960–967. 69 (9) Ando, R.; Hama, H.; Yamamoto-Hino, M.; Mizuno, H.; 70 Miyawaki, A. Proc. Natl. Acad. Sci. USA 2002, 99, 12651–12656. 71 (10) Sekhar, A.; Kay, L. E. Proc. Natl. Acad. Sci. USA 2013, 72 110, 12867–12874. 73 (11) Baldwin, A. J.; Kay, L. E. Nat. Chem. Biol. 2009, 5, 808– 74 814. 75 (12) Korzhnev, D. M. J. Mol. Biol. 2013, 425, 17–18. 76 (13) Korzhnev, D. M.; Religa, T. L.; Banachewicz, W.; 77 Fersht, A. R.; Kay, L. E. Sci. 2010, 329 , 1312–1316. 78 (14) Vallurupalli, P.; Hansen, D. F.; Stollar, E.; Meirovitch, 79 E.; Kay, L. E. Proc. Natl. Acad. Sci. USA 2007, 104, 18473–18477. 80 (15) Fraser, J. S.; van den Bedem, H.; Samelson, A. J.; Lang, 81 P. T.; Holton, J. M.; Echols, N.; Alber, T. Proc. Natl. Acad. Sci. 82 USA 2011, 108, 16247–16252. 83 (16) Tzeng, S.-R.; Kalodimos, C. G. Nat. Chem. Biol. 2013, 9, 84 462–465. 85 (17) Lorieau, J. L.; Louis, J. M.; Schwieters, C. D.; Bax, A. 86 Proc. Natl. Acad. Sci. USA 2012, 109, 19994–19999. 87 (18) Long, D.; Bouvignies, G.; Kay, L. E. Proc. Natl. Acad. 88 Sci. USA 2014. 89 (19) Goh, G. B.; Laricheva, E. N.; Brooks, C. L. J. Am. Chem. 90 Soc. 2014. 91 (20) Lelimousin, M.; Noirclerc-Savoye, M.; Lazareno-Saez, 92 C.; Paetzold, B.; Le Vot, S.; Chazal, R.; Macheboeuf, P.; Field, M. 93 J.; Bourgeois, D.; Royant, A. Biochemistry 2009, 48, 10038–10046. 94 (21) Sarkisyan, K. S.; Yampolsky, I. V; Solntsev, K. M.; 95 Lukyanov, S. A.; Lukyanov, K. A.; Mishin, A. S. Sci. Rep. 2012, 2, 96 1–5. 97 (22) Shaner, N. C.; Campbell, R. E.; Steinbach, P. A.; 98 Giepmans, B. N. G.; Palmer, A. E.; Tsien, R. Y. Nat Biotech 2004, 99 22, 1567–1572. 100 (23) Huber, G. A.; Kim, S. Biophys. J. 1996, 70, 97–110. 101 (24) Dickson, A.; Warmflash, A.; Dinner, A. R. J. Chem. 102 Phys. 2009, 131, -. 103 (25) Goh, G. B.; Hulbert, B. S.; Zhou, H.; Brooks, C. L. 104 Proteins: Struct., Funct., Bioinf. 2014, DOI: 10.1002/prot.24499. 105 (26) Goh, G. B.; Knight, J. L.; Brooks, C. L. J. Phys. Chem. 106 Lett. 2013, 4, 760–766. 107 (27) Goh, G. B.; Knight, J. L.; Brooks, C. L. J. Chem. Theory 108 Comput 2013, 9, 935–943. 109 (28) Nikolova, E. N.; Goh, G. B.; Brooks, C. L.; Al-Hashimi, 110 H. M. J. Am. Chem. Soc. 2013, 135, 6766–6769. 111 (29) Demachy, I.; Ridard, J.; Laguitton-Pasquier, H.; 112 Durnerin, E.; Vallverdu, G.; Archirel, P.; Lévy, B. J. Phys. Chem. B 113 2005, 109, 24121–24133. 52

2 Here,

di is the closest distance from atom N28 of the 3 chromophore (RES) to an atom in the water molecule i. 4 Regions were defined along this parameter with a spacing 5 of 0.2, and were defined dynamically over the course of 6 the WE simulation. A “cycle” consists of 20 ps of sam7 pling for all 48 walkers followed by cloning and merging 8 steps, and the simulation was continued for 490 cycles, at 9 which point region discovery had slowed almost to a halt. 10 The aggregate simulation time employed is thus 9.8 ns 11 per walker, or 470 ns. Sampling was performed using 12 CHARMM, implemented on graphics processing units 51 13 using OpenMM5.2. Otherwise the simulation is imple14 mented as specified above. 15 Before

computing the histogram of WAS in Figure 3, a is computed for each sampling region using a ma17 trix-based weight balancing algorithm, which finds the 18 vector of region weights (R) by solving the equation TR = 24 19 R, where T is the transition matrix. The histogram is 20 then built using a list of replica weights and hydration 21 parameter values collected during the WE simulation, 22 where the replica weights are multiplied by factor m, 23 which is defined as: 16 weight

24

mi =

ri ∑ wA

(20)

εi

25 Where

ri is the region weight determined by the matrix and the denominator is the sum of the weights 27 of all replicas in the list that are in region i. 26 equation,

obtain starting structures for subsequent CPHMDMSλD 29 simulations, a set of regions of the histogram with signifi30 cant probabilities (p > 0.001) is chosen that represents a 31 spectrum of hydration values: 1.5-2.5, 5.5-8.0, 12-13 and 32 14.5-15. We then use as a starting point the structure rec33 orded with the largest replica weight. 28 To

34

ASSOCIATED CONTENT

35

Supporting Information

Details of a computational protocol. This material is available free of charge via the Internet at 38 http://pubs.acs.org 36 37

39

AUTHOR INFORMATION

40

[email protected]

41 42

Notes

43

The authors declare no competing financial interests.

44

ACKNOWLEDGMENT

This work was supported by grants from NIH (GM037554 and GM057513), and the Howard Hughes 47 Medical Institute (HHMI International Student Re48 search Fellowship). E.N.L. would like to thank Dr. Mike 45 46

(1)

53 Neurosci.

9 ACS Paragon Plus Environment

Journal of the American Chemical Society

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

(30) Nifosí, R.; Amat, P.; Tozzini, V. J. Comput. Chem. 2007, 2366–2377. 3 (31) Wanko, M.; Hoffmann, M.; Strodel, P.; Koslowski, A.; 4 Thiel, W.; Neese, F.; Frauenheim, T.; Elstner, M. J. Phys. Chem. B 5 2005, 109, 3606–3615. 6 (32) Kütt, A.; Leito, I.; Kaljurand, I.; Sooväli, L.; Vlasov, V. 7 M.; Yagupolskii, L. M.; Koppel, I. A. J. Org. Chem. 2006, 71, 2829– 8 2838. 9 (33) Vazdar, K.; Kunetskiy, R.; Saame, J.; Kaupmees, K.; 10 Leito, I.; Jahn, U. Angew. Chem., Int. Ed. 2014, 53, 1435–1438. 11 (34) Goedhart, J.; von Stetten, D.; Noirclerc-Savoye, M.; 12 Lelimousin, M.; Joosen, L.; Hink, M. A.; van Weeren, L.; Gadella, 13 T. W. J.; Royant, A. Nat. Commun. 2012, 3, 751. 14 (35) Regmi, C. K.; Bhandari, Y. R.; Gerstman, B. S.; 15 Chapagain, P. P. J. Phys. Chem. B 2013, 117, 2247–2253. 16 (36) Huang, J.; Craggs, T. D.; Christodoulou, J.; Jackson, S. 17 E. J. Mol. Biol. 2007, 370, 356–371. 18 (37) Pozzi, E. A.; Schwall, L. R.; Jimenez, R.; Weber, J. M. J. 19 Phys. Chem. B 2012, 116, 10311–10316. 20 (38) Laurent, A. D.; Mironov, V. A.; Chapagain, P. P.; 21 Nemukhin, A. V; Krylov, A. I. J. Phys. Chem. B 2012, 116, 12426– 22 12440. 23 (39) Tomasi, J.; Mennucci, B.; Cammi, R. Chem. Rev. 2005, 24 105, 2999–3094. 25 (40) Nemukhin, A. V; Topol, I. A.; Burt, S. K. J. Chem. 26 Theory Comput. 2005, 2, 292–299. 27 (41) Marcus D Hanwell, Donald E Curtis, David C Lonie, 28 Tim Vandermeersch, E. Z. and G. R. H. J. Cheminform. 2012, 4, 1– 29 17. 30 (42) The PyMOL Molecular Graphics System, Version 31 1.2r3pre. 32 (43) Brooks, B. R.; Bruccoleri, R. E.; Olafson, B. D.; States, 33 D. J.; Swaminathan, S.; Karplus, M. J. Comput. Chem. 1983, 4, 34 187–217. 35 (44) Feig, M.; Karanicolas, J.; Brooks III, C. L. J. Mol. Graph. 36 Model. 2004, 22, 377–395. 37 (45) Mezei, M.; Beveridge, D. L. Ann. N. Y. Acad. Sci. 1986, 38 482, 1–23. 39 (46) Knight, J. L.; Brooks, C. L. J. Comput. Chem. 2009, 30, 40 1692–1700.

(47)

1

41

2 28,

42 341.

Page 10 of 17

Mezei, M.; Beveridge, D. L. J. Comp. Phys. 1977, 23, 327–

(48) Brooks, B. R.; Brooks, C. L.; Mackerell, A. D.; Nilsson, Petrella, R. J.; Roux, B.; Won, Y.; Archontis, G.; Bartels, C.; 45 Boresch, S.; Caflisch, A.; Caves, L.; Cui, Q.; Dinner, A. R.; Feig, 46 M.; Fischer, S.; Gao, J.; Hodoscek, M.; Im, W.; Kuczera, K.; 47 Lazaridis, T.; Ma, J.; Ovchinnikov, V.; Paci, E.; Pastor, R. W.; 48 Post, C. B.; Pu, J. Z.; Schaefer, M.; Tidor, B.; Venable, R. M.; 49 Woodcock, H. L.; Wu, X.; Yang, W.; York, D. M.; Karplus, M. J. 50 Comput. Chem. 2009, 30, 1545–1614. 51 (49) Knight, J. L.; Brooks, C. L. J. Chem. Theory Comput. 52 2011, 7, 2728–2739. 53 (50) Goh, G. B.; Knight, J. L.; Brooks, C. L. J. Chem. Theory 54 Comput. 2011, 8, 36–46. 55 (51) Eastman, P.; Friedrichs, M. S.; Chodera, J. D.; Radmer, 56 R. J.; Bruns, C. M.; Ku, J. P.; Beauchamp, K. A.; Lane, T. J.; Wang, 57 L.-P.; Shukla, D.; Tye, T.; Houston, M.; Stich, T.; Klein, C.; Shirts, 58 M. R.; Pande, V. S. J. Chem. Theory Comput. 2012, 9, 461–469. 43

44 L.;

59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74

75

76

Insert Table of Contents artwork here

10 ACS Paragon Plus Environment

B

A

Leu64

Page Journal 11 of of 17the American Chemical Society Lys61

1 2 3 4 5 6 7 8

Val68

151

148 61 207

Trp66

C

pKa=12.4

HN

ACS Paragon Plus Environment O N

H3 C

N

CH3

CRF-H

–N

O

N H3 C

N

CH3

CRF–

pKa = 12.4

Journal of the American Chemical Page Society 12 of 17 CRF-H

1 2 3 4 5 6 7 8 9 10 11 V61KP-H 12 13 WAS-H 14

CRF– pKa = 12.7

RES-H

RES–

pKa = 13.8

pKa = 9.8

V61KP–

WTP-H

WAS–

WT-H

ACS Paragon Plus Environment pKa = 31.7

pKa = 52.1

WTP–

WT–

0.1

Probability

Page Journal 13 of of 17the American Chemical Society 0.01 0.001

1 0.0001 2 =2 3 1e-05 pK =18.7 0 2 4 6 8 10 12 14 16 4 Hydration parameter (φ) 5 6 7 8 9 ACS Paragon Plus Environment 10 =7 =12 =15 11 pK =10.5 pK =8.7 pK =6.8 a

a

a

a

C -C Society distance, Å 17 Journal of the American Chemical Page 14 of 7 strand 10 strand Closed state, pH 6.1

1 2 3 4 5 6 β7 7 β10 8

Open state, pH 8.1

Ala 145

Ala 206

4.8

7.9

Ile 146

Ser 205

5.5

8.2

Ser 147

Gln 204

4.3

7.7

Gly 148

Thr 203

5.6

8.8

Asn 149

Ser 202

4.6

5.6

Val 150

Leu 201

4.9

5.5

Asn 151Plus Tyr 200 4.4 ACS Paragon Environment

4.3

Leu 152

His 199

5.5

6.0

Thr 153

Asn 198

4.8

5.7

A

1.0

comp.

0.8 0.6 0.4

1.0

0.6 0.4 0.2

0.0

0.0

6

7

8

pH

9

10

D

exp. comp.

0.5

Calc. Fopen

0.4 0.3 0.2

0.4

pH

10

Exp. Fopen

0.6

R2=0.86

0.2

0.0

9

0.4

0.3

0.0

8

0.2

0.5

0.1 7

0.0

0.6

0.1 6

R2=0.53

0.8

0.2

C 0.6

Fopen

1.2

Journal of the American Chemical Society

Calc. Fopen

Fopen

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43

B

exp.

Page 15 of 17

0.0

ACS Paragon Plus Environment

0.2

Exp. F

0.4

open

0.6

pKRES

Journal of the American Chemical Page Society 16 of 17 [RES-H/Lys] [RES/Lys] open open 1 pKLys 2 3 pKopen + 4 [RES-H/Lys ]open [RES/Lys+]open 5 6 7 Kprot K ACS Paragon Plus Environment deprot 8 9

+] [RES-H/Lys 10 closed

pKclosed

[RES/Lys+]closed

pH 8.1 ca. 53%

pH 6.1 ca. 13%

urnal Page of17 theofAmerican 17 Chemical Socie WE + CPHMDMSλD

1 2ACS Paragon Plus Environment 3 4 λ = 437 nm λ = 494 nm λ = 505 nm 5 λ = 477 nm a

a

fl

fl