Unraveling Energy and Dynamics Determinants to ... - ACS Publications


Article

Unraveling energy and dynamics determinants to interpret protein functional plasticity: the limonene-1,2-epoxide-hydrolases case study Silvia Rinaldi, Alessandro Gori, Celeste Annovazzi, Erica Elisa Ferrandi, Daniela Monti, and Giorgio Colombo J. Chem. Inf. Model., Just Accepted Manuscript • DOI: 10.1021/acs.jcim.6b00504 • Publication Date (Web): 15 Mar 2017 Downloaded from http://pubs.acs.org on March 18, 2017


Journal of Chemical Information and Modeling is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Journal of Chemical Information and Modeling

Unraveling energy and dynamics determinants to interpret protein functional plasticity: the limonene-1,2-epoxide-hydrolase case study

Silvia Rinaldi, Alessandro Gori, Celeste Annovazzi, Erica E. Ferrandi, Daniela Monti, and Giorgio Colombo*

Istituto di Chimica del Riconoscimento Molecolare, C.N.R., Via Mario Bianco 9, 20131 Milano, Italy

*Corresponding author: Giorgio Colombo, Istituto di Chimica del Riconoscimento Molecolare, CNR; via Mario Bianco 9, 20131 Milano, Italy. E-mail: [email protected]. Tel: +39-02-28500031, Fax: +39-02-28901239.

ABSTRACT

The balance between structural stability and functional plasticity in proteins that share common three-dimensional folds is the key factor that drives protein evolvability. The ability to distinguish the parts of homologous proteins that underlie common structural organization patterns from the parts acting as regulatory modules that can sustain modifications in response to evolutionary pressure may provide fundamental insights for understanding sequence-structure-dynamics relationships. In practical terms, this would help develop rational protein design methods. Herein, we apply recently developed computational methods, validated by experimental tests, to address these questions in a set of homologous enzymes representative of the limonene-1,2-epoxide-hydrolase (LEH) family and characterized by different stabilities, namely Rhodococcus erythropolis-LEH (Re-LEH), Tomsk-LEH, CH55-LEH and the two thermostable Re-LEH variants Re-LEH-F1b and Re-LEH-P. Our results show that these enzymes, despite significant sequence variations, exploit a few highly conserved stabilization determinants to guarantee structural stability linked to biological functionality. Multiple sequence analysis shows that these key elements are also shared by a larger set of LEH structural homologues, despite very low sequence identity and functional diversity. In this framework, stabilizing elements that we hypothesize to have an accessory role are characterized by a lower degree of sequence identity and higher mutability. We suggest that our approach can be successfully used to pinpoint the distinctive energy fingerprint of a class of proteins as well as to locate those modulators whose modification could be exploited to tune protein stability and dynamic properties.

Introduction

The properties of proteins are ultimately determined by the linear sequences of amino acids that evolution selected to encode for thermodynamically stable functional three-dimensional arrangements.1 Indeed, folding sequences represent a small subset of the large space of all possible candidates that can populate a structure performing a certain function.2 The relationships between sequence and structure have been extensively investigated by analyzing the detailed physico-chemical determinants of the folding and stability of several model proteins.3–5

Theoretical and experimental analyses have indicated that stability in relation to a certain function (catalysis, ligand binding, transport, etc.) is the key factor in selecting a certain structure. In other words, a fold is evolutionarily fit as long as it is stable enough to allow the protein to perform the desired biochemical function.6,7 Such protein stability should in principle confer tolerance to (a certain number of) mutations and, as a consequence, may be instrumental to protein evolvability and adaptation to different conditions. Importantly, the accumulation of mutations while maintaining structural stability may result in entirely new functions.8–10

Understanding which parts of a protein contribute the most to stability still represents an important open issue for both practical and fundamental reasons. From the practical point of view, the ability to introduce mutations that modulate the stability of a protein while maintaining a certain 3D organization can impact the development of new catalysts that carry out non-natural reactions or are particularly (un)stable in specific conditions. From the fundamental point of view, determining the necessary requirements for stabilization can help improve our understanding of how sequence differences may modulate functionally oriented properties while preserving a certain 3D organization. This latter aspect is strictly connected to the regulation of conformational plasticity, whereby the fine-tuning of a certain function often depends on the coexistence and selection of different dynamic states.11

In this paper, we set out to address various aspects of this problem by analyzing and comparing the atomistic simulations of three recently crystallized, structurally homologous representatives of the limonene-1,2-epoxide-hydrolase (LEH) family, namely Rhodococcus erythropolis-LEH (Re-LEH), Tomsk-LEH and CH55-LEH. These enzymes belong to the epoxide hydrolase (EH) superfamily, whose members catalyze the hydrolysis of an oxirane ring by water addition. Most EHs share high sequence and structure similarity, exhibiting a classical α/β-hydrolase fold.12,13 Recently, new EH members showing atypical reactivity and structural properties have been discovered. This family was named after the natural substrate (limonene-1,2-epoxide) of the first isolated member, the Rhodococcus erythropolis LEH.14 Re-LEH can have several attractive applications in industrial synthesis, showing sequential and enantio-convergent conversion of different isomers.15 In order to obtain enzymes with improved stability, a metagenomics approach was recently exploited to identify new LEHs. CH55-LEH and Tomsk-LEH were isolated from hot springs in China and Russia, respectively, both sampled at moderately high temperatures (46 and 55 °C). Indeed, both Tomsk-LEH and CH55-LEH show higher temperature optima than Re-LEH,16 while from a structural point of view they share the typical Re-LEH 3D fold. The three enzymes form a stable homodimeric organization, where each monomer features a highly curved six-stranded mixed β-sheet, with four α-helices packed onto it to create a deep cavity. Despite the structural similarity, the three LEHs display low sequence homology: Tomsk-LEH and CH55-LEH show 25% and 31% sequence identity with Re-LEH, respectively, whereas they share 48% identity with each other.

Therefore, the three LEHs represent an interesting case study, where the balance between a conserved 3D fold and sequence differences tunes protein properties. Understanding this relationship could help both to gain insight into the functional significance of protein structural stability and to design improved variants for industrial applications. To further test the reliability of our approach, two highly thermostable variants of Re-LEH were included in the study. Re-LEH-P and Re-LEH-F1b are two mutant enzymes obtained by the FRESCO approach in the Janssen lab,17 where the authors could increase the apparent melting temperature by 20 °C and 35 °C, respectively, by means of a limited number of point mutations and, in the case of Re-LEH-F1b, the introduction of four additional disulfide bonds that stabilize the dimer.

Hence our aim is in particular to identify the minimal, shared key determinants of LEH structural stabilization by the detection of common hotspot residues that may underlie the observed differences in this family of enzymes. We then attempt to exploit this information to define a broader link between sequence differences, the onset of different dynamic/plasticity properties and thermostability.

To reach this goal, we have applied methods for the analysis of protein dynamics and energetics that we recently developed.18–20 The work presented here also serves as a test of the ability of such computational methods to provide information that can be transferred to the experimental realm. Briefly, sets of common residues hypothesized to be fundamental for controlling plasticity and stability are identified. First, different rigidity/flexibility regimes are correlated to differences in the enzymes' thermophilicity. Next, comparative analysis of the distribution of internal energies pinpoints a limited number of interactions as the key determinants (hotspots) of structural stabilization, which are strongly conserved in the different members of the enzyme family. On the basis of this evidence, we suggest that these hotspots act as scaffolds upon which the stable 3D arrangement of the enzyme is built, while the rest of the sequence is modified to different extents to modulate activity and adaptability. The results of the computational analysis are compared to those obtained through MM-GBSA and computational alanine scanning, and experimentally probed by point mutations, enzyme expression and activity measurements. Finally, the general validity of the observed differential roles of specific sequence-structure combinations is investigated through the analysis of a larger set of homologs. Importantly, structural hotspots are found to be highly conserved in a subset of proteins sharing the 3D architecture common to LEHs, despite low sequence similarities.

Materials and Methods

The experimental procedures for the preparation and characterization of mutants are reported in the Supporting Information.

Theoretical Calculations

The crystallographic structures of Re-LEH, Tomsk-LEH, CH55-LEH, Re-LEH-P and Re-LEH-F1b (PDB codes 1NU3, 5AIF, 5AIH, 4R9K and 4R9L, respectively) were refined with the Schrödinger Maestro suite (Release 2013-1-9.4, Schrödinger, LLC, New York, NY, 2013). MD simulations were performed using the AMBER 12.0 package with the AMBER ff99SB force field.21 For computational details on model preparation, see the SI section.

Distance fluctuations (DF) matrix and Local Flexibility (LF). DF analysis describes the dynamic coordination between any two residues.20 It is defined as the mean square fluctuation, over time, of the distance $r_{ij}$ between the Cα atoms of residues i and j:

$$\mathrm{DF}_{ij} = \left\langle \left( r_{ij} - \langle r_{ij} \rangle \right)^{2} \right\rangle$$

where the brackets indicate the time average over the trajectory.

Local flexibility is used to assess the intrinsic plasticity properties of a protein undergoing structural fluctuations. LF is obtained by calculating the DF of neighboring residues j comprised in the interval (i - 2, i + 2) along the sequence.
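As an illustration of the two definitions above, the following minimal Python sketch computes a DF matrix and an LF profile from an array of Cα coordinates. It is not the code used in this work: the function names, the array layout (frames × residues × 3) and the choice to average DF over the sequence neighbors in the window, excluding residue i itself, are assumptions made for the example.

```python
import numpy as np

def distance_fluctuation_matrix(ca_coords):
    """DF[i, j] = <(r_ij - <r_ij>)^2>, averaged over trajectory frames.

    ca_coords: array of shape (n_frames, n_residues, 3), C-alpha positions in Å
    (hypothetical input layout chosen for this sketch).
    """
    coords = np.asarray(ca_coords, dtype=float)
    # Pairwise C-alpha distances in every frame: (n_frames, n_res, n_res).
    # Memory grows as n_frames * n_res^2, so long trajectories need chunking.
    diff = coords[:, :, None, :] - coords[:, None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # The population variance over frames is the mean square fluctuation.
    return dist.var(axis=0)

def local_flexibility(df, half_window=2):
    """Assumed LF: mean DF between residue i and its neighbors in (i-2, i+2).

    Requires at least two residues; windows are truncated at the chain ends.
    """
    n = df.shape[0]
    lf = np.zeros(n)
    for i in range(n):
        lo, hi = max(0, i - half_window), min(n, i + half_window + 1)
        neighbors = [j for j in range(lo, hi) if j != i]
        lf[i] = df[i, neighbors].mean()
    return lf
```

Low DF values between two residues then correspond to quasi-rigid coordination, and a high LF value flags a locally deformable stretch of the chain.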

Energy Decomposition Method (EDM). The Energy Decomposition Method is based on the calculation of the matrix Enb of non-bonded interaction energies (namely van der Waals and electrostatic) between residue pairs. Stabilizing hotspots are identified through the eigenvalue decomposition of the Enb matrix of pair interactions. Such decomposition is carried out on the representative structure of the most populated conformational cluster obtained from cluster analysis of the MD trajectories described above. This approach has previously been shown to reproduce the behavior of the same decomposition carried out by calculating non-bonded interactions on all the structures from the whole simulation and then evaluating the average matrix.18,19,22 The highest components of the principal eigenvector (associated with the lowest eigenvalue) identify crucial amino acids necessary to stabilize a certain protein conformation. See references 18,19 for further details.
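The eigenvector-based selection described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function name, the `top_k` cutoff and the explicit symmetrization step are hypothetical, and the non-bonded energy matrix is assumed to have been computed beforehand with a force field.

```python
import numpy as np

def edm_hotspots(e_nb, top_k=10):
    """Rank residues by their weight in the lowest-eigenvalue eigenvector.

    e_nb: (n_res, n_res) matrix of pairwise non-bonded energies
    (van der Waals + electrostatic), assumed precomputed.
    Returns (lowest eigenvalue, |components|, indices of the top_k residues).
    """
    e = np.asarray(e_nb, dtype=float)
    e = 0.5 * (e + e.T)                    # guard against numerical asymmetry
    eigvals, eigvecs = np.linalg.eigh(e)   # eigenvalues in ascending order
    principal = np.abs(eigvecs[:, 0])      # eigenvector of the lowest eigenvalue
    ranked = np.argsort(principal)[::-1][:top_k]
    return eigvals[0], principal, ranked
```

For a toy matrix in which only residues 0 and 1 interact strongly and attractively, the two largest components of the returned profile fall on those two residues, which is the behavior the hotspot definition relies on.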

Results and Discussion

Internal dynamics and coordination analysis. We started our comparative analysis of the hydrolases by dissecting those protein regions whose internal dynamics could define common features as well as pinpoint distinctive elements. For this purpose, we analyzed long-range communication networks through the calculation of the distance fluctuations (DF) matrix between residue pairs: any two residues are defined to be quasi-rigidly coordinated if their DF is low.20 The DF matrices show a common general pattern among the five LEHs (see figure 1). In particular, each of the five systems exhibits a characteristic coordination bubble (red square). Mapping this bubble onto the structures reveals that this pattern is due to the internal coordination among the β3, β4 and β5 strands. This region corresponds to a rigid core that defines the internal coordination within the proteins. The comparison of the matrices highlights some interesting differences. The insertion of the stabilizing mutations in Re-LEH clearly alters the global internal dynamics of the enzyme. The Re-LEH-P variant indeed shows higher rigidity and an extended coordination pattern with respect to the Re-LEH wild type; this is particularly remarkable in the terminal regions of each monomer. This trend is further strengthened by the introduction of the disulfide bridges, as distinctively pinpointed by the DF profile of the Re-LEH-F1b mutant, where the yellow lines (low coordination) corresponding to the N- and C-terminal tails, visible in the Re-LEH DF matrix, almost disappear.

Figure 1. DF matrices of Re-LEH, Re-LEH-P, Re-LEH-F1b, Tomsk-LEH and CH55-LEH. The red square shows the zoom on the coordination bubble. The β3, β4 and β5 structural domains are mapped onto the protein according to the respective color code. The yellow and dark blue areas in the matrices are associated with flexible and rigid regions, respectively. The color bar on the right reports the intensity (Å²) of the fluctuations.

The comparison of the dynamic behaviors of the three wild-type LEHs shows highly coordinated internal dynamics for CH55-LEH, while Re-LEH has a larger number of flexible subdomains. Tomsk-LEH appears to be an intermediate case.

Overall, these rigidity/coordination trends parallel experimental thermostability data16: the increased flexibility in the wild-type Re-LEH may point to a system more prone to unfolding, whereas the higher mechanical coordination of the two mutants and CH55-LEH favors their stability. From the DF analysis, Re-LEH-F1b turns out to be the most stable LEH, which also reflects its remarkable experimental thermostability.

Next, we set out to locate the structural elements responsible for the different flexibilities. The aim here is to identify which substructures are exploited to tune the dynamic properties specific for each enzyme, despite the highly conserved and shared structural patterns.10

To this end we calculated and compared the local flexibility (LF) parameter. This analysis provides information on the average deformation that is locally experienced by stretches of residues. Figure S3 shows that the five hydrolases share very similar profiles; the main difference arises from the higher flexibility of the Re-LEH N-terminal tails (missing in both CH55-LEH and Tomsk-LEH). The mobility of the N-terminal tails is reduced in the two Re-LEH mutants, where two residues per monomer have been mutated (S15P and A19K) in both variants and one disulfide bond has been inserted (I5C-E84C) in the Re-LEH-F1b case. Therefore, the loop close to the N-terminus turns out to be an important modulator, whose function is related to increased flexibility, confirming the results of Janssen's group, whereby Re-LEH thermostability was significantly enhanced by means of multiple mutations that kinetically stabilize this region.23

Energy decomposition reveals common stabilizing hotspots. Next, we set out to identify the stabilization determinants of the LEHs using the Energy Decomposition Method (EDM).18 This analysis is aimed at verifying whether common sets of residues can be identified as shared stabilizing hotspots among the five enzymes, even in the face of low homology. EDM is designed to detect energetic residue couplings that are relevant mainly for the enthalpic contribution to the stabilization of a certain 3D structure, providing a simplified view of the matrix of residue-residue pair interactions. Although this method has previously been validated against several experimental data,19,24 we report in the SI the comparison with two other widely used methods, namely MM-GBSA decomposition and alanine scanning analysis. The convergence of the results supports the validity of our approach and conclusions.

Since we are interested in determining whether a conserved 3D structure can reflect a well-defined energy signature, we set out to investigate the relationship between the distribution of stabilizing determinants and the shared 3D fold of the hydrolases. Therefore, we used the POSA server25 to perform multiple protein structure alignment and find the common regions among the five LEHs (defined as blocks, see SI). The latter were then used to simplify and filter the EDM energy matrices. Namely, only the residues belonging to the common regions (blocks) are considered in the analysis. Furthermore, an averaged EDM value accounts for the contribution of each block to protein stability. The POSA alignment returns six blocks (per monomer) shared by the five hydrolases, where each block delimits the energy pattern of a common structural region. Therefore, the diagonal of the matrix describes the contribution of each structural domain to the whole stabilization energy, whereas off-diagonal elements encode the energetic coupling among the different blocks. As a caveat, it must be considered again that this approach reports mainly on the enthalpic contribution of different substructures to the stabilization of the proteins. To correctly evaluate the different contributions to the full free energy, we should be able to properly consider the most relevant ensemble of conformations in the unfolded states of the different proteins: this task is, however, out of reach for our current capabilities. Moreover, it is worth noting that the proteins studied here are dimers, which adds a further layer of complexity to the description of non-native states. Therefore, using a simplified method (like EDM) based on information available for the native state appears to be a viable solution.
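The block filtering and averaging step can be illustrated with a short Python sketch. The exact averaging used in this work is not spelled out in the text, so the plain arithmetic mean over all residue pairs of two blocks is an assumption, as are the function name and the representation of blocks as lists of residue indices.

```python
import numpy as np

def block_average(e_nb, blocks):
    """Coarse-grain a residue-residue energy matrix onto structural blocks.

    e_nb:   (n_res, n_res) pair-interaction matrix.
    blocks: list of lists of residue indices (e.g. the aligned common regions);
            residues outside every block are simply ignored.
    Returns an (n_blocks, n_blocks) matrix whose diagonal summarizes each
    block and whose off-diagonal terms summarize block-block coupling.
    """
    e = np.asarray(e_nb, dtype=float)
    n = len(blocks)
    out = np.zeros((n, n))
    for a, res_a in enumerate(blocks):
        for b, res_b in enumerate(blocks):
            # Assumed averaging: mean over all residue pairs (i in a, j in b).
            out[a, b] = e[np.ix_(res_a, res_b)].mean()
    return out
```

With six blocks per monomer, a dimer would yield a 12 × 12 matrix in this representation, which is far easier to compare across enzymes of different length than the full residue-level matrices.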

Corroborating the DF outcome, this analysis points out that all the LEHs share a common folding core defined by the β4, β5 and β6-α4 domains (yellow, green and magenta, respectively, in figure 2), which we label core A (black square). This core engages in a stabilizing interaction with β3 (orange block), broadening into a second layer of stabilization, core B (white square). This folding unit, responsible for the stabilization within each monomer, also accounts for the main energetic coupling between the two monomers in the dimer.


Figure 2. Enb matrices for Re-LEH, Re-LEH-P, Re-LEH-F1b, Tomsk-LEH and CH55-LEH. Shared structural domains, identified by colored bars in the matrices, are mapped onto the protein according to the respective color code (bottom left). The relevance of the stabilization contribution within the system decreases from blue to yellow.

Moreover, the matrices point to potentially different factors in the fold-stabilization mechanism exploited by Tomsk-LEH, CH55-LEH and Re-LEH. The first two LEHs show a more extended stabilizing core, where each subset significantly contributes to the total folding nucleus. In contrast, Re-LEH shows a more polarized pattern within each monomer profile. These differences help explain the higher thermostability shown by Tomsk-LEH and CH55-LEH16: extended and homogeneous folding nuclei, where several structural domains contribute to the folding energy, could more efficiently stabilize a protein, whereas a localized core, as in Re-LEH, may be related to a lower global stability. In this framework, the extended cooperative interaction among multiple stabilization elements could result in systems that are more prone to adapt to changes by withstanding mutations whose impact on the 3D organization is minimized. The same trend can be recognized within the subset of Re-LEH variants. Indeed, the comparison of the three Re-LEH matrices shows a tendency, going from the less stable Re-LEH towards the more stable Re-LEH-F1b, to extend the stabilization nuclei within each monomer. At the same time, an effect of the mutations on the homodimer symmetry can be observed. The Re-LEH-P energy matrix indicates that the mutations induce a significant stabilization of monomer A. This pattern is further reinforced in Re-LEH-F1b. This observation finds a structural explanation when comparing the most representative conformations of the three Re-LEH variants obtained by cluster analysis of the MD trajectories. While the mutations do not alter the distance between the two centroids described by the interface residues of each monomer, they do affect their relative orientation. In fact, the two mutants undergo a closing motion around the plane of the dimer interface, which results in a rotation between the two principal axes of each monomer (63.9°, 62.6° and 60.7° for Re-LEH, Re-LEH-P and Re-LEH-F1b, respectively) and eventually yields a more compact arrangement of the mutants.
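One way to quantify the relative orientation of two monomers is sketched below in Python. The text does not specify how the principal axes were obtained, so this example makes its own assumptions: each axis is taken as the direction of largest spread of the (unweighted) atomic coordinates, and the angle is folded into the 0–90° range because an axis has no intrinsic sign. The function names are hypothetical.

```python
import numpy as np

def principal_axis(coords):
    """Unit vector along the direction of largest coordinate spread.

    coords: (n_atoms, 3) array, e.g. the C-alpha atoms of one monomer.
    Mass weighting is deliberately omitted in this sketch.
    """
    x = np.asarray(coords, dtype=float)
    centered = x - x.mean(axis=0)
    # Rows of vt are the principal directions, sorted by decreasing spread.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[0]

def inter_axis_angle_deg(coords_a, coords_b):
    """Angle (degrees, 0-90) between the principal axes of two monomers."""
    u, v = principal_axis(coords_a), principal_axis(coords_b)
    cos_angle = abs(float(np.dot(u, v)))   # sign of an axis is arbitrary
    return float(np.degrees(np.arccos(np.clip(cos_angle, 0.0, 1.0))))
```

Applied to the representative cluster structure of each variant, a decrease in this angle along a series of variants would indicate the kind of closing motion described above.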

Next, we addressed the question of which residues contribute the most to the global stabilization energy. Hence, the non-bonded energy profile associated with the eigenvector with the lowest eigenvalue was selected and analyzed (see Methods and references 18,19).

The first eigenvector profiles shown in figure 3 report on the contribution of each residue to the stabilization energy; in other words, the relative intensities of the eigenvector components describe the distribution of stabilizing interactions in the 3D structures of Re-LEH, Tomsk-LEH and CH55-LEH. It must be noted that, since the energy fingerprint of the three Re-LEH variants is not affected by the introduction of the mutations (see S5), only the Re-LEH profile is reported and discussed for clarity; moreover, to compare the main folding determinants, the energy profiles have been realigned following sequence alignment (Re-LEH numbering). Interestingly, the residues that contribute the most to the energetic stabilization (red circles, peaks 1 and 3) belong to the same protein sites in each of the three native hydrolases and identify a first group of hotspots. Inspection of these sites reveals that these peaks identify two charged residues (an arginine and a glutamate) for each monomer that are conserved in Re-LEH, Tomsk-LEH and CH55-LEH (GLU98-ARG131, GLU77-ARG110 and GLU79-ARG111 in Re-LEH, Tomsk-LEH and CH55-LEH, respectively). These residues are close to, but do not belong to, the active-site pocket and stabilize the dimer interface by forming a salt bridge with the oppositely charged residue on the opposite monomer; for example, Tomsk-LEH monomer-A GLU77 and ARG110 bridge monomer-B ARG110 and GLU77, respectively (see figure 3).


Figure 3. Left. Components (in absolute value) of the eigenvectors associated with the lowest eigenvalue of the Enb matrices. Because of dimeric symmetry, the graph reports only the first-monomer profiles; see SI for the whole dimer profile. Right. Common stabilizing hotspots identified by means of EDM (Tomsk-LEH numbering).

The orange circle (peak 2) in figure 3 pinpoints a second stabilizing region shared by the three LEHs. This site corresponds to a hydrophobic segment from β5, lying at the dimeric interface (orange in figure 3). It is worth noting that both groups of stabilizing hotspots underlie inter-monomer interactions, suggesting that the dimer interface is the key structural element determining the energetic stabilization of the active state of the enzyme, in agreement with recently published experimental data.23

Interestingly, while the three systems exploit the common hydrophobic hotspot core to stabilize the dimer, Re-LEH lacks two important key interactions (table 1). Indeed, both Tomsk-LEH and CH55-LEH use a negatively charged amino acid (GLU99 and ASP101 in Tomsk-LEH and CH55-LEH, respectively) to create an important intra-monomer salt bridge that binds the hotspot arginine (see figure 3). This contributes to connecting β5 and β6. Furthermore, Tomsk-LEH ARG93 stably links the β5 and α4 domains through an electrostatic interaction with GLU117. Similarly, CH55-LEH ARG95 stabilizes the dimeric interface by binding an aspartate on β1 of the opposite monomer.

In contrast, in Re-LEH the corresponding amino acid positions are both occupied by uncharged residues. This is reflected in the decreased contributions of SER115 and GLN121 to the global energy stabilization with respect to the charged residues in Tomsk-LEH and CH55-LEH (see table 1). In the framework of the EDM method, this difference can aptly be considered to fine-tune the different stability properties in the three LEHs, with Re-LEH being the least thermally stable.

Table 1. Hotspot residues corresponding to the second group of mutants. Orange selection indicates hydrophobic amino acids.

Tomsk-LEH                Re-LEH                   CH55-LEH
Residue   Contribution   Residue   Contribution   Residue   Contribution
ARG93     0.14           SER115    0.08           ARG95     0.14
VAL94     0.14           ILE116    0.13           VAL96     0.14
MET95     0.12           LEU117    0.12           MET97     0.13
GLY96     0.09           GLY118    0.11           GLY98     0.10
THR97     0.13           VAL119    0.11           ALA99     0.10
PHE98     0.12           PHE120    0.07           PHE100    0.10
GLU99     0.14           GLN121    0.04           ASP101    0.11
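The comparison drawn from Table 1 can be made explicit with a short sketch that subtracts the Re-LEH contributions from the Tomsk-LEH ones position by position. The values are transcribed from the table; treating the table rows as structurally aligned positions is an assumption made here for illustration, and the helper name `contribution_gap` is hypothetical.

```python
# Per-residue contributions transcribed from Table 1, listed row by row.
TABLE_1 = {
    "Tomsk-LEH": [("ARG93", 0.14), ("VAL94", 0.14), ("MET95", 0.12), ("GLY96", 0.09),
                  ("THR97", 0.13), ("PHE98", 0.12), ("GLU99", 0.14)],
    "Re-LEH": [("SER115", 0.08), ("ILE116", 0.13), ("LEU117", 0.12), ("GLY118", 0.11),
               ("VAL119", 0.11), ("PHE120", 0.07), ("GLN121", 0.04)],
    "CH55-LEH": [("ARG95", 0.14), ("VAL96", 0.14), ("MET97", 0.13), ("GLY98", 0.10),
                 ("ALA99", 0.10), ("PHE100", 0.10), ("ASP101", 0.11)],
}

def contribution_gap(reference, other):
    """Difference (reference - other) at each row, rounded to two decimals."""
    return [
        (ref_res, oth_res, round(ref_c - oth_c, 2))
        for (ref_res, ref_c), (oth_res, oth_c) in zip(TABLE_1[reference], TABLE_1[other])
    ]

gaps = contribution_gap("Tomsk-LEH", "Re-LEH")
# The two rows where Re-LEH loses the most contribution relative to Tomsk-LEH.
largest = sorted(gaps, key=lambda g: g[2], reverse=True)[:2]
for ref_res, oth_res, gap in largest:
    print(ref_res, oth_res, gap)
```

Under this row-wise pairing, the two largest gaps fall on the charged-to-uncharged substitutions (GLU99/GLN121 and ARG93/SER115), in line with the discussion of the missing salt bridges in Re-LEH.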

Overall, the data show that a limited number of key stabilization determinants, located in common regions, can provide a framework that supports a common structural organization. Once the determinants of a stable fold are in place to guarantee structural stability, the remainder of the sequence can be modulated in response to different evolutionary pressures.

Modification of structural determinants. The effect on protein stability of the energy determinants identified by the EDM analyses was probed by point mutations. Two sets of mutants were designed and experimentally expressed. The properties of the resulting enzymes were tested by Circular Dichroism (CD) analysis and by evaluating their specific activities toward cyclohexene oxide.¹⁶ The first group of mutants was expected to destabilize the protein by perturbing the fundamental GLU77-ARG110 salt bridge, either by side-chain trimming (E77A and R110A) or by salt-bridge charge inversion (E77R_R110E). The latter allowed us to investigate the local effect of the charge inversion while the inter-monomer salt bridge was retained. From now on, Tomsk-LEH numbering will be used for clarity.

A second set of mutants comprised two new LEHs in which the stabilizing contribution of ARG93 was altered by mutating GLU117: deletion of the interaction was probed in the E117A mutant, while the effect of shortening the negatively charged side chain and reducing its conformational mobility was tested in E117D. We focused on perturbing the ARG93-GLU117 salt bridge because this electrostatic stabilizing interaction is missing in Re-LEH, as discussed above. Thus, a possible correlation between the presence of this distinctive determinant and the different thermostabilities could be investigated.

Finally, the contribution of MET95 was also probed by point mutation. MET95 is located at the interface between the two monomers and is conserved in both Tomsk-LEH and CH55-LEH but missing in Re-LEH, suggesting a possible role as a distinctive modulator. The M95C mutant was therefore expressed to test the perturbative effect on interface stability obtained by altering the local hydrophobicity and hydrogen-bond network. It is worth underlining that the formation of a disulfide bridge between the two CYS residues facing each other on opposite monomers was outside the scope of our investigation, and its presence was excluded by SDS-PAGE analysis (see SI).
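The mutant labels used in this section follow the usual wild-type/position/replacement convention, with an underscore joining double mutants. As a minimal sketch, the labels can be parsed and each substitution classified by its effect on formal side-chain charge. The charge table (D and E negative, K and R positive, everything else neutral) and the category names are simplifying assumptions introduced here, not definitions taken from the paper.

```python
import re

# Assumed formal side-chain charges near neutral pH; all other residues count as neutral.
CHARGE = {"D": -1, "E": -1, "K": 1, "R": 1}

def parse_mutant(name):
    """Split a label such as 'E77R_R110E' into (wild_type, position, mutant) tuples."""
    substitutions = []
    for token in name.split("_"):
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", token)
        if match is None:
            raise ValueError(f"unrecognized substitution: {token!r}")
        wild_type, position, mutant = match.groups()
        substitutions.append((wild_type, int(position), mutant))
    return substitutions

def classify(substitution):
    """Describe how a single substitution changes the formal charge."""
    wild_type, _, mutant = substitution
    q_wt, q_mut = CHARGE.get(wild_type, 0), CHARGE.get(mutant, 0)
    if q_wt != 0 and q_mut == -q_wt:
        return "charge inversion"
    if q_wt != 0 and q_mut == 0:
        return "charge removal"
    if q_wt == q_mut:
        return "charge unchanged"
    return "charge gain"

for label in ["E77A", "R110A", "E77R_R110E", "E117A", "E117D", "M95C"]:
    print(label, [classify(s) for s in parse_mutant(label)])
```

With these assumptions, E77A, R110A and E117A remove a charge, E77R_R110E swaps the two charges of the salt bridge, and E117D and M95C leave the formal charge unchanged, which mirrors the design rationale given above.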

In this scenario, Tomsk-LEH was chosen as the reference system in which to express and test all mutations. Re-LEH did not offer a representative test case because of the particular mobility of its N-terminal tails and the lack of the energy modulators shown by the other LEHs. Among the designed mutants, E77A and E77R_R110E were successfully expressed in E. coli. Mutation R110A resulted in negligible expression levels, which may suggest a dramatic impact on the stability of the resulting protein (see figure S9). Wild-type (wt), E77A and E77R_R110E were characterized by comparative CD analysis in both the far- and near-UV regions to assess possible distinctive features in their secondary and tertiary structure, respectively. Likewise, thermal stability was evaluated by recording melting temperature (Tm) curves and monitoring the unfolding of the secondary (222 nm) and tertiary (296 nm) structure.
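How a Tm is read from such a curve is not detailed in this section. One common simplification, sketched below, treats unfolding as a two-state transition, normalizes the CD signal between its first and last points, and takes Tm as the temperature at which the normalized signal crosses 0.5. The synthetic curve, the function names and the linear interpolation are all assumptions for illustration, not the authors' fitting procedure.

```python
import math

def melting_midpoint(temperatures, signal):
    """Estimate Tm as the temperature where the normalized signal crosses 0.5."""
    low, high = signal[0], signal[-1]
    if high == low:
        raise ValueError("flat curve: no transition to analyze")
    fraction = [(s - low) / (high - low) for s in signal]
    for i in range(1, len(fraction)):
        f0, f1 = fraction[i - 1], fraction[i]
        if f0 < 0.5 <= f1:
            t0, t1 = temperatures[i - 1], temperatures[i]
            # Linear interpolation between the two bracketing points.
            return t0 + (0.5 - f0) * (t1 - t0) / (f1 - f0)
    raise ValueError("no transition found")

def synthetic_curve(tm, width=2.0, folded=-20.0, unfolded=-5.0):
    """Invented two-state melting curve sampled every 2 degrees from 20 to 90."""
    temps = list(range(20, 92, 2))
    signal = [folded + (unfolded - folded) / (1.0 + math.exp((tm - t) / width))
              for t in temps]
    return temps, signal

temps, signal = synthetic_curve(tm=55.0)
tm_estimate = melting_midpoint(temps, signal)
print(round(tm_estimate, 1))
```

A sketch like this only works when the curve shows a clear sigmoidal transition; noisy or featureless curves need a proper fit or, as in some of the cases reported below, cannot be assigned a Tm at all.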

CD analysis revealed that E77A retained a secondary and tertiary structure similar to that of the wt (figure S10), as the CD spectra in both the far- and near-UV regions substantially overlap. In contrast, the CD spectra of the E77R_R110E mutant suggested that a structural perturbation had occurred (figure S11). In this case, while the charge inversion preserves the global electrostatics between the two interfaces, at the intramonomer level the switch to an opposite charge perturbs local networks by introducing repulsive interactions. In particular, wt ARG110 is stably engaged in an intramonomer interaction with the hotspot residue GLU99. Thus, the E77R_R110E mutation implies both the breaking of this stabilizing interaction between two hotspots and the introduction of a local excess of negative charge.

As hypothesized, mutations E77A and E77R_R110E significantly altered protein stability: E77A and E77R_R110E showed Tm values of 52.8 °C and 55.3 °C (near UV), respectively, significantly lower than the wt Tm of 70 °C (table 2). Interestingly, while the Tm of the wt was clearly detectable at 222 nm (69.9 °C), there was no clear transition in the denaturation curves of the mutants at this wavelength, possibly because of their more flexible or less compact structure.

Furthermore, the mutants' specific activity toward cyclohexene oxide was significantly lower than that of the wt (table 2). Hence, corroborating the computational analyses, the hotspot determinants turned out to be fundamental for enzyme stability, and the perturbation of this property was detrimental to biological functionality.

Table 2. Tm of Tomsk-LEH wt and mutants determined at 222 nm and 296 nm, and specific activity toward cyclohexene oxide. The specific activities were determined at 20 °C.

Enzyme          Tm@222 nm (°C)   Tm@296 nm (°C)   Specific activity (mU/mg)
Tomsk-LEH wt    69.9             70.0             320
E77A            n.d.             52.8             164
E77R_R110E      n.d.             55.3             140
E117A           52.9             54.7             188
E117D           51.8             55.2             274
M95C            60.6             58.6             310
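To make the table easier to read at a glance, the sketch below expresses each mutant relative to the wild type: the shift in near-UV Tm (296 nm, the only wavelength with a value for every enzyme) and the residual specific activity as a percentage. The numbers are transcribed from Table 2; the helper name `relative_to_wt` and the choice of the 296 nm column are assumptions made for this illustration.

```python
# (Tm at 222 nm, Tm at 296 nm, specific activity in mU/mg); None stands for n.d.
TABLE_2 = {
    "wt": (69.9, 70.0, 320),
    "E77A": (None, 52.8, 164),
    "E77R_R110E": (None, 55.3, 140),
    "E117A": (52.9, 54.7, 188),
    "E117D": (51.8, 55.2, 274),
    "M95C": (60.6, 58.6, 310),
}

def relative_to_wt(name):
    """Return (Tm shift at 296 nm in degrees C, residual activity in % of wt)."""
    _, tm_wt, activity_wt = TABLE_2["wt"]
    _, tm, activity = TABLE_2[name]
    return round(tm - tm_wt, 1), round(100.0 * activity / activity_wt)

for name in TABLE_2:
    if name != "wt":
        delta_tm, residual = relative_to_wt(name)
        print(f"{name}: dTm = {delta_tm} C, activity = {residual}% of wt")
```

Read this way, every mutant loses more than 11 °C of near-UV Tm, while the retained activity ranges from roughly half of the wild-type value or less for the first group to nearly the full wild-type value for M95C.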

All the other designed mutants (E117A, E117D and M95C) were successfully expressed in E. coli (figure S9). CD characterization showed that these mutations did not significantly perturb the secondary structure of the enzyme (figures S13, S14 and S15), whereas a different impact could be noticed on the tertiary structure. Indeed, E117A and E117D showed increased absorption intensities compared to the wt in the near-UV spectra, consistent with a structural perturbation. A similar, though less pronounced, trend was observed for the M95C mutation. This may reflect an increased tendency to aggregate, due to the destabilization of the dimeric form. This observation agrees with the measured Tm values for these mutants: both E117A and E117D showed a lower Tm than the wt, whereas M95C showed intermediate behavior.

n.d. = not detected.

Finally, the specific activity toward cyclohexene oxide was tested (see table 2). Interestingly, the effects of the mutations were almost negligible for M95C and E117D, which could be explained by the fact that these mutations only partially alter the chemical interactions with the surrounding region (essentially by shortening the side chain in both cases). In contrast, E117A showed a lower activity than the other mutants of its group, yet a higher activity than the first group of mutants. In general, these data point toward a possible correlation between thermostability (i.e., Tm) and enzymatic efficiency. Even if the subset considered is too small to be statistically significant, the computed correlation coefficient (0.7) suggests a qualitative trend linking the mutation-induced modulation of thermostability and enzymatic activity. This result could to some extent be expected: in general, destabilized mutants tend to be less efficient catalysts than wild-type molecules. The inactivation may in fact be linked to the disruption of the active-site pre-organization due to structural instabilities. Alternatively, one may hypothesize a coupling between folding and reaction coordinates that may lead to reactant-state destabilization and transition-state stabilization in the direction of folding and along the reaction coordinate, respectively, as previously proposed by Åqvist and coworkers.²⁶ In this framework, perturbing the coupling by destabilizing mutations may also have a negative effect on reactivity. To verify such a mechanistic model, however, approaches different from the one presented here should be used.
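The text does not say which Tm column or which statistic was used for the correlation. As a hedged check, the sketch below computes a Pearson coefficient between the near-UV Tm values (296 nm, available for all six enzymes) and the specific activities from Table 2. Both choices are assumptions; with them, the coefficient comes out at approximately 0.7.

```python
import math

# Values transcribed from Table 2, in the order wt, E77A, E77R_R110E, E117A, E117D, M95C.
TM_296 = [70.0, 52.8, 55.3, 54.7, 55.2, 58.6]      # degrees C
ACTIVITY = [320, 164, 140, 188, 274, 310]          # mU/mg

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

r = pearson(TM_296, ACTIVITY)
print(round(r, 2))
```

With only six points, dominated by the wild type at the high-Tm end, such a coefficient supports a qualitative trend at most, as already noted above.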

The effects of mutations in the enzymes studied here can also be viewed in terms of the concepts of fold polarity and innovability introduced by Tawfik and coworkers:¹⁰,²⁷ in their model, the key to innovability (i.e., the ability to support mutations that introduce new functions) is fold polarity, whereby the active site is composed of flexible, loosely packed loops that coexist with a well-separated, highly ordered scaffold. The latter provides the necessary stability to withstand mutations while the active-site loops maintain their conformational plasticity, which may promote the acquisition/optimization of functions.

In the LEH structures, much like in the case of dihydrofolate reductase discussed by Tawfik, the active-site residues are highly coordinated with the rest of the protein in general and with the most relevant stabilizing hotspots in particular (see S18). In this view, the presence of a diffuse stabilization nucleus on the one hand favors adaptation to higher temperatures, but on the other hand opposes the accumulation of mutations resulting in new functions, e.g., in enzymes that are functional at temperatures lower or higher than the ones for which they naturally evolved. Janssen and coworkers were able to overcome this hurdle by using carefully designed disulfide bridges.¹⁷,²³

One potentially interesting aspect of the results reported in table 2 is that the more thermostable enzyme shows higher activity at 20 °C than the less stable designed mutants. This is in line with previous observations²⁸⁻³¹ indicating that enhanced stability does not necessarily correlate with reduced activity.

It is conceivable that the methods used in laboratory evolution, based on several rounds of “targeted” selection and mutation of proteins with a defined activity,³² may in some cases favor molecules that conserve a certain reactivity in the background while stability is being improved. In contrast, natural evolutionary pressure requires activity to be present only at the temperature at which a given organism needs to survive, discarding all other candidates that are not stable enough even though they may maintain a certain reactivity.

In conclusion, we have shown that the perturbation of predicted hotspots, shared by three proteins with a low degree of homology and identified through an unbiased computational energy-based analysis that does not require previous knowledge of residue conservation, may affect correct folding, stability and biological reactivity, providing rational information on potential ways to perturb stability-activity relationships. Our approach could represent a valid complement to other methods, such as those based on disulfide-bridge stabilization.

Generalization to a large set of homologues with disparate functions. Finally, we extended our considerations by analyzing a number of structural homologues of the enzymes studied herein. A Dali search³³ of the Protein Data Bank (PDB) using Re-LEH monomer A as probe returned several structural homologues. Figure 4 shows the top ten hits according to Dali score.

Figure 4. Structural alignment obtained using Re-LEH monomer A as probe. Red and orange squares pinpoint hotspot residues corresponding to the first and second group of mutants, respectively.

This group consists of different unidentified proteins expressed in several bacterial strains and two digoxigenin-binding proteins (4j8t and 5bvb). The structural alignment reveals that GLU77 and ARG110 (red squares) are conserved in 9 and 6 structures out of 10, respectively. This suggests that different proteins with different functions could exploit a common stabilization mechanism, a view further corroborated by the very low sequence identity among these proteins (from 14 to 33%). Hotspot residues corresponding to the more hydrophobic stabilization determinants of the interface (orange squares) show lower sequence conservation among the different structures. This could potentially reflect their accessory role in protein stabilization. However, it is interesting to note that the high hydrophobic character is conserved in all cases, and 8 homologues out of 10 share at least one charged residue on one edge of the hydrophobic segment (5 of them have charged residues at both ends). These data support the hypothesis that evolution generates functional diversity by local sequence variation, while preserving a common, efficient stabilization strategy driven by a limited number of shared structural determinants.
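The counts quoted above (conservation of a residue across hits, pairwise sequence identity, charged residues at the edges of a hydrophobic segment) are simple column-wise statistics on a structural alignment. The sketch below shows how they could be computed; the aligned segments are invented placeholders, not the sequences of figure 4, and all function names are hypothetical.

```python
CHARGED = set("DEKR")

# Hypothetical aligned segments (invented for illustration; '-' would mark a gap).
TOY_ALIGNMENT = {
    "probe": "RVMGTFE",
    "hit_1": "RIMGVFD",
    "hit_2": "SILGVFQ",
    "hit_3": "KVLAAFE",
}

def column_conservation(alignment, reference, column):
    """Count how many non-reference sequences share the reference residue at a column."""
    ref_residue = alignment[reference][column]
    others = [seq for name, seq in alignment.items() if name != reference]
    matches = sum(1 for seq in others if seq[column] == ref_residue)
    return matches, len(others)

def percent_identity(seq_a, seq_b):
    """Percentage of identical residues over aligned, gap-free positions."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    return 100.0 * sum(a == b for a, b in pairs) / len(pairs)

def charged_edges(segment):
    """Number of charged residues (0, 1 or 2) at the two ends of a segment."""
    return sum(residue in CHARGED for residue in (segment[0], segment[-1]))

matches, total = column_conservation(TOY_ALIGNMENT, "probe", 0)
print(f"first column conserved in {matches} of {total} hits")
for name, seq in TOY_ALIGNMENT.items():
    print(name, charged_edges(seq))
```

Applied to a real alignment exported from a structure-comparison server, the same three functions would reproduce statements of the form "conserved in 9 out of 10 structures" or "charged residues at both ends".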

Conclusions

By means of novel computational methods and experimental procedures, we have identified the conserved protein regions that mold common properties in a subset of homologous enzymes, the limonene-1,2-epoxide hydrolases. We demonstrated that a limited number of stabilizing determinants, located in conserved segments, define a common and distinguishing energy signature in related proteins, despite a low percentage of sequence similarity. The shared determinants are found to have differential roles. The most conserved hotspots are essential for protein stability, whereas a second group of hotspots may play an accessory role, contributing to protein functional plasticity and showing a lower degree of sequence identity. In particular, we found that both groups of hotspot residues lie at the dimeric interfaces of the LEHs, which turn out to be the essential structural regulatory core of this subset of enzymes.

Despite this common feature, we disentangled the protein regions that possibly act as modulators, driving the mechanism of functional diversification in the three LEHs. Finally, we extended our conclusions to a set of structural homologues of the LEHs. We report that the observed hotspots are largely conserved in proteins sharing the LEH 3D fold, regardless of low sequence identity and functional promiscuity. Ultimately, our data point to an evolutionary strategy in which each 3D fold is characterized by a specific and distinctive energy fingerprint, governed by a few conserved structural determinants. Concurrently, functional diversity is ensured by local modulators that can be modified to rationally tune protein properties.

ASSOCIATED CONTENT

Supporting Information. This material is available free of charge via the Internet at http://pubs.acs.org.

AUTHOR INFORMATION

*Corresponding author: Giorgio Colombo, Istituto di Chimica del Riconoscimento Molecolare, CNR; via Mario Bianco 9, 20131 Milano, Italy. E-mail: [email protected]. Tel: +39-02-28500031, Fax: +39-02-28901239.

Author Contributions

The manuscript was written through contributions of all authors. All authors have given approval to the final version of the manuscript.

Funding Sources

The authors acknowledge funding from the project SusChemLombardia: prodotti e processi sostenibili per l’industria lombarda, Accordo Quadro Regione Lombardia-CNR, Proj. Nr. 18096.

ABBREVIATIONS

LEH, limonene-1,2-epoxide-hydrolase; Re-LEH, Rhodococcus erythropolis limonene-1,2-epoxide-hydrolase; Tomsk-LEH, Tomsk limonene-1,2-epoxide-hydrolase; CH55-LEH, CH55 limonene-1,2-epoxide-hydrolase; EH, epoxide hydrolase; PDB, Protein Data Bank; MD, Molecular Dynamics; DF, Distance Fluctuations; LF, Local Flexibility; EDM, Energy Decomposition Method; Enb, non-bonded interaction energy matrix; CD, Circular Dichroism; SDS-PAGE, Sodium Dodecyl Sulphate-PolyAcrylamide Gel Electrophoresis; wt, wild-type; Tm, melting temperature.

References

(1) Anfinsen, C. B.; Scheraga, H. A. Experimental and Theoretical Aspects of Protein Folding. Adv. Protein Chem. 1975, 29, 205–300.
(2) Koonin, E. V.; Wolf, Y. I.; Karev, G. P. The Structure of the Protein Universe and Genome Evolution. Nature 2002, 420, 218–223.
(3) Fowler, D. M.; Araya, C. L.; Fleishman, S. J.; Kellogg, E. H.; Stephany, J. J.; Baker, D.; Fields, S. High-Resolution Mapping of Protein Sequence-Function Relationships. Nat. Methods 2010, 7, 741–746.
(4) Gorbalenya, A. E.; Koonin, E. V. Helicases: Amino Acid Sequence Comparisons and Structure-Function Relationships. Curr. Opin. Struct. Biol. 1993, 3, 419–429.
(5) Cygler, M.; Schrag, J. D.; Sussman, J. L.; Harel, M.; Silman, I.; Gentry, M. K.; Doctor, B. P. Relationship between Sequence Conservation and Three-Dimensional Structure in a Large Family of Esterases, Lipases, and Related Proteins. Protein Sci. 2008, 2, 366–382.

(6) Shoichet, B. K.; Baase, W. A.; Kuroki, R.; Matthews, B. W. A Relationship between Protein Stability and Protein Function. Proc. Natl. Acad. Sci. 1995, 92, 452–456.
(7) Nussinov, R.; Tsai, C.-J.; Liu, J. Principles of Allosteric Interactions in Cell Signaling. J. Am. Chem. Soc. 2014, 136, 17692–17701.
(8) Ashenberg, O.; Gong, L. I.; Bloom, J. D. Mutational Effects on Stability Are Largely Conserved during Protein Evolution. Proc. Natl. Acad. Sci. 2013, 110, 21071–21076.
(9) Bloom, J. D.; Labthavikul, S. T.; Otey, C. R.; Arnold, F. H. Protein Stability Promotes Evolvability. Proc. Natl. Acad. Sci. 2006, 103, 5869–5874.
(10) Tóth-Petróczy, Á.; Tawfik, D. S. The Robustness and Innovability of Protein Folds. Curr. Opin. Struct. Biol. 2014, 26, 131–138.
(11) Beadle, B. M.; Shoichet, B. K. Structural Bases of Stability–Function Tradeoffs in Enzymes. J. Mol. Biol. 2002, 321, 285–296.

(12) Widersten, M.; Gurell, A.; Lindberg, D. Structure–Function Relationships of Epoxide Hydrolases and Their Potential Use in Biocatalysis. Biochim. Biophys. Acta - Gen. Subj. 2010, 1800, 316–326.
(13) Nardini, M.; Dijkstra, B. W. α/β Hydrolase Fold Enzymes: The Family Keeps Growing. Curr. Opin. Struct. Biol. 1999, 9, 732–737.
(14) Arand, M.; Hallberg, B. M.; Zou, J.; Bergfors, T.; Oesch, F.; van der Werf, M. J.; de Bont, J. A. M.; Jones, T. A.; Mowbray, S. L. Structure of Rhodococcus Erythropolis Limonene-1,2-Epoxide Hydrolase Reveals a Novel Active Site. EMBO J. 2003, 22, 2583–2592.

(15) van der Werf, M. J.; Orru, R. V. A.; Overkamp, K. M.; Swarts, H. J.; Osprian, I.; Steinreiber, A.; de Bont, J. A. M.; Faber, K. Substrate Specificity and Stereospecificity of Limonene-1,2-Epoxide Hydrolase from Rhodococcus Erythropolis DCL14; an Enzyme Showing Sequential and Enantioconvergent Substrate Conversion. Appl. Microbiol. Biotechnol. 1999.

(16) Ferrandi, E. E.; Sayer, C.; Isupov, M. N.; Annovazzi, C.; Marchesi, C.; Iacobone, G.; Peng, X.; Bonch-Osmolovskaya, E.; Wohlgemuth, R.; Littlechild, J. A.; et al. Discovery and Characterization of Thermophilic Limonene-1,2-Epoxide Hydrolases from Hot Spring Metagenomic Libraries. FEBS J. 2015, 282, 2879–2894.
(17) Floor, R. J.; Wijma, H. J.; Jekel, P. A.; Terwisscha van Scheltinga, A. C.; Dijkstra, B. W.; Janssen, D. B. X-Ray Crystallographic Validation of Structure Predictions Used in Computational Design for Protein Stabilization. Proteins Struct. Funct. Bioinforma. 2015.
(18) Genoni, A.; Morra, G.; Colombo, G. Identification of Domains in Protein Structures from the Analysis of Intramolecular Interactions. J. Phys. Chem. B 2012, 116, 3331–3343.
(19) Morra, G.; Colombo, G. Relationship between Energy Distribution and Fold Stability: Insights from Molecular Dynamics Simulations of Native and Mutant Proteins. Proteins Struct. Funct. Bioinforma. 2008, 72, 660–672.
(20) Morra, G.; Verkhivker, G.; Colombo, G. Modeling Signal Propagation Mechanisms and Ligand-Based Conformational Dynamics of the Hsp90 Molecular Chaperone Full-Length Dimer. PLoS Comput. Biol. 2009, 5, e1000323.

(21) Hornak, V.; Abel, R.; Okur, A.; Strockbine, B.; Roitberg, A.; Simmerling, C. Comparison of Multiple Amber Force Fields and Development of Improved Protein Backbone Parameters. Proteins Struct. Funct. Bioinforma. 2006, 65, 712–725.
(22) Scarabelli, G.; Morra, G.; Colombo, G. Predicting Interaction Sites from the Energetics of Isolated Proteins: A New Approach to Epitope Mapping. Biophys. J. 2010, 98, 1966–1975.

(23) Wijma, H. J.; Floor, R. J.; Jekel, P. A.; Baker, D.; Marrink, S. J.; Janssen, D. B. Computationally Designed Libraries for Rapid Enzyme Stabilization. Protein Eng. Des. Sel. 2014.
(24) Ragona, L.; Colombo, G.; Catalano, M.; Molinari, H. Determinants of Protein Stability and Folding: Comparative Analysis of Beta-Lactoglobulins and Liver Basic Fatty Acid Binding Protein. Proteins Struct. Funct. Bioinforma. 2005, 61, 366–376.
(25) Ye, Y.; Godzik, A. Multiple Flexible Structure Alignment Using Partial Order Graphs. Bioinformatics 2005, 21, 2362–2369.
(26) Wallin, G.; Härd, T.; Åqvist, J. Folding-Reaction Coupling in a Self-Cleaving Protein. J. Chem. Theory Comput. 2012, 8, 3871–3879.
(27) Dellus-Gur, E.; Toth-Petroczy, A.; Elias, M.; Tawfik, D. S. What Makes a Protein Fold Amenable to Functional Innovation? Fold Polarity and Stability Trade-Offs. J. Mol. Biol. 2013, 425, 2609–2621.

(28) Colombo, G.; Merz, K. M., Jr. Stability and Activity of Mesophilic Subtilisin E and Its Thermophilic Homolog: Insights from Molecular Dynamics Simulations. 1999.
(29) Wintrode, P. L.; Zhang, D.; Vaidehi, N.; Arnold, F. H.; Goddard, W. A. Protein Dynamics in a Family of Laboratory Evolved Thermophilic Enzymes. J. Mol. Biol. 2003, 327, 745–757.

(30) LeMaster, D. M.; Tang, J.; Paredes, D. I.; Hernández, G. Enhanced Thermal Stability Achieved without Increased Conformational Rigidity at Physiological Temperatures: Spatial Propagation of Differential Flexibility in Rubredoxin Hybrids. Proteins Struct. Funct. Bioinforma. 2005, 61, 608–616.
(31) Wu, B.; Wijma, H. J.; Song, L.; Rozeboom, H. J.; Poloni, C.; Tian, Y.; Arif, M. I.; Nuijens, T.; Quaedflieg, P. J. L. M.; Szymanski, W.; et al. Versatile Peptide C-Terminal Functionalization via a Computationally Engineered Peptide Amidase. ACS Catal. 2016, 6, 5405–5414.
(32) Arnold, F. H. Design by Directed Evolution. 1998.
(33) Holm, L.; Rosenström, P. Dali Server: Conservation Mapping in 3D. Nucleic Acids Res. 2010, 38 (Web Server issue), W545–W549.
