Utility of B-Factors in Protein Science: Interpreting Rigidity, Flexibility

Review pubs.acs.org/CR

Cite This: Chem. Rev. XXXX, XXX, XXX−XXX

Utility of B‑Factors in Protein Science: Interpreting Rigidity, Flexibility, and Internal Motion and Engineering Thermostability

Zhoutong Sun,*,† Qian Liu,‡ Ge Qu,† Yan Feng,*,‡ and Manfred T. Reetz*,†,§,∥

† Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, 32 West Seventh Avenue, Tianjin Airport Economic Area, Tianjin 300308, China
‡ State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
§ Max-Planck-Institut für Kohlenforschung, Kaiser-Wilhelm-Platz 1, 45470 Mülheim an der Ruhr, Germany
∥ Chemistry Department, Philipps-University, Hans-Meerwein-Strasse 4, 35032 Marburg, Germany

ABSTRACT: The term B-factor, sometimes called the Debye−Waller factor, temperature factor, or atomic displacement parameter, is used in protein crystallography to describe the attenuation of X-ray or neutron scattering caused by thermal motion. This review begins with analyses of early protein studies which suggested that B-factors, available from the Protein Data Bank, can be used to identify the flexibility of atoms, side chains, or even whole regions. This requires a technique for obtaining normalized B-factors. Since then the exploitation of B-factors has been extensively elaborated and applied in a variety of studies with quite different goals, all having in common the identification and interpretation of rigidity, flexibility, and/or internal motion, which are crucial in enzymes and in proteins in general. Importantly, this review includes a discussion of limitations and possible pitfalls when using B-factors. A second research area, which likewise exploits B-factors, is also reviewed, namely, the development of the so-called B-FIT-directed evolution method for increasing the thermostability of enzymes as catalysts in organic chemistry and biotechnology. In both research areas, a maximum of structural and mechanistic insights is gained when B-factor analyses are combined with other experimental and computational techniques.

CONTENTS
1. Introduction
1.1. Brief Historical Outline
1.2. Definition of B-Factor and Early Applications in Protein Science
1.3. Limitations and Possible Pitfalls When Applying B-Factors in Protein Science
2. Identifying and Interpreting Rigidity, Flexibility, and Dynamics in Proteins with the Help of B-Factors
2.1. Identifying the Active Conformation of the Apo RORγt Nuclear Receptor
2.2. Understanding Flexibility and Thermal Lability of the Cε3 Domain in Human Immunoglobulin E (IgE)
2.3. Studying the Dynamic Allosteric Effect in αIIbβ3 Integrin
2.4. Understanding Increased Activity of a Mutant Alcohol Dehydrogenase Evolved for Reversed Stereoselectivity
2.5. Compilation of Further Recent Studies Utilizing B-Factors for Identifying and Interpreting Protein Flexibility and Internal Motion

3. Utilization of B-Factors for Engineering Enzyme Thermostability and Robustness toward Hostile Solvents
3.1. Short Overview of Protein Engineering Methods
3.2. Initial Case Studies of the B-FIT Approach to Protein Thermostabilization
3.2.1. Lipase from B. subtilis (Lip A)
3.2.2. Epoxide Hydrolase from A. niger (ANEH)
3.3. Further B-FIT Studies Incorporating Additional Techniques
3.3.1. B-FIT Supported by PoPMuSiC: Type A Feruloyl Esterase (AuFaeA) from Aspergillus usamii
3.3.2. B-FIT Supported by Consensus Technique and PoPMuSiC: endo-1,4-β-Galactanase from Talaromyces stipitatus (TSGAL)
3.3.3. B-FIT Supported by Optimal Expression Host: Endoglucanase I from Trichoderma reesei (TrEGI)


Received: May 5, 2018

© XXXX American Chemical Society


DOI: 10.1021/acs.chemrev.8b00290 Chem. Rev. XXXX, XXX, XXX−XXX

3.3.4. B-FIT Supported by Combining Mutations Followed by ISM: Maize Endosperm ADP-Glucose Pyrophosphorylase (AGPase)
3.3.5. One-Step Combined B-FIT SM and Focused epPCR As Applied to a Cold-Active Xylanase
3.3.6. Simultaneous Multiparameter Directed Evolution for Enhanced Thermostability and Stereoselectivity with Maintained Activity
3.3.7. B-FIT as a Means to Increase Robustness toward Hostile Solvents
3.3.8. Rare Case of Comparing Different Protein Engineering Strategies for Stabilizing an Enzyme
3.3.9. Active Center Stabilization (ACS) Guided by Local B-Factors near Binding Pocket
3.3.10. Compilation of Selected B-FIT Studies
3.3.11. Going the Other Way: Directed Evolution of Protein Lability on the Basis of Low B-Factors
4. Conclusions and Perspectives
Author Information
Corresponding Authors
ORCID
Notes
Biographies
Acknowledgments
References

1.2. Definition of B-Factor and Early Applications in Protein Science

The B-factor (or, more generally, the Debye−Waller factor) describes the attenuation of X-ray scattering or coherent neutron scattering caused by thermal motion.1,2 The B-factor (sometimes called the temperature factor or atomic displacement parameter)3−5 is defined according to eq 1

$$B = 8\pi^2 \langle u^2 \rangle \tag{1}$$

where ⟨u²⟩ is the mean-square displacement of a scattering center, with u measured in ångströms. As summarized by T. E. Creighton in a 1993 monograph written on the basis of numerous previous studies,79 B-factors refer to a decrease of intensity in diffraction as a result of two different phenomena: dynamic disorder caused by the temperature-dependent vibration of the atoms, and static disorder. In many earlier and later studies, scientists used B-values by focusing only on the Cα atoms of the amino acids in proteins, whose B-values were believed to correlate with motion of the backbone.6,7,80 The data of every X-ray structure deposited in the Protein Data Bank (PDB) include a B-factor for all atoms except hydrogen.

Several early studies suggested that B-factors can be used to identify flexibility in proteins, proposing that high B-factors indicate higher than average flexibility, whereas low B-factors were believed to occur at more rigid positions.6,8−12 A seminal example was published by P. A. Karplus and G. E. Schulz, who focused on chain flexibility in linear B-cell epitopes by normalizing and analyzing the previously reported B-values at each Cα atom of a given protein.6 The resulting Bnorm-values were obtained from 31 refined protein structures, and the data subsequently served as a guide for establishing flexibility profiles which proved to be useful in the selection of peptide antigens. The authors also established an average relationship between B-value and type of amino acid.6 Shortly thereafter, M. Vihinen proposed and discussed in depth the correlation between protein rigidity and thermostability,8 which was later generalized. Another important contribution was made by S. Parthasarathy and M. R. N. Murthy, who analyzed the temperature factor distribution in a number of high-resolution protein structures and emphasized the importance of using normalized B-factors, which they called B′-factors.81 Later they extended this work by utilizing normalized B-factors in the analysis of the reasons why thermostable proteins are so robust9 (see introductory information in section 3).

In yet another early study, M. Karplus and co-workers performed molecular dynamics (MD) calculations in order to reveal the dynamics of CO-myoglobin at 80 and 325 K.12 At low temperature, a large disorder contribution to B-factors was found.5,12 Temperature effects need to be considered in general. In a different early contribution, D. E. Tronrud emphasized the use of knowledge-based B-factor restraints for protein refinement, even if only low-resolution crystallographic data are available.82 However, this has not been applied widely. Section 1.3 treats general limitations and possible pitfalls when applying B-factors. The monograph by N. E. Chayen, J. R. Helliwell, and E. H. Snell describes particularly well the interrelationship between molecular flexibility in crystals on one hand and inherent crystal effects on the other, which needs to be considered when interpreting B-factors.83
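To give a feel for the magnitudes implied by eq 1, the relation can be inverted to obtain the root-mean-square displacement from an isotropic B-factor. The following is a minimal sketch; the function name `rms_displacement` and the three example B-values are illustrative choices and do not come from the studies discussed above.

```python
import math

def rms_displacement(b_factor: float) -> float:
    """Invert eq 1: return sqrt(<u^2>) in Å for an isotropic B-factor in Å^2."""
    if b_factor < 0:
        raise ValueError("B-factor must be non-negative")
    return math.sqrt(b_factor / (8.0 * math.pi ** 2))

# Illustrative values only (not taken from any particular structure).
for b in (10.0, 20.0, 100.0):
    print(f"B = {b:6.1f} Å^2 -> rms displacement = {rms_displacement(b):.2f} Å")
```

Under eq 1, a B-factor of 20 Å² corresponds to an rms displacement of about 0.50 Å, and 100 Å² to about 1.13 Å. Because refined B-factors also absorb static disorder, such numbers should be read as apparent displacements rather than as pure thermal amplitudes.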


1. INTRODUCTION

Following introductory background information,1−75 this review is composed of two parts. Section 2 highlights and analyzes selected recent examples of the use of B-factors for identifying and interpreting dynamic characteristics of proteins for a variety of different purposes. The second part (section 3) focuses on the utilization of B-factors in the so-called B-FIT76,77 directed evolution approach for enhancing the thermostability of enzymes as catalysts in biotechnology and organic chemistry.

1.1. Brief Historical Outline

In 1913 Peter Debye provided a theoretical treatment of X-ray scattering and thermal motion in solid materials,1 which was subsequently analyzed, criticized, and modified by several physicists at the time, including Ivar Waller, Kathleen Lonsdale, Max von Laue, William H. Bragg, and Erwin Schrödinger. In 1942 Max Born critically summarized the status of this research (in English).78 Debye originally assumed that atoms in a crystal oscillate independently about their equilibrium position and consequently applied classical statistics. Accordingly, a decrease in intensity was noted but no effect on the sharpness of the Laue spots or Bragg reflections. Waller proposed that the Debye method of using normal coordinates was incorrect and therefore replaced the temperature factor by its square. As time went on, the designation “Debye−Waller factor” became the prevalent convention in the chemistry and physics communities.2 When applied to proteins, “B-factor” became the generally accepted term.


A crucial question regarding biocatalysis is whether flexibility and rigidity as indicated by B-factors are merely inherent properties of an enzyme, possibly correlating with lability and stability, respectively, or whether they are also somehow related to activity. The utility of B-factors in this connection was demonstrated by several groups, e.g., by J. M. Thornton and co-workers in 2002.13 Shortly thereafter, Z. X. Wang and co-workers studied the frequency distributions of the normalized B-factor for the active site and nonactive site residues of 69 apo-enzymes.10 It was found that in all cases the active site residues occur mainly in regions of low B-factors, while the residues lining the binding pocket tend to exist in higher B-factor regions. Responses to this interesting finding appeared later. For example, B. Rost and co-workers provided support for this conclusion by a new B-factor analysis algorithm based on sequence prediction using the same data set of 69 apo-enzymes from the PDB;7 residues in the active site were found to have lower B-factors (less flexible) compared to nonactive site residues. This appears to be a general rule, and computer programs based on it have been applied to predict active sites in enzymes.14−21

In another important study, a systematic investigation of protein flexibility and intrinsic disorder led to further insights.11 Four categories of protein flexibility were compared: low B-factor-characterized regions, high B-factor-characterized regions, short disordered regions, and long disordered regions. The composition of amino acids in these categories was found to differ significantly, high B-factor-characterized and short disordered regions appearing to be a similar pair. It was discovered that high B-factor-characterized regions show a higher average flexibility index, more pronounced average hydrophilicity, and higher absolute net charge.11 It needs to be pointed out that in their study the authors actually used the designation “high B-factor-ordered regions”, the word “ordered” having a different meaning than the traditional use of this word in protein science. Since the original term may cause confusion, the authors of the present review have replaced “ordered” with “characterized”, in agreement with the authors.84 In their study they state that “Comparing the B-factor values from highly similar pairs of crystallized chains provides evidence that flexibility is encoded at the amino acid sequence level to a significant degree and therefore should be predictable, at some level, from the amino acid sequence. However, because of variations that result from experimental conditions, crystal contacts, or refinement procedures, the B-factor data are noisy.”11 The authors developed a predictor system based on evolutionary modeling, capable of discriminating between regions of high and low B-factors, which was claimed to achieve an accuracy of 70% and a correlation of 0.43 with experimental data.11 Its accuracy proved to be much higher than that of the flexibility indices in use at the time, but from today’s perspective, further improvements are necessary.

The so-called RONN software for identifying disordered regions of proteins is another useful tool,22 especially when crystal structures are lacking. Other software aids14−21 for identifying flexible regions include the early Maranas algorithms,23 SCHEMA,24 FireProt,25,26 Constrained Network Analysis (CNA),27,28 and Rosetta.29 An efficient computational approach comprising three algorithms (FRESCO) is highlighted and compared with B-FIT in section 3.3.6. Not only are these software aids useful in identifying flexible regions for structural and mechanistic purposes, but they also serve as methods for engineering thermostability (section 3.3). Some algorithms, such as PROFbval,15 MoRFpred,30 and ResQ,31 which are prediction tools based on sequence data and homologous structures, enable not only the identification of flexible residues and regions but also the prediction of B-factors (computed absolute values), which is important in those cases in which crystal structures are not available. Most recently, D. Bramer and G.-W. Wei presented a new computational tool for predicting flexible protein regions and even for calculating B-factor values as such.32 However, all of these techniques are associated with notably more uncertainty than the use of B-factors derived from X-ray data, which themselves have an accuracy not better than 10−15%. A previous comprehensive review comparing existing computational tools with focus on intrinsically disordered proteins and so-called region prediction is also available, although in the featured studies B-factors are not routinely referred to.33

A number of other early B-factor studies appeared which also deserve special mention.34−41,85 For example, S. Sheriff and co-workers published a study in which the B-factor values of the myohemerythrin from Themiste zostericola and of the octameric hemerythrin from Themiste dyscrita were compared in order to see whether any differences correlate with solvent-accessible areas in the respective crystals.34 Indeed, a correlation between atomic mobility and solvent accessibility was found. In an investigation reported by M. V. Milburn and co-workers concerning ligand binding and coactivator assembly of the human peroxisome proliferator-activated receptor-γ, which is a ligand-dependent transcription factor of importance in adipocyte differentiation and glucose homeostasis, X-ray structures and their interpretation played a crucial role.43 Among several insights, elevated B-factors for the H12 helix and ligand-binding pocket residues were reported and interpreted.

In summary, these and numerous other studies support the current general view that B-factors are indicators of the relative vibrational motion of atoms in a protein, those with low values belonging to a well-ordered site, and those with the highest values being part of the most flexible residues or regions. However, as also emphasized by a reviewer, it must be remembered that a B-factor reflects both vibration and static disorder. In many if not most studies, measurements at multiple temperatures, which could separate the two effects and would allow sounder interpretations, were not made.

The contributions of Y. Cho et al.86 and A. Merlino et al.87 concerning the role of B-factors in understanding the structural basis of cold adaptation of psychrophilic enzymes are also noteworthy. In the Cho study, the flexibility of residues around the active site of a psychrophilic malate dehydrogenase was compared to that of a mesophilic analog having a similar sequence. In order to devise an equivalent scale of thermal parameters, relative B-factors were derived using several methods. Increased relative flexibility at and near the active site region was one of the central conclusions. A. Merlino and co-workers utilized the concept of relative B-factors in the study of superoxide dismutases and came to a similar conclusion.87

A given protein crystal structure usually shows only one (“frozen”) conformation. In those cases in which highly flexible regions are indicated by exceptionally high B-factors, several different conformations are likely to exist in solution. The value of such B-factor analyses is then evident, because they suggest to


the researcher that additional structural and mechanistic investigations are likely to be rewarding. Indeed, conformational ensembles have been identified by NMR spectroscopy, as in adenylate kinase,44 such conformational heterogeneity being observed in other cases as well.45−49 The combination of B-factor analysis, NMR interpretations, and MD computations is a powerful approach for gaining different types of insights. The general question of how dynamics of enzymes influence catalysis on a molecular level, including the role of conformational ensembles and their dynamic shifts as well as allosteric effects, is in itself an active research area which has been reviewed elsewhere.50−59 MD simulations play a crucial role in these endeavors.60−62 Another interesting utility of B-factors was reported by J. Wu and co-workers, who demonstrated a correlation between a defined degree of flexibility and antimicrobial activity of amphipathic cationic α-helical peptides.88,89 In addition to calculating a hydrophobicity index (H), they proposed a flexibility index (F index) which is related to B-factors.

As summarized by A. Schlessinger and B. Rost7 on the basis of many previous studies by other groups, a B-factor does not represent an absolute quantity and therefore needs to be normalized, for which different procedures have been applied, as noted once again in a recent study.75 As such, a B-factor depends on a number of influences, including the resolution, crystal contacts, and the particular refinement procedure.87 Therefore, non-normalized B-factors from different structures cannot be directly compared. The potential user needs to consider this issue before drawing any conclusions. Usually eq 2 is used to compute the normalized B-factor7,81

$$B_{\mathrm{norm}} = \frac{B - \langle B \rangle}{\sigma} \tag{2}$$

where ⟨B⟩ is the average of the raw B-factors considered in a given structure and σ is their standard deviation. The normalized B-factor as defined in this manner can be used to describe the characteristics of, e.g., the Cα atoms of backbone chains. Today the concept of normalized B-factors is routinely used in protein crystallography22,23,25−28 and computational analyses.29−31

In further recent work, A. Kuzmanic, N. S. Pannu, and B. Zagrovic pointed out that a given X-ray refinement process may underestimate the level of microscopic heterogeneity in proteins.90 Using a crystal containing 216 copies of villin headpiece, a 35-residue 3-helix bundle protein often used as a model system in biophysics, they showed that even at high resolution refined B-factors may deviate significantly from their values derived from simulation. Conformational averaging and inadequate assessment of correlated motion appear to influence the estimation of microscopic heterogeneity derived from B-factors in this particular system.90 It remains to be seen how general these conclusions are when utilizing advanced refinement procedures in the analysis of other proteins.

In summary, B-factors available from high-resolution X-ray data have been shown to be useful in many types of protein studies, but using them alone is generally insufficient for gaining a maximum of insights. Deep-seated interpretations on a molecular level require additional experimental data as well as theoretical work in the quest to learn fundamental mechanistic and structural lessons in protein science.

Along a different yet intrinsically related line of research, B-factors available in the literature can be exploited with the aim of enhancing protein thermostability by directed evolution in a process called B-FIT (section 3).76,77 This is of particular importance when focusing on enzymes as catalysts in organic chemistry and biotechnology.
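The normalization of eq 2 is simple to apply to the Cα B-factors stored in a PDB-format file. The sketch below is one possible implementation under stated assumptions: it relies only on the fixed-column layout of ATOM records (atom name in columns 13−16, alternate-location flag in column 17, B-factor in columns 61−66), keeps only the first alternate conformer, and uses the population standard deviation, a choice that eq 2 itself does not specify. The helper `toy_atom_line` and the B-values it is fed are invented purely so that the example runs without an input file.

```python
import statistics

def ca_b_factors(pdb_text):
    """Return the B-factors of all Cα atoms found in ATOM records."""
    values = []
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        if line[12:16].strip() != "CA":      # atom name, columns 13-16
            continue
        if line[16] not in (" ", "A"):       # keep only the first alternate conformer
            continue
        values.append(float(line[60:66]))    # temperature factor, columns 61-66
    return values

def normalize(b_values):
    """Eq 2: B_norm = (B - <B>) / sigma (population standard deviation assumed)."""
    mean = statistics.fmean(b_values)
    sigma = statistics.pstdev(b_values)
    if sigma == 0:
        raise ValueError("all B-factors are identical; eq 2 is undefined")
    return [(b - mean) / sigma for b in b_values]

def toy_atom_line(serial, name, res_seq, b):
    """Build one fixed-column ATOM record (hypothetical alanine, chain A, origin)."""
    return (f"ATOM  {serial:5d} {name:<4s} ALA A{res_seq:4d}    "
            f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{b:6.2f}")

# Toy input with arbitrary B-values; replace with the text of a real PDB file.
toy = "\n".join([
    toy_atom_line(1, " N  ", 1, 12.0),
    toy_atom_line(2, " CA ", 1, 10.0),
    toy_atom_line(3, " CA ", 2, 20.0),
    toy_atom_line(4, " CA ", 3, 30.0),
])
b_ca = ca_b_factors(toy)
print([round(z, 2) for z in normalize(b_ca)])
```

Because Bnorm is dimensionless with zero mean and unit standard deviation within each structure, values from different structures can be placed on a common scale, which is the purpose of the normalization described above. Other normalization procedures mentioned in the text would require a different `normalize` function.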

1.3. Limitations and Possible Pitfalls When Applying B-Factors in Protein Science

The user should be aware of limitations and possible pitfalls when considering B-factors as a basis for drawing conclusions regarding flexibility and internal motion in proteins.5,12,63−66 One important point concerns the resolution achieved in the respective X-ray analysis. Low resolution is known to correlate with (too) high B-factors; i.e., a resolution of only 3−5 Å can lead to B-factors as high as 100−200 Å², which should not be used for drawing specific conclusions. In welcome contrast, B-factors derived from protein structures having resolutions of ∼1.5 Å are more likely to be sound. It has been pointed out that even then B-factors may have drawbacks in certain applications, as in the development of elastic network models (ENMs),67 and limitations in certain other applications.68,69 Noise from lattice disorder, crystal packing effects, and the type of structure refinement may lead to a discrepancy between B-factors and the root-mean-square fluctuations of the atoms (RMSF).12 Moreover, M. Karplus and co-workers have shown in a study of CO-myoglobin at 1.5 Å resolution that the uncertainty in B-factors can be as much as 15%.70 This should be accepted as a general guideline when utilizing B-factors in studies of other proteins.

A strategy for avoiding overestimation of extremely large B-factors, and with it false structural and mechanistic conclusions, was recently outlined by O. Carugo in a 2018 study entitled “How large B-factors can be in protein crystal structures”.66 Briefly, the upper limit (Bmax) can be ascertained by extrapolating the relationship between the average B-factor, determined experimentally, and the percentage of crystal volume occupied by solvent. This helps to prevent otherwise absurd conclusions.

It should also be mentioned that sometimes “Wilson B-factors” are used, a variation of the usual B-factors, which goes back to the original 1949 paper by A. J. C. Wilson describing probability distributions of X-ray intensities.71−73 However, as R. H. Blessing and co-workers pointed out in 1996 on the basis of previous data, the Wilson plot captures mainly the contribution of atoms with lower B-factors, leading to a systematic underestimation of the actual B-factor distribution.74 Consequently, the Wilson B-factors tend to be lower than the “true” averaged B-factors of refined X-ray structures. Therefore, in the present review we do not specifically treat Wilson B-factors.
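The 15% guideline mentioned above can be turned into a rough plausibility check before two B-factors are interpreted as different. The sketch below is illustrative only and rests on assumptions that are not part of the cited work: the 15% figure is treated as a relative standard uncertainty, the errors of the two values are taken as independent, and a two-sigma threshold is chosen arbitrarily. The function name and the example values are hypothetical.

```python
import math

def differ_beyond_uncertainty(b1, b2, rel_uncertainty=0.15, n_sigma=2.0):
    """Return True if |b1 - b2| exceeds n_sigma times the propagated uncertainty.

    Assumes independent errors of size rel_uncertainty * B on each value,
    so the uncertainty of the difference is rel_uncertainty * sqrt(b1^2 + b2^2).
    """
    sigma_diff = rel_uncertainty * math.hypot(b1, b2)
    return abs(b1 - b2) > n_sigma * sigma_diff

# Arbitrary example values in Å^2.
print(differ_beyond_uncertainty(20.0, 25.0))  # difference of 5 Å^2
print(differ_beyond_uncertainty(20.0, 40.0))  # difference of 20 Å^2
```

With these assumptions a difference of 5 Å² between B-factors of 20 and 25 Å² lies within the propagated uncertainty (about 4.8 Å² at one sigma), whereas 20 versus 40 Å² does not. Such a check complements, but does not replace, the resolution and normalization considerations discussed in this section.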

2. IDENTIFYING AND INTERPRETING RIGIDITY, FLEXIBILITY, AND DYNAMICS IN PROTEINS WITH THE HELP OF B-FACTORS

In sections 1.2 and 1.3 several representative studies featuring B-factors as a means to identify flexibility and possibly internal motion in enzymes and more generally in proteins have already been discussed. Space does not allow all studies to be explicitly mentioned; selected ones are cited here without further analysis.91−101 A recent structural study of the halorhodopsin from Halobacterium salinarum deserves special mention because it touches on the problem of crystal packing effects.102 This membrane protein is a light-driven chloride pump in which a reversible photocycle is believed to be initiated by the all-trans to 13-cis isomerization of the covalently bound retinal


chromophore. In earlier structural studies, packing contacts had prevented the investigation of possible protein conformational changes. In a new attempt, crystals were obtained using the vesicle fusion method, which opened the door for new insights. Using carefully performed B-factor analyses on the basis of normalized values, inter alia, it was found that large movements of two helices (E and F) during the photocycle are essentially unrestrained by packing effects and that the crystal lattice is not disrupted.102 Later this research was extended.103

In a different recent contribution, J. Li and co-workers utilized B-factors in a very different way.68 They stressed the importance of distinguishing between true protein interactions and crystal packing contacts, which is of obvious significance when developing reliable structural bioinformatics procedures.68 They addressed the limited quality of classification that arises whenever mixed interfaces with differently sized contact areas occur in the training and test data. Three different B-factor related features were proposed for the classification between biological interfaces and crystal packing contacts: (1) sum of the normalized B-factors of the interfacial atoms in the contact area; (2) average of the interfacial B-factor per residue in the chain; and (3) average number of interfacial atoms having negative normalized B-factor per residue in the respective chain. This approach to cross-data set classification appears to be superior to previous methods.68 The authors claim that their computational methods “have a potential for large-scale and accurate identification of biological interactions from the experimentally determined structural data stored at PDB which may have diverse interface sizes”.68

Finally, in an area of rapidly growing interest, R. Henderson and T. G. McMullan addressed the problems of obtaining highest quality images by single-particle electron cryomicroscopy in amorphous ice.104 A comparison of experimental images of apoferritin with simulated images was systematically made. Accordingly, the signal-to-noise ratio in the simulated images was reduced to different degrees by utilizing different B-factors that influence the signal. The utility of B-factors in this research area led the authors to conclude “that the experimental images still can be, and need to be, significantly improved to achieve higher resolution and to extend single-particle electron cryomicroscopy (cryo-EM) to smaller molecular assemblies.”104 Further cryo-EM studies with B-factor analyses are listed in Table 1 (section 2.5).

J. S. Fraser and collaborators have addressed the crucial question whether cryo-cooling changes protein ensembles by examining the high-resolution X-ray data of 30 previously reported proteins at room and low temperatures.105 At the outset, it was not clear whether B-factors distinguish between reduced vibrational motions upon cryo-cooling (which takes up to 1 s) and a change in the overall conformational ensemble. Relevant is a theoretical analysis by B. Halle, who suggested that the cooling process is too slow to trap the room-temperature equilibrium.106 In the Fraser study, Halle’s dynamic quenching theory was considered, but it was shown that flash cooling biases structural ensembles which were previously hidden in protein crystals. New computational techniques for electron-density sampling, model refinement, and molecular packing analysis were applied to the data, demonstrating that the experimental procedure of crystal cryo-cooling remodels the conformational distributions of >35% of side chains and eliminates packing defects required for functional motions. The Fraser study suggests that “a combined strategy of ensemble analysis and room-temperature X-ray data collection can be applied to many proteins to define conformational substates linked to ligand binding, catalysis, and allosteric regulation.”105 It should be mentioned that at a protein glass temperature of 180 K, core packing is increased and motion decreased. A reviewer has also pointed out that cryo temperature increases the number of split occupancy side chains of amino acids, making the analysis of B-factor variations difficult if not problematic; comparison of split occupancy at cryo with nonsplit conditions at room temperature is “not comparing like with like”. It is also interesting to note that A. Wlodawer and Z. Dauter recently published a critical crystallographer’s perspective on high-resolution cryo-EM maps and models.107 Clearly, more efforts are needed in this exciting research area, possibly aided by utilizing B-factors and flanked by NMR and MD studies.

In sections 2.1−2.4, typical recent studies having quite diverse goals are featured in which B-factors play a role in the quest to identify and understand protein dynamics. It is hoped that these illustrative examples will inspire other researchers to consider B-factors in their own future projects. This is followed by section 2.5, in which a table with additional examples is presented, which likewise illustrate the value of B-factors for a variety of different purposes. Protein engineering for thermostabilization based on the utilization of B-factors is treated thereafter in section 3.

2.1. Identifying the Active Conformation of the Apo RORγt Nuclear Receptor

This recent study features an unconventional way to utilize B-factors, which the authors call “differential B-factor analysis”.108 This approach was developed with the aim of identifying biologically active small molecules of interest in the pharmaceutical industry. The central goal was to define the active conformation of the nuclear orphan receptor RORγt.108 This receptor was previously identified as a master regulator of the Th17/IL-17 pathway, which is linked to the pathogenesis of autoimmune diseases. The crystal structures of apo- and ligand-bound RORγt were reported, flanked by extensive NMR experiments in solution and quantum mechanical (QM) computations based on density functional theory (DFT). It was unambiguously demonstrated that the apo form adopts an active conformation for binding coactivator peptides, while the structures of the ligand-bound RORγt suggest that binding of the inverse agonists disrupts the crucial interactions that stabilize helix H12. The destabilizing effect was in line with the novel B-factor analysis and deep-seated DFT computations.108 In most other studies, B-factors are considered when comparing different regions of the same protein. In this analysis, however, B-factors were compared in the same protein regions from two different structures. In order to ensure a meaningful comparison of B-factors between the crucial H12 region in the bound and apo forms, a modified B-factor definition was introduced according to eq 3

$$B' = \frac{B - \langle B \rangle_{\mathrm{median}}}{\langle B \rangle_{\mathrm{median}}} \tag{3}$$

where ⟨B⟩median denotes the median B-factor, which is calculated separately for backbone and side-chain atoms. This means that a negative B′ value is indicative of higher rigidity than the median of the structure. Conversely, a positive B′ value is indicative of lower rigidity. For identifying regions of altered flexibility or stability in the compound-bound protein versus the respective apo state, eq 4 was suggested108

DOI: 10.1021/acs.chemrev.8b00290 Chem. Rev. XXXX, XXX, XXX−XXX

Chemical Reviews Bdiff = B′complex − B′apo

Review
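The median normalization of eq 3 and the difference of eq 4 can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used in ref 108: the function names and the B-factor values are hypothetical, the two input lists are assumed to hold the same atoms in the same order, and each call is assumed to receive a single atom class (backbone or side chain), since the median is taken separately for each class.

```python
from statistics import median


def normalized_b(b_factors):
    """Median-normalized B' values following eq 3: (B - median) / median."""
    med = median(b_factors)
    if med <= 0:
        raise ValueError("median B-factor must be positive")
    return [(b - med) / med for b in b_factors]


def differential_b(b_complex, b_apo):
    """B_diff following eq 4 for atoms matched one-to-one by list position."""
    if len(b_complex) != len(b_apo):
        raise ValueError("atom lists must be matched one-to-one")
    bp_complex = normalized_b(b_complex)
    bp_apo = normalized_b(b_apo)
    return [c - a for c, a in zip(bp_complex, bp_apo)]


# Hypothetical backbone B-factors (in Å^2) for five matched atoms.
apo = [20.0, 22.0, 25.0, 30.0, 28.0]
complex_ = [18.0, 20.0, 24.0, 45.0, 40.0]

b_diff = differential_b(complex_, apo)
for index, value in enumerate(b_diff, start=1):
    # Positive values mark atoms that are more flexible in the complex.
    print(f"atom {index}: B_diff = {value:+.3f}")
```

Normalizing each structure by its own median before subtracting is what makes two crystal structures comparable here, since raw B-factors also depend on resolution and refinement details that differ between data sets.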

If, for a given atom, Bdiff (eq 4) is positive, the atom shows elevated flexibility and is less stable than in the apo state, whereas an atom characterized by a negative Bdiff is further rigidified and more stable than in the apo state.108 The actions of three possible therapeutic drugs 1, 2, and 3 were considered (Figure 1).

Figure 1. Differential B-factor plots. Normalized B-factor differences (see main text) are displayed for the structural elements as a blue−white−red color ramp with blue indicating the most negative value (stabilization) and red the most positive value (destabilization). Compound chemical structure is drawn at the top of each respective panel. (A) H11 and H12 of RORγt in complex with compound 1. (B) H11 and H12 of RORγt in complex with compound 2. (C) H11 and H12 of RORγt in complex with compound 3. Reprinted with permission from ref 108. Copyright 2017 American Society for Biochemistry and Molecular Biology.

Without going into details here, surprising and insightful information was generated by this type of B-factor analysis (Figure 1).108 On the basis of these and other data, the authors conclude, inter alia, that subtle variations in drug structure can result in opposite functional responses of a nuclear receptor. The consequence is a complete functional switch from agonists to inverse agonists.108 In conclusion, this type of B-factor analysis can be expected to be of use in future drug discovery.

2.2. Understanding Flexibility and Thermal Lability of the Cε3 Domain in Human Immunoglobulin E (IgE)

Numerous studies of human immunoglobulin E antibodies (IgE) have shown that they protect against parasitic infections at low concentrations, but at elevated levels in serum they play a central role in the molecular and cellular mechanisms of a number of allergic diseases including asthma.109 Several crystal structures of bound and unbound IgE fragments/domains have been analyzed,110−113 an example being the interpretation of the conformational flexibility in antibody effector domains.114 In a recent study, several additional crystal structures were provided with the aim of understanding the flexibility and thermal lability of the Cε3 domain in human IgE.115 It was known that interactions with its receptors, specifically FcεRI on mast cells and CD23 on B cells, are mediated by the Fc region, which is a dimer of Cε2, Cε3, and Cε4 domains. The N-linked glycosylation also occurs in this domain. The subfragment lacking Cε2 domains (Fcε3−4) likewise binds to both receptors. The new crystal structures of IgE-Fc and Fcε3−4 at unusually high resolution of 1.75 and 2.0 Å, respectively, revealed details of the carbohydrate and its binding characteristics in the protein domains. The authors analyzed the B-factors of these structures and of previous ones, which led to the conclusion that the Cε3 domains show the greatest intrinsic flexibility, this being most pronounced in their Cε4-distal regions. The highest degree of structural variation within the same Cε3 domain was also part of the insight gained by considering B-factors. The B-factor analysis of the Fcε3−4 between domains and within Cε3 is summarized in Figure 2.115 Several conclusions were drawn, including the realization that the stabilizing effect of the (Cε2)2 domain pair upon the Fcε3−4 region is relatively modest (Figure 2A). Finally, it was concluded on the basis of crystal structures, including B-factor analyses as well as thermostability data derived from differential scanning fluorimetric analysis of IgE-Fc and Fcε3−4, that Cε3 (Figure 2B) is the domain which is most susceptible to thermally induced unfolding. As in many other studies, it is the combination of B-factor analysis, X-ray structural data, and other techniques which leads to new insights. The lability effect identified in this way accounts for the long known characteristically low melting temperature of IgE, which in itself is an important conclusion.

Figure 2. B-factor analysis of Fcε3−4 between domains and within Cε3. (A) Plot of normalized B-factors for Cε3 and Cε4 domains: CD23-bound structures (blue), FcεRI-bound structures (red), Fcε3−4 structures (lilac), IgE-Fc structure (teal), aεFab-bound (extended) IgE-Fc (pink), MEDI4212-bound Fcε3−4 (orange), artificially constrained Fcε3−4 structures (black), and omalizumab-bound IgE-Fc (dark pink). (B) Cε3 (red) and Cε4 (blue) domains. (C) Graph of normalized B-factors for Cε4-distal and Cε4-proximal regions of the Cε3 domains: colors as for panel A. (D) Cε4-distal (red) and Cε4-proximal (blue) regions of the Cε3 domains. Cε4 domains are colored gray. Reprinted with permission from ref 115. Copyright 2017 Elsevier.

2.3. Studying the Dynamic Allosteric Effect in αIIbβ3 Integrin

The superfamily of human integrins comprises highly dynamic glycoproteins which are responsible for many different biological responses, including cell−cell or cell−matrix interactions.116 A specific example is αIIbβ3 integrin, which mediates platelet aggregation and thrombus formation.117 Its expression and function are impaired in the hereditary disease Glanzmann thrombasthenia (GT). In a recent report, variants of GT in the Calf-1 domain (residues 603−743) were studied by B-factor analysis and in silico techniques which included advanced MD simulations of seven GT variants.118 The most flexible region of the so-called Calf-1 domain, as shown by B-factors and root-mean-square deviations (RMSD), proved to be a rigid region encompassing two deformable zones. Whereas each mutated structure showed very little modification at the mutation site, remote conformational changes were in fact observed, a surprising result which calls for re-evaluation of the relationship between MD and allostery.118 The overall approach developed by the authors of this study constitutes an excellent technique for studying all αIIbβ3 subdomains and identifying the influence of missense mutations at the local and global structural level.118

Figure 3. Comparison of the protein flexibility of Calf-1 by different metrics. 3D structures of Calf-1 domain represented through (A) B-factor values, (B) root-mean-square fluctuation (RMSF) values, and (C) Neq values. Local structure is ranked from rigid (thin blue line, value of 0.0) to flexible (thick red line, value of 4.0). Residues with completely missing atoms are in gray in the B-factor cartoon (A). (D) Calf-1 amino acid sequence is placed with regard to its secondary structures assignment and to protein flexibility according to the B-factor, the RMSF, or the Neq values. Blue, green, yellow, orange, and red color scale the structure from rigid to flexible. Loops are loop 1 (size 9, positions 603−611), loop 2 (size 10, positions 620−629), loop 3 (size 7, positions 640−646), loop 4 (size 4, positions 653−656), loop 5 (size 8, positions 665−672), loop 6 (size 6, positions 678−683), loop 7 (size 6, positions 690−695), loop 8 (size 8, positions 708−715), loop 9 (size 11, positions 725−735), and loop 10 that begins at position 742. Reprinted with permission from ref 118. Copyright 2017 Springer Nature.

Protein blocks (PBs) were considered as a structural alphabet of 16 local prototypes. PB assignments were made for each residue of Calf-1 and over every snapshot obtained from MD simulations. The equivalent number of PBs (Neq) constitutes a statistical measurement representing the average number of PBs for a given residue at a defined position, which was calculated according to eq 5

$$N_{\mathrm{eq}} = \exp\left(-\sum_{x=1}^{16} f_x \ln f_x\right) \qquad (5)$$

where f_x is the probability of PB x.118 In this metric, an Neq value of 1 indicates that only one type of PB occurs, in contrast to a value of 16 which signals random distribution. High Neq values correlate with high flexibility. A comparison of the protein flexibility of Calf-1 by different metrics is summarized in Figure 3.118 It can be seen that according to the B-factor analysis, loops 2−5 are the most flexible, residues 622, 643, and 710 and residues 667−668 having the highest B-factors. Note that flexible regions are graphically highlighted by “B-factors putty”, which can be generated with the PyMOL program.119 In this investigation, crystal packing effects appear not to play any role, although as already noted above (sections 1.1 and 1.2), their influence on B-factors has been postulated in other studies.5,12,120 Mobility of each residue was also ascertained by root-mean-square fluctuations (RMSFs), which were found to correlate generally with the B-factor analysis, certainly for loops 2, 5, and 8 but not for loop 3 which binds calcium (not part of MD). Correlations between B-factors and RMSF values were reported previously in other cases, as in a subsequent study.121 A good correlation between RMSF and Neq (and therefore with B-factors) was also found, high Neq values indicating mobility. The results show, inter alia, that a locally rigid amino acid stretch with low Neq can in fact be part of a rather mobile loop participating in the global structural motions of the protein.118
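The Neq metric of eq 5 is the exponential of the Shannon entropy of the PB distribution at one residue position. A minimal Python sketch follows; the function name and the example PB strings are hypothetical, and it assumes that the PB assignments observed for one residue over the MD snapshots are supplied as a sequence of one-letter PB codes (the 16 PBs are conventionally labeled a to p). PBs that never occur are simply absent from the sum, which is consistent with the limit f ln f → 0 for f → 0.

```python
import math
from collections import Counter


def neq(pb_series):
    """Equivalent number of PBs (eq 5) for one residue position.

    pb_series: iterable of PB labels, one per MD snapshot.
    """
    counts = Counter(pb_series)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one PB assignment is required")
    # Shannon entropy over the PBs that actually occur at this position.
    entropy = -sum((n / total) * math.log(n / total) for n in counts.values())
    return math.exp(entropy)


# Hypothetical PB assignments for one residue over eight snapshots.
rigid_position = "mmmmmmmm"           # a single PB in every snapshot
mixed_position = "mmmmnnkl"           # several PBs visited
uniform_position = "abcdefghijklmnop"  # all 16 PBs equally populated

for label, series in [("rigid", rigid_position),
                      ("mixed", mixed_position),
                      ("uniform", uniform_position)]:
    print(f"{label}: Neq = {neq(series):.2f}")
```

The two limiting cases reproduce the bounds given in the text: a position locked in one PB gives Neq = 1, and a position sampling all 16 PBs equally gives Neq = 16.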

2.4. Understanding Increased Activity of a Mutant Alcohol Dehydrogenase Evolved for Reversed Stereoselectivity

B-factors have generally not played a role in the area of protein engineering of stereoselective enzymes. However, an interesting exception appeared recently in which directed evolution based on iterative saturation mutagenesis (ISM; see section 3.1) was applied to the alcohol dehydrogenase ADH-A from Rhodococcus ruber DSM 44541.122 Some of the key results were interpreted with the help of B-factors. Three randomization sites for saturation mutagenesis (SM) were chosen, A (Y294/W295), B (Y54/L119), and C (F43/I271), SM then being performed according to the ISM pathway A → C → B. For details, the reader is referred to the original study,122 which includes crystal structures of the best variants appearing along this particular evolutionary pathway, deuterium kinetic isotope effects, docking, and extensive kinetics, as well as NADH-cofactor release data. Suffice it to say that relaxation of nonproductive substrate binding and increased rate of NADH release correlate with higher overall activity and reversal of enantioselectivity in favor of the (R)-product in the asymmetric reduction of acetophenone.122 In this study122 some of the best mutants along the ISM-based upward climb were found to be A2 (Y294F/W295A), A2C2 (Y294F/W295A/F43H) H39Y (additional), A2C3 (Y294F/W295A/F43S) H39Y (additional), and A2C2B1 (Y294F/W295A/Y54F/F43H) H39Y (additional). The respective mutational effects were interpreted by considering the average B-factors derived from the respective X-ray structures (Figure 4).122 Two types of residues were shown to have increased B-factors, those that were mutated and those that are spatially nearby but not mutated. One of the central conclusions was the realization that increased flexibility around the active site of this ADH leads to higher activity. This study shows that in the quest to gain initial mechanistic insights it is rewarding to study changes in B-factors in enzyme variants generated by directed evolution. However, this is only the first step since the actual molecular effects causing increased thermostability remain to be investigated.
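Averaged B-factors of the kind used to compare the ADH-A variants in Section 2.4 can be obtained directly from the ATOM records of a PDB file. The Python sketch below is only one possible way of doing this and is not taken from ref 122: the function names are hypothetical, the example records are synthetic, and it assumes the fixed-column legacy PDB format (chain identifier in column 22, residue number in columns 23−26, residue name in columns 18−20, B-factor in columns 61−66) with a plain unweighted mean over all atoms of a residue.

```python
from collections import defaultdict


def mean_b_per_residue(pdb_lines):
    """Unweighted mean B-factor per residue from fixed-column ATOM records."""
    sums = defaultdict(lambda: [0.0, 0])  # key -> [sum of B, atom count]
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        # Key: (chain ID, residue number, residue name)
        key = (line[21], int(line[22:26]), line[17:20].strip())
        sums[key][0] += float(line[60:66])
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


def make_atom_line(serial, name, res, chain, resseq, b):
    """Build a synthetic ATOM record with dummy coordinates (illustration only)."""
    x = y = z = 0.0
    occ = 1.0
    return (f"ATOM  {serial:>5} {name:<4} {res:>3} {chain}{resseq:>4}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}{occ:6.2f}{b:6.2f}")


# Synthetic records: two atoms of one residue, one atom of the next.
records = [
    make_atom_line(1, "N", "TYR", "A", 294, 20.0),
    make_atom_line(2, "CA", "TYR", "A", 294, 30.0),
    make_atom_line(3, "N", "TRP", "A", 295, 40.0),
]

averages = mean_b_per_residue(records)
for (chain, number, name), value in sorted(averages.items()):
    print(f"{name}{number} chain {chain}: mean B = {value:.2f}")
```

When variants are compared, such per-residue averages are usually normalized first, because absolute B-factors differ between crystal structures for reasons unrelated to the mutations.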

Figure 4. B-factor analysis of WT ADH-A and variants thereof evolved by iterative saturation mutagenesis (ISM). (A) Averaged B-factors derived from crystal structures, mutational residues being marked by F43, Y54, L119, I271, Y294, and W295. (B) Regions of higher B-factors in variant A2 shown as surface representation. A2 subunits (chains C and F in 508q) are pictured in chartreuse and dark green, respectively, and cofactor NAD+ in green. (C) Close-up of the same regions in A2 as surface representation, which combine to form one face of the active site; NAD+ again in green, mutations Y294F and W295A in magenta sticks with their VdW radii indicated by dots. Reprinted with permission from ref 122. Copyright 2017 John Wiley and Sons.

2.5. Compilation of Further Recent Studies Utilizing B-Factors for Identifying and Interpreting Protein Flexibility and Internal Motion

In order to illustrate further how experimental B-factors serve as easily accessible and useful structural parameters for identifying and/or interpreting protein dynamics in quite diverse experimental setups with different goals, we list in Table 1 selected studies taken from the recent literature.

some insights can be gained by considering B-factors.9 They even predicted that certain mutational changes at residues displaying high B-factors and therefore high flexibility could result in thermostabilization, although such “rational” laboratory experiments were not performed at the time. More recently, attention has been paid to engineering flexible loops for a variety of different mechanistic and structural purposes as critically delineated in a viewpoint article by B. Hauer et al. with focus on practical applications.210 Although hundreds of studies featuring protein engineering of thermostability have appeared, distinguishing between thermodynamic and kinetic stability was often not an issue. The reader is referred to a comprehensive review by Sanchez-Ruiz.211 Stability is a concept pervasive in both biological and nonbiological systems.212 A protein’s stability may range from nonexistent (e.g., intrinsically disordered proteins) to very high, as indicated by its resistance to degradation under relatively harsh conditions.213 Typically, two types of protein stability are discussed in the literature: (1) Thermodynamic stability involves global unfolding and is related to an ensemble of more or less unfolded (U) states in equilibrium with the native, functional protein (N); (2) Kinetic stability correlates with a particular free energy barrier separating the N from the U forms.211 Accordingly, thermodynamic stability can be defined by ΔGU (the difference in Gibbs free energy between the native and the denatured states) (Figure 5), while kinetic stability is evaluated by the activation free energy for unfolding (ΔGU⧧, Figure 5). A protein is usually in equilibrium between the N states and the U states (N ⇄ U, K = [U]/[N], where K is the equilibrium constant). Under physiological conditions, the equilibrium greatly favors the folded state (K < 1), but it shifts toward the unfolded state under harsh conditions, e.g., high temperatures, extreme pH values, and high concentrations of denaturants (K > 1).214 The standard unfolding free energy change can be represented by the following equation

$$\Delta G_U = G_U - G_N \qquad (6)$$

or in the Lewis equation form

$$\Delta G_U = -RT \ln K \qquad (7)$$
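As a worked illustration of eq 7, the following Python sketch converts an unfolding equilibrium constant K = [U]/[N] into ΔGU and into the fraction of unfolded protein. The function names and the numerical values of K are hypothetical and chosen only for illustration; a simple two-state N ⇄ U equilibrium is assumed, and energies are returned in kJ/mol.

```python
import math

# Molar gas constant in kJ mol^-1 K^-1.
R = 8.314462618e-3


def unfolding_free_energy(k_unfold, temperature_k):
    """Delta G_U = -R T ln K (eq 7), with K = [U]/[N]; result in kJ/mol."""
    if k_unfold <= 0 or temperature_k <= 0:
        raise ValueError("K and T must be positive")
    return -R * temperature_k * math.log(k_unfold)


def fraction_unfolded(k_unfold):
    """Fraction of molecules in the U state for a two-state equilibrium."""
    return k_unfold / (1.0 + k_unfold)


# Hypothetical case: one unfolded molecule per 10,000 folded ones at 25 °C.
dg = unfolding_free_energy(1e-4, 298.15)
print(f"K = 1e-4: dG_U = {dg:.1f} kJ/mol, "
      f"unfolded fraction = {fraction_unfolded(1e-4):.5f}")

# At the midpoint of unfolding K = 1, so dG_U vanishes.
print(f"K = 1: dG_U = {unfolding_free_energy(1.0, 298.15):.1f} kJ/mol")
```

The sign convention matches eqs 6 and 7: K < 1 (folded state favored) gives a positive ΔGU, and K > 1 under denaturing conditions gives a negative one.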

Therefore, there may be two ways to increase the thermodynamic stability of a protein by protein engineering. One is to stabilize the folded state by addition of favorable interactions or by release of unfavorable interactions; the other is to destabilize the unfolded state by decreasing the chain entropy of the unfolded state.214 To date, many studies have focused on the molecular basis of protein thermodynamic stability, which can be easily quantified in vitro for small model proteins.200−205 The influence of mutations has been examined experimentally and computationally.200−205,211 In contrast, kinetic stability has not been studied as systematically.211−216 An intriguing example concerns a small protein lacking disulfide bonds, dubbed ThreeFoil, which on the basis of bioinformatics and an energy function was designed to have a symmetric superfold with repeating multivalent carbohydrate-binding elements.215 Using experimental and computational analyses, it was demonstrated that ThreeFoil has unusually high kinetic stability. Special methods or models were developed to uncover and understand kinetic stability for specific proteins, such as the cold shock protein (TmCSP),217 Subtilisin E,218 a lipase from T. lanuginosus,219 α-chymotrypsin (α-CT),220 triosephosphate isomerases,221 and Hen-Egg

3. UTILIZATION OF B-FACTORS FOR ENGINEERING ENZYME THERMOSTABILITY AND ROBUSTNESS TOWARD HOSTILE SOLVENTS

Explaining stability or instability on a molecular level for all proteins is often a challenge, and predicting multiple mutations that lead to enhanced stability is even more difficult. There are many examples in the literature which show that a very small number of sequence changes can cause pronounced differences in stability.200−205 As pointed out by a reviewer, this means that statistics must be viewed with extreme caution. Studies of (hyper)thermophilic proteins indicate the reasons for their extreme robustness, which include intramolecular H bonds, salt bridges, disulfide bonds, increased compactness, shortening of loops, higher hydrophobicity, and decreased flexibility of α-helical segments.206−209 In these contributions, B-factors were generally not considered. Nevertheless, it is logical that parallel and subsequent protein engineering studies aimed at increasing thermostability of mesophilic proteins focused on these factors. In a short but inspiring article on protein stability, S. Parthasarathy and M. R. N. Murthy emphasized that at least

DOI: 10.1021/acs.chemrev.8b00290 Chem. Rev. XXXX, XXX, XXX−XXX


X-ray

tryptophan-containing proteins (albumins, cyclophilin A, azurin, phospholipase A2, and ribonuclease T1) backbone fragments

X-ray X-ray X-ray

Streptomyces avidinii Psychromonas ingrahamii H. sapiens

cold-adapted frataxin

complexes of thrombin

X-ray X-ray

cryo-EM

low-immunogenic core streptavidin (LISA-314)

fatty acid synthase (GroEL) and the transient receptor potential channel (TRPV1) 14 protein kinases protein−protein complexes

X-ray X-ray

crystallographic waters hundreds of proteins

cryo-EM

X-ray

hundreds of proteins

X-ray homology X-ray

X-ray X-ray

X-ray X-ray X-ray X-ray X-ray X-ray

X-ray X-ray

human, rabbit, equine

Geobacillus stearothermophilus

E. coli

structures cryo-EM and X-ray X-ray X-ray X-ray X-ray

domain-swapped dimer hundreds of proteins

binding-site water 6-hydroxymethyl-7,8-dihydropterin pyrophosphokinase (HPPK) more than 1000 proteins lipase T6 bounded complexes

Homo sapiens Yarrowia lipolytica Streptomyces sp. 9

Plasmodium falciparum strain FVO

apical membrane antigen over 700 protein−DNA complexes nine viral proteins hundreds of crystal structures data set of proteins human P450 family-1 lipase (Lip2) ß-1,4-xylanase thousands of protein chains recombined nucleosome core particles dozens of protein complexes

organism Escherichia coli

dihydrofolate reductase

objectives

Gaussian network model; B-factors of kinase catalytic domain of active versus inactive conformations protein−protein interactions in bound and unbound states; lower flexibility in bound state as shown by B-factors X-ray structure of streptavidin mutant with low immunogenicity relevant to cancer research; B-factor around mutated site indicates no conformational change structural characterization of a frataxin-protein relevant to neurodegenerative disease; analysis of B-factors and RMSD upon metal binding binding cooperativity in thrombin inhibitors; analysis of B-factors of ligand groups inside binding pocket indicates reduction of mobility of side chain

B-factor analysis and computations of loop-flexibility of apical membrane antigen (AMA1) focus on local dynamics and B-factors B-factors and entropy effects; Rosetta software classification of protein binding interfaces by B-factor-related features; limitations limitations and pitfalls when using B-factors B-factors at active sites; comparative proteomics among P450 family-1 analysis of lipase lid closure mechanism by B-factor consideration and dynamical cross-correlations B-factor analysis before and after mutagenesis of a thermostability-engineered ß-1,4-xylanase predictor of residue flexibility indicated by B-factors; relative solvent accessibility representation of B-factor-increased nucleosomal DNA regions upon H4 tetra-acetylation predictor of protein binding hot spots using individual atomic contacts and co-occurring contacts guided by B-factors binding site water replacement in docking; predictive rate highest when crystal water has low B-factor revealing a network of conformational transitions in HPPK apoenzyme; flexibility of three loops by B-factor analysis correlation between conformational entropy and B-factors; few false positives structural insights into methanol-stable lipase variants; lower B-factors at mutation sites QM/MM study of bound versus unbound conformations of drugs in targets; constrained minimization using B-factors and Knee Point Detection (KPD) MD as aid in prediction of domain swapping for mutants suggests increased B-factors in hinge loop flexibility assessment by multiscale Gaussian network model (mGNM) and anisotropic network model (mANM); comparison of predicted and experimental B-factors capturing protein multiscale thermal fluctuations by flexibility-rigidity index (mFRI) as an alternative to GNM; prediction of B-factors in 364 proteins theoretical SZMAP-based study of water replacement in proteins by drugs; agreement with B-factors protein dynamics relevant to functions by elastic network models (ENMs) which considers crystal 
effects and lumped masses and specific spring constants; better B-factor correlations in >500 proteins correlation of phosphorescence lifetimes of single tryptophan and flexibility; B-factors of tryptophan and nearby residues Rosetta-based refining protein structures from cryo-electron microscopy maps; real-space B-factor fitting Improving visualized cryo-electron microscopy density reconstructions; B-factor sharpening

model incorporating B-factors predicts backbone and side-chain dynamics of dihydrofolate reductase

comment

149

148

147

145 146

144

143

142

140 141

139

137 138

134 135 136

132 133

124 125 126 68 68,69 127 128 129 75 130 131

123

ref

Table 1. Recent Studies Utilizing B-Factors for Identifying and Interpreting Protein Flexibility and/or Internal Motion Selected from the Period 2014 to Early 2018, in Addition to Those Highlighted in Sections 2.1−2.4a


objectives

organism


Sin recombinase 2′-deoxyribosyltransferase

few proteins

L-asparaginase

dozens of proteins Mopeia virus exonuclease domain amino acid sequences

hundreds of proteins

Staphylococcus aureus Bacillus psychrosaccharolyticus

Thermococcus kodakarensis

Mopeia virus

X-ray; electron microscopy X-ray X-ray

X-ray X-ray X-ray; homology X-ray

cryo-EM and X-ray X-ray

E. coli

β-galactosidase

X-ray X-ray

NMR

X-ray; homology NMR X-ray and NMR X-ray X-ray

X-ray X-ray

X-ray; cryo-EM X-ray X-ray X-ray X-ray X-ray X-ray

X-ray

X-ray

Escherichia virus T4 H. sapiens

structures X-ray

ligands in PDB

T4 lysozyme (T4L) human apo- and holo-CRBP1

dozens of proteins

trypsin and ligands hundreds of protein complexes

hundreds of proteins glucoamylase Aspergillus niger

H. sapiens

hundreds of proteins cellular retinol-binding protein 1 (CRBP1)

hundreds of proteins

Campylobacter jejuni Zymomonas mobilis

H. salinarum uncultured bacterium

Bacillus licheniformis; Bos taurus; Carica papaya; Canavalia ensiformis, etc. B. taurus; H. sapiens

hundreds of proteins halorhodopsin carboxylesterase (EstSRT1) 2000 nonhomologous protein crystal structures sialyltransferase (CstII) tRNA-guanine transglycosylase (TGT)

multiple proteins

proteinase inhibitor (trypsin); serine proteases

12 diverse proteins

Table 1. continued comment

snapshots of molecular swivel in action; B-factors lower in disordered region 2′-deoxyribosyltransferase BpNDT X-ray structure with high B-factors in loops suggesting high mobility; synthesis of FDA-approved drugs

evaluation of overall scale of force constants in the elastic network model (ENM); importance of internal and rigid-body motions in B-factors machine learning for predicting factors for resistance to asparagine deamination; B-factor analysis metal swapping in exonuclease; low B-factors for Mg and Ca ions improving molecular replacement for solving phase problems in protein X-ray structures; role of B-factors thermostable L-asparaginase in leukemia treatment and food industry; B-factor analysis of thermophilic versus mesophilic L-asparaginases multiscale virtual particle-based ENM in polio virus structures; good B-factor prediction, but mismatches in areas of high B-factors

MD that includes interaction entropy is better in B-factor and RMSD computations protein−protein interactions; minimum covariance determinant method with theoretical B-factors as one of many determinants interesting elastic network model (ENM) study; fluctuation matching targeted for experimental B-factors in lysozyme correlation of dynamics derived from electron spin resonance (ESR) data and B-factors interaction of retinol with human CRBP-1; B-factor analysis shows that α-helix II and EF-loop are most flexible modeling ligand binding of ensemble of bound and unbound states; computed B-factor ratio of ligand to surrounding residues use of B-factors for sharpening electron microscopy maps and getting charge density from them

flexibility−rigidity index (FRI) for thermal fluctuation analysis; experimental versus predicted B-factors halorhodopsin with depopulated Cl-binding site and structural changes; B-factor changes greater substrate acceptance correlates with B-factors at active site discussion of B-factor distribution and entropy correlation between B-factors and MD computations in a sialyltransferase binding ability of tRNA-modifying enzyme with ordered and disordered side chains supported by B-factor analysis; drug design prediction of protein−protein interaction sites; B-factor analysis conformational changes by ligand binding in human retinol-binding protein; B-factors indicate higher rigidity upon binding computer-based ResQ for unified estimation of B-factors and residue error in protein structure prediction prediction of backbone dynamics; B-factor analysis starch-binding domain of glucoamylase; higher B-factors at residues surrounding the disulfide bond

metrics for protein backbone and side-chain flexibility; local dynamics and normalization with B-factors MD flexible fitting paradigm in cryo-electron microscopy; B-factors

proteins in different 2-phase ionic environments; peculiarities of surface structures and B-factors provide insight in protein transport

ref

175 176

174

173

170 171 172

169

168

167

165 166

164

162 163

160 161

31

158 159

153 103 154 155 156 157

152

151

150


a

X-ray X-ray X-ray X-ray

X-ray X-ray NMR X-ray

This table is not meant to be an exhaustive list; its purpose is to list representative studies.

tyrosine kinase inhibitor AG1478 and its cognate kinase peptide amidase cyclophilin 40 CheY-like proteins (CheY, NT-NtrC, and Spo0F) Stenotrophomonas maltophilia B. taurus E. coli, Salmonella typhimurium, Bacillus subtilis

Influenza A virus H. sapiens Thermochromatium tepidum Arabidopsis thaliana Spinacia oleracea Caldalkalibacillus thermarum Methanothermobacter thermautotrophicus human herpes virus 8 H. sapiens Drosophila melanogaster Pseudomonas aeruginosa, Acinetobacter baumannii, Klebsiella pneumoniae

B2 cap binding domain (PB2cap) integrin LH1−RC complex transceptor NRT1.1 PsbS and CP29 NDH-2 complexed with HQNO geranylgeranylglyceryl phosphate synthase

UNG−dsDNA L1ORF 1p NR5A3 three novel diazabicyclooctanes

X-ray

Pseudomonas diminuta

phosphotriesterase/Arylesterase X-ray X-ray X-ray X-ray X-ray X-ray homology

X-ray X-ray homology

Caenorhabditis elegans Ascaris suum five genera of marine molluscs

structures X-ray

organism Nostoc ellipsosporum; H. sapiens; E. coli

objectives

cyanovirin-N; calcium-loaded calmodulin; ribose-binding protein; maltodextrin-binding protein and a molecular chaperone U-box ubiquitin ligase (UFD-2) two thiolases Acat2 and Acat5 cytosolic malate dehydrogenases

Table 1. continued comment

B-factors of both protein and DNA increase in B-factors of two crystallized trimers of L1ORF 1p-coiled coil low B-factors for alpha6 helix as evidence against significant disorder antibiotics against resistant pathogens; low B-factors for inhibitors indicating full occupancy in binding pocket B-factors correlate with red edge excitation shift (REES) measurements; drug occupancy and B-factors combination of FRESCO, Rosetta, and B-factors MD-based analysis indicates correlation between B-factors and melting temperature of K308A mutant demonstration that the GNM-predicted B-factors compare well with the experimental data by using the PFM approach

B-factor helps to illuminate structural flexibility high B-factors indicate low occupancy and multiple orientations of CoA mobility differences noted through RMSF analysis mirror those obtained through measurement of B-factors B-factor for evaluating the conformational rearrangement; lower B-factor indicated a well-ordered, high-occupancy, and closed conformation exploring residues with increased flexibility or rigidity based on B-factors B-factor analysis revealed differences in loop dynamics between α1-I and α2-I B-factors of terminal regions indicate flexibility B-factors change upon nitrate binding B-factors increase by 78% upon minor monomer complexation relative to chain C, chain B structure has lower B-factors for protein and ligands residue-specific root-mean-square fluctuations (RMSF) read as B-factors; study of dynamic behavior

B-factor-based robotics-inspired algorithm for exploring protein conformational space

ref

192 193 194 195−199

188 189 190 191

182 183 184 185 186 187 187

181

178 179 180

177


lysozyme.214 A general and reliable approach has yet to be developed. Before analyzing examples of rational design and of directed evolution of protein thermostabilization, it is instructive to consider protein engineering methods in general (section 3.1).

3.1. Short Overview of Protein Engineering Methods

Rational protein design222−224 as a form of protein engineering, based on carefully chosen site-specific mutagenesis, has long been employed as a means to increase the thermostability and robustness of proteins.225−227 The results up to 2004 were succinctly summarized by V. G. H. Eijsink and co-workers,226 with emphasis on the different types of molecular effects which are still relevant today in ongoing rational design of protein thermostability. Ground-breaking research led to the realization that enhanced rigidity results in protein robustness, induced by mutations that introduce new intramolecular H bonds, salt bridges, and/or disulfide bonds. Rational design of disulfide bonds has been quite successful, the early work of M. Matsumura and B. W. Matthews,228 S. Kanaya et al.,229 and J. Clarke and A. R. Fersht230 being typical examples; other studies focusing on disulfide formation were reviewed by S. F. Betz.227 Mutations that induce additional hydrophobic core packing also increase stability.231,232 These and other rational approaches such as designed H-bond bridges or salt bridges on protein surfaces have provided excellent results.226 It thus seems that engineering thermostability of enzymes is easier than enhancing or reversing stereoselectivity or controlling substrate acceptance.77 However, the degree of improvement in stability by rational design is not always satisfactory. Consequently, some scientists have turned to directed evolution as an alternative, with fundamental and practical reasons being the motivation. Current research indicates that the fusion of rational design and directed evolution appears to be most successful,200−205 an approach that was actually first practiced by J. A. Wells and co-workers in 1985.233 In that unprecedented study, the oxidative stability of subtilisin against H2O2 was improved by rationally choosing residue Met222 as a randomization site for saturation mutagenesis (SM) with the aim of replacing oxygen-sensitive methionine by one of the other 19 canonical amino acids. Several mutants showed the desired effect, including variant Met222Ala, although with a significant trade-off in activity.233 Today, rationally chosen amino acid substitutions at hot spot residues for increasing thermostability, based on B-factor algorithms in combination with other programs, e.g., FoldX, RosettaDesign, proline theory,234 and the structure-guided consensus approach, constitute a powerful strategy. Nevertheless, it must be remembered that energy-based computations for predicting protein stability are likely to provide best results when aiming to improve thermodynamic stability, whereas the scenario in the case of kinetic stability is much more difficult. Table 2 contains a list of selected studies in which B-factors form the basis of rationally enhancing thermostability of proteins.

Figure 5. Free energy diagrams illustrating the two types of protein stability: (A) thermodynamic stability and (B) kinetic stability. Reprinted with permission from ref 213. Copyright 2017 American Chemical Society.

Table 2. Selected Recent Examples of Rational Engineering of Protein Thermostability Using B-Factor Based Mutational Design (sometimes using the B-FITTER computer aid)

| strategies | enzymes | organisms | mutations | results | ref |
|---|---|---|---|---|---|
| B-factor/Discovery Studio | aldolase (DERASEP) | Staphylococcus epidermidis | T120C, G174I, and G213C | showed surprisingly high tolerance to acetaldehyde | 235 |
| B-factor/RosettaDesign/packing analysis | lipase B | Candida antarctica | R249L | Tm was increased to 56.8 °C compared with WT (54.5 °C) by DSF | 236 |
| MODIP and DbD for disulfide design with B-factor | lipase B | C. antarctica | A162C-K308C (disulfide bond) | T5060 was increased by 8.5 °C compared with WT (54.5 °C) | 237 |
| MODIP and DbD with B-factor | polysaccharide monooxygenase (LPMO10C) | Streptomyces coelicolor | A143C-P183C, S73C-A115C | combination of these two pairs of disulfide bonds displayed 12 °C increase in Tm | 238 |
| B-factor/RosettaDesign | sucrose isomerase (AS9 PalI) | Serratia plymuthica AS9 | E175N/K576D | half-life was 7.65 times better than wild type at 45 °C | 239 |
| B-factor/FoldX | (R)-selective amine transaminase | Aspergillus terreus | T130M/E133F | half-life with 3.3-fold improvement at 45 °C | 240 |
| B-factor/FoldX | salicylic acid binding protein 2 (SABP2) | Nicotiana tabacum | A189S | half-life was increased by a factor of 3.4 at 60 °C | 241 |
| B-factor/proline theory/PoPMuSiC-2.1/sequence consensus approach | pullulanase | Bacillus acidopullulyticus | E518I/S662R/Q706P | 11-fold half-life improvement at 60 °C and a 9.5 °C increase in Tm | 242 |
| B-factor/proline theory | chondroitinase ABC I | Proteus vulgaris | E138P | E138P variant increased t1/2 and Tm to 18 min and 50 °C (wild type 48 °C), respectively | 243 |
| B-factor/RosettaDesign | xylanase | Bacillus circulans | N52Y | 5-fold improvement in half-life over the wild type at 50 °C | 244 |
| B-factor/B-FITTER | xylanase (AoXyn11A) | Aspergillus oryzae | substitutions of the core region 93GTYNPGSGG101 | optimum temperature increased by about 15 °C; half-life with 4.6-fold improvement at 50 °C | 245 |
| B-FITTER/WHAT IF/Rosetta | firefly luciferase | Photinus pyralis | D474K/D476N | destabilization of D474K mutant | 246 |


Directed evolution of enzymes resulting in enhanced thermostability and robustness toward hostile organic solvents200−204 has helped to eliminate a few traditional limitations of enzymes as catalysts in organic chemistry and biotechnology. As already pointed out, thermodynamic and kinetic stability need to be considered. Some researchers in directed evolution205 and in rational design226 of protein robustness have indeed emphasized the distinction between thermodynamic stability, which refers to resistance to unfolding (as given by ΔG = ΔH − TΔS), and kinetic stability, which correlates with resistance to degradation and maintained reaction efficiency (activity).205,226 With a view on practical applications, it was stated early on that “for most industrial enzymes the only stability parameters that are relevant and assessable relate to kinetic (not thermodynamic) stability”,226 an opinion that accompanies more recent work in directed evolution as well.205 However, this may be debatable.

Directed evolution involves gene mutagenesis, expression, and screening (or selection), generally in recursive cycles for stepwise maximizing thermostability or other properties. In an early contribution, H. Liao, T. McKenzie, and R. Hageman used a mutator strain for random mutagenesis and readily achieved a stepwise (two-cycle) enhancement of the thermostability of kanamycin nucleotidyl transferase by 23 °C.247 The clearly Darwinian character of this process was not generalized at the time to include other enzymes. A key contribution is due to F. H. Arnold and co-workers, who showed that multiple cycles of error-prone polymerase chain reaction (epPCR) as applied to the protease subtilisin E induce enhanced tolerance to a hostile organic solvent (dimethylformamide).248 Importantly, when wild-type (WT) and best mutant were used in standard aqueous conditions lacking the organic solvent, activities were identical. This means that activity under standard aqueous conditions was not increased. The Arnold group also evolved thermostability using epPCR followed by saturation mutagenesis (SM) at the identified hot spots.249 Indeed, epPCR, SM, DNA shuffling, or combinations thereof were shown to be successful in boosting the thermostability of many different types of enzymes.200−202 For protein thermostabilization, these approaches are still being used today.203,250−260 In many but not all studies, gains in robustness have essentially no effect on activity or stereoselectivity. However, when attempting to improve all three parameters, trade-offs occur, an unfortunate phenomenon that is fairly common in directed evolution or rational design, irrespective of the mutagenesis strategy (see section 3.3.6 for more details). It often prevents industrial applications. Relevant is the discussion on the correlation between protein robustness and evolvability.261−269

For practical and fundamental reasons, thermostability should not be ignored when evolving stereo- and/or regioselectivity, although in the past this has often happened. The bottleneck is the screening step for assessing stereoselectivity because such analytical tools as surface displays, microfluidic devices, and fluorescence-activated cell sorting (FACS) cannot be used to determine this important parameter.270−273 Therefore, the development of methods for the creation of small high-quality (“smart”) mutant libraries is crucial for the advancement of directed evolution, as emphasized in several reviews.274−283 As will be seen in sections 3.2.1−3.2.2 and 3.3.1−3.3.9, this also applies to directed evolution of enzyme robustness based on saturation mutagenesis (SM). Indeed, methodology development continues to stand at the heart of directed evolution research.

As already delineated, SM involves amino acid randomization at a defined site, composed of either a single residue (in this case often called site-selective mutagenesis) or more than one residue, in which case combinatorial selections are theoretically possible, e.g., many different double or triple mutants. It can be applied to manipulate very different protein parameters, including stereo- and regioselectivity, thermostability, and activity. When focusing on activity, stereoselectivity, and/or regioselectivity, the combinatorial active-site saturation test (CAST) has emerged as a generally successful strategy.284−288 The term CAST is a useful acronym to distinguish randomization at the binding pocket, aimed at activity and selectivity, from SM at remote sites for other purposes such as thermostabilization (B-FIT). It involves the systematization of previous sporadic examples of SM at sites lining the binding pocket.289 In the CAST generalization, a site can be composed of 1, 2, 3, or more amino acid positions (residues) (Figure 6A).

Figure 6. (A) Systematization of sites lining the binding pocket for saturation mutagenesis (SM) in the quest to enhance activity and improve or reverse stereo- and/or regioselectivity (CASTing).287,288 (B) Illustration of iterative saturation mutagenesis (ISM) that can be used in CASTing for manipulating stereo- and/or regioselectivity and in the B-FIT method for increasing thermostabilization (see sections 3.2 and 3.3).76,77 Reprinted with permission from ref 296. Copyright 2016 John Wiley and Sons.
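The screening effort implied by the size of a randomization site, and the number of ISM pathways implied by the number of sites, can be estimated with simple arithmetic. The following is a minimal sketch and not the CASTER implementation: it assumes that every codon combination is equally likely (no amino acid bias) and uses the common estimate T = ln(1 − F)/ln(1 − 1/V) for the number of transformants T needed to cover a fraction F of V codon combinations. The function names are hypothetical and chosen only for illustration.

```python
import math

# Number of codons encoded by each degeneracy (NNK: 32 codons, NDT: 12 codons).
CODONS_PER_POSITION = {"NNK": 32, "NDT": 12}


def library_size(degeneracy: str, positions: int) -> int:
    """Codon combinations for a site with the given number of positions."""
    return CODONS_PER_POSITION[degeneracy] ** positions


def transformants_needed(variants: int, coverage: float = 0.95) -> float:
    """Transformants to screen for the requested coverage, assuming no bias.

    log1p keeps the denominator accurate for very large libraries.
    """
    return math.log(1.0 - coverage) / math.log1p(-1.0 / variants)


def ism_pathways(sites: int) -> int:
    """Number of ISM pathways when each of the sites is visited exactly once."""
    return math.factorial(sites)


if __name__ == "__main__":
    for positions in (1, 2, 3):
        for degeneracy in ("NNK", "NDT"):
            size = library_size(degeneracy, positions)
            needed = round(transformants_needed(size))
            print(degeneracy, positions, size, needed)
    print([ism_pathways(n) for n in (2, 3, 4)])
```

Under these assumptions the estimate grows roughly threefold faster than the library size itself, which is why sites with more than two or three positions quickly become impractical with NNK and why smaller alphabets such as NDT are attractive.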

If the initial SM libraries contain improved enzyme variants but the degree of improvement in terms of activity, selectivity, and/or thermostability does not yet suffice for practical applications, iterative saturation mutagenesis (ISM) can be applied.287,288 Accordingly, the best mutant in a given library is used as a template for SM at another site and so on. In the case of two randomization sites A and B, two different evolutionary pathways are theoretically possible. When grouping individual residues into three such sites A, B, and C, 6 pathways are possible, and in the case of four sites A, B, C, and D, 24


trajectories are relevant (Figure 6B). CAST/ISM has been applied in numerous studies aimed at enhancing or reversing stereo- and regioselectivity as well as increasing activity.287,290,291 The user-friendly computer aid CASTER is available free of charge on the Reetz homepage.292 It contains straightforward instructions on how to calculate the degree of oversampling necessary for 95% or any other % coverage of a given SM library, based on the Patrick/Firth algorithm.293,294 It assumes the absence of amino acid bias, which in reality always occurs. This traditional problem has recently been addressed by implementing high-fidelity on-chip solid-phase gene synthesis followed by efficient gene assembly for combinatorial mutant library construction.295 This novel advancement has yet to be applied to the thermostabilization of enzymes. The amount of oversampling for 95% library coverage increases astronomically as the number of residues in a randomization site increases beyond 2 or 3 irrespective of the catalytic parameter to be engineered, especially in the case of NNK codon degeneracy encoding all 20 canonical amino acids (Table 3, left).287 For this reason, reduced amino acid alphabets

were introduced, meaning the utilization of fewer than the 20 canonical amino acids as building blocks.76,77,284,287,288,296−298 One of many variations is NDT codon degeneracy encoding 12 amino acids (Phe, Leu, Ile, Val, Tyr, His, Asn, Asp, Cys, Arg, Ser, Gly) (Table 3, right).287 Even single or triple codes involving only one or three semirationally chosen amino acids as combinatorial “building blocks”, respectively, can be chosen.299−302 Two strategies for applying different codon degeneracies can be considered when applying SM for optimizing stereo- and regioselectivity287,288 or, as will be shown below, enhancing thermostability:76,77 (1) Assign one and the same codon degeneracy corresponding to a defined reduced amino acid alphabet as combinatorial building blocks for SM at a multiresidue randomization site or (2) assign a different codon degeneracy at each position of a multiresidue randomization site also in a single SM experiment (Scheme 1).298,299,301−303 In both cases adequate criteria for choosing codon degeneracies must be considered, but this applies all the more to strategy 2 which requires more specific local information.298 If necessary, ISM can then be invoked. In most cases not more than about 3000 transformants need to be screened in a given project.

Scheme 1. Two Different Strategies for Applying Reduced Amino Acid Alphabets as Combinatorial Building Blocks in SM As Defined by Rationally Chosen Codon Degeneracies298,299,301−303 a

a Reprinted with permission from ref 298. Copyright 2016 John Wiley and Sons.

Table 3. Oversampling Necessary for 95% Coverage as a Function of NNK and NDT Codon Degeneracy (assuming the absence of amino acid bias)287 a

| no. of amino acid positions at one site | NNK codons | NNK transformants needed | NDT codons | NDT transformants needed |
|---|---|---|---|---|
| 1 | 32 | 94 | 12 | 34 |
| 2 | 1024 | 3066 | 144 | 430 |
| 3 | 32 768 | 98 163 | 1728 | 5175 |
| 4 | >1.0 × 10^6 | >3.1 × 10^6 | 20 736 | 62 118 |
| 5 | >3.3 × 10^7 | >1.0 × 10^8 | 248 832 | 745 433 |
| 6 | >1.0 × 10^9 | >3.2 × 10^9 | >2.9 × 10^6 | >8.9 × 10^6 |
| 7 | >3.4 × 10^10 | >1.0 × 10^11 | >3.5 × 10^7 | >1.1 × 10^8 |
| 8 | >1.0 × 10^12 | >3.3 × 10^12 | >4.2 × 10^8 | >1.3 × 10^9 |
| 9 | >3.5 × 10^13 | >1.0 × 10^14 | >5.1 × 10^9 | >1.5 × 10^10 |
| 10 | >1.1 × 10^15 | >3.4 × 10^15 | >6.1 × 10^10 | >1.9 × 10^11 |

a Reprinted with permission from ref 287. Copyright 2011 John Wiley and Sons.

The primary focus in this part of the review is on SM/ISM, which has been utilized many times as a means to enhance the robustness of proteins according to the B-FIT approach.76,77 This requires the identification of appropriate (“hot”) residues, which are chosen on the basis of structural information provided by B-factors and/or computational guides. They can be grouped into multiresidue randomization sites, analogous to CASTing for enhanced stereoselectivity (Figure 6), followed by respective SM/ISM experiments, but single-residue sites can also be used. In order to identify the “hot” residues, the most flexible residues are first identified as shown by the highest B-factors provided by X-ray data.76,77 In practice, B-FITTER, a user-friendly computer aid available free of charge on the Reetz homepage,292 automatically averages the B-factors of the atoms of all residues (amino acids) and ranks them from highest to lowest B-factors.77 Since screening may be the bottleneck in a given case, the numbers problem in directed evolution can be addressed by using only a limited number of randomization sites when applying ISM in the B-FIT approach. In such cases residues characterized by the highest B-factors are chosen. If the crystal structure is not (yet) available, consensus data and/or molecular dynamics (MD) computations, revealing highest points of flexibility, can be utilized.76,304−306 However, this is less reliable. As already pointed out in the Introduction, B-factors have also been calculated with reasonable reliability,32 but they should nevertheless be viewed with caution. Guidelines for applying ISM in general have been issued, be it for enhancing thermostability or manipulating stereoselectivity.76,77,307,308 Analyses of several typical B-FIT studies follow in sections 3.2.1−3.2.2. It will also be shown that multiparameter genetic optimization incorporating enantioselectivity, activity, and thermostability is possible in a simultaneous manner,303 which differs fundamentally from the traditional sequential approach.274,276−283,309

In the vast literature on protein engineering of thermostabilization, different procedures for measuring improved thermostability relative to WT have been used by different groups: Half-lives at elevated temperatures, Residual activities following heat treatment,


Melting temperatures Tm determined by differential scanning microcalorimetry (DSM) or circular dichroism (CD), Temperatures at which 50% activity is lost following a defined time lapse as given in x min in the form of T50x. Whenever melting temperatures (Tm) are used to characterize stability, researchers should state whether the data are derived from reversible or nonreversible processes (which unfortunately is not routinely done). In the case of nonreversible processes, the term “apparent Tm values” is appropriate. The fact that many different techniques for measuring thermostability are commonly used makes it difficult to compare two or more protein engineering approaches, especially when different enzymes are used in the respective model studies. When comparing the stability of a mutant protein with that of its WT, the Gibbs free energy (ΔΔG‡) of activation for activity loss is also employed by some researchers, which is based on the ratio of the respective two half-lives241 according to eq 8

$$\Delta\Delta G^{\ddagger} = RT \ln\left(\frac{t_{1/2}\ \text{of mutant}}{t_{1/2}\ \text{of wild type}}\right) \qquad (8)$$

where R is the gas constant and T is the temperature at which the half-life t1/2 was measured. For example, an 11-fold longer half-life measured at 60 °C (333 K) corresponds to ΔΔG‡ ≈ 1.6 kcal/mol. In those cases in which irreversible defolding is involved, aggregation and precipitation can be expected, irrespective of the mutagenesis method. As shown in the following sections, B-FIT is a viable method, but it is not the only option for making proteins more robust to harsh conditions such as elevated temperatures and hostile organic solvents.218,222−227 Specific examples include B. Matthews’ study of T4 lysozyme, in which 7 rationally designed point mutations were combined stepwise, leading to an increase in the melting temperature by 8.3 °C but a decrease in activity,310 and F. X. Schmid’s work on a phage-display-based method called PROSIDE.311

3.2. Initial Case Studies of the B-FIT Approach to Protein Thermostabilization

Figure 7. Eight randomization sites in Lip A chosen for saturation mutagenesis (SM).76 Picture displays a structural model based on the crystal structure of the lipase:312 Library A (dark blue), library B (red), library C (dark green), library D (violet), library E (brown), library F (light blue), library G (light green), library H (yellow). Ser77 is the catalytically active residue undergoing nucleophilic addition to the ester carbonyl function. Reprinted with permission from ref 76. Copyright 2006 John Wiley and Sons.

Scheme 2. Results of ISM in the Stabilization of the Lipase from B. subtilis (Lip A) Using the B-FIT Method a

a Reprinted with permission from ref 76. Copyright 2006 John Wiley and Sons.

3.2.1. Lipase from B. subtilis (Lip A). The first case study describing B-FIT concerned thermostabilization of the lipase from B. subtilis (Lip A),76,77 composed of 181 amino acids and previously characterized by X-ray crystallography.312 Ten residues displaying the highest average B-factors were first identified by B-FITTER: Arg33 (B-factor 51); Lys69 (44); Gln164 (41); Asp34 (40); Lys112 (40); Lys35 (39); Met134 (39); Tyr139 (38); Ile157 (37); Gly13 (37). The 10 residues were then grouped into 8 potential randomization sites A, B, C, D, E, F, G, and H (Figure 7) and randomized by NNK-based SM.76 When randomizing the single-residue sites, only a single 96-well microtiter plate sufficed in each experiment for 95% library coverage (see Table 3), but in the case of the 3-residue randomization site B, such coverage would require the screening of 100 000 transformants, which was not attempted at the time. Only 3000 transformants were assessed. Fast screening of activity after a defined heat treatment was performed using a UV−vis plate reader monitoring the Lip A-catalyzed hydrolysis of p-nitrophenyl caprylate, which releases the yellow UV−vis-active p-nitrophenolate anion. In this initial step, a 15 min heat treatment was applied, enabling determination of T5015 values, the temperature required to reduce the initial enzymatic activity by 50% within this time span. Six of the libraries harbored improved variants (Scheme 2). The two top hits were then re-examined for determination of T5060 values by repeating the heat treatment for 60 min (variant

X, T5060 = 89 °C; variant XI, T5060 = 93 °C). For variant XI, ΔΔG‡ proved to be 4.0 kcal/mol.76 In order to gain insight into the possible origin of the increase in apparent robustness, a special deconvolution strategy was applied with generation of a complete fitness pathway landscape featuring all theoretically possible 120 trajectories leading from wild-type Lip A to mutant XI.313 It revealed a number of cooperative mutational effects (more than additive)314 acting between individual mutated residues and sets of residues in all evolutionary pathways.314 In a subsequent theoretical study of Lip A variant XI, molecular dynamics (MD) identified the creation of an extended H-bond system stretching across the


surface of the enzyme as a result of mutagenesis (Figure 8).313 The newly formed intramolecular H bonds were originally thought to prevent defolding upon heat treatment. Related is the milestone study of G. M. Bradbrook and co-workers in which MD computations were harnessed and cross-checked with experimental B-factors, thereby revealing transient hydrogen bonds in protein ligand recognition.315 Whatever the precise mechanism of stabilization may be in this case, resistance to undesired aggregation and precipitation under operating conditions may be involved even if the protein fold remains essentially unchanged. Indeed, subsequent detailed biophysical and biochemical characterization of the best Lip A variants, which included protein NMR spectroscopic studies of variant XI, circular dichroism measurements, X-ray structural analyses, and combining thermal inactivation profiles, led to the discovery of an intriguing surface effect.316 Upon heat treatment, WT Lip A undergoes irreversible aggregation and precipitation while maintaining the complete natural fold, whereas such undesired precipitation of the “robust” variant X does not occur to any significant degree until higher temperatures are applied. The NMR study at room temperature of the thermally treated native WT and evolved variant, each 15N-labeled, revealed that the recovered enzyme has essentially the same natural conformation, but WT and the best mutant appear to have different activities due to different degrees of aggregation/precipitation.316 After heat treatment at different temperatures, variant XI has a conformation which is almost identical to the heat-untreated enzyme, as shown by peaks in 1D 1H and 2D [15N,1H]-HSQC spectra (Figure 9).316 Using the same lipase and epPCR as the mutagenesis method, N. M. Rao and co-workers evolved a variant which showed the same physical effect, namely, less undesired aggregation/precipitation as a result of point mutations.317,318 In terms of practical applications, it is possible that the latter effect may go unnoticed in the absence of additional characterization. In this connection it is interesting to note that optimal mutations in the complementarity-determining regions of antibodies prevent domain aggregation.319

Figure 8. Extended network of newly formed H bonds and salt bridges on the surface of Lip A mutant XI as revealed by MD computations. Reprinted with permission from ref 313. Copyright 2009 John Wiley and Sons.

Figure 9. NMR spectra recorded for native and thermally treated 15N-labeled Lip A mutant XI. (A) 1D 1H spectra. (B−D) 2D [15N,1H]-HSQC spectra: (B) Native, (C) recovered after 60 °C treatment, and (D) recovered after 80 °C treatment. Reprinted with permission from ref 316. Copyright 2012 John Wiley and Sons.

3.2.2. Epoxide Hydrolase from A. niger (ANEH). In another early B-FIT study, the epoxide hydrolase from A. niger (ANEH) was subjected to SM/ISM.320 Previously, the same enzyme served as the experimental platform for developing ISM with the aim of enhancing enantioselectivity in the hydrolytic kinetic resolution of the epoxide 4 from E = 4 to E = 115 in favor of (S)-5 (Scheme 3).321

Scheme 3. Model Reaction Used in the B-FIT Study of the Epoxide Hydrolase from A. niger (ANEH).320

WT ANEH is a 396-residue enzyme and

has a T5015 value of 50 °C (T5060 = 46 °C). Its crystal structure has been determined at 1.8 Å resolution, which shows a dimer with a typical α/β-hydrolase fold.322,323 The purpose of the B-FIT investigation was 2-fold: (1) to enhance thermostability and (2) to explore several evolutionary pathways in order to learn more about the efficacy of ISM in general, especially with the aim of developing strategies for escaping from local minima on the “unconstrained” fitness pathway landscape. The same substrate rac-4 (Scheme 3) was used, as in the earlier study.321 As a convenient pretest for identifying active mutants, the Reymond adrenaline assay was employed.324−327 Hits were then isolated and characterized more precisely for thermostability. First, B-FITTER was used to identify the 50 most flexible amino acids, the top 20 in the form of 10 potential 2-residue randomization sites then being initially considered. In addition, 4 crystallographically unresolved and possibly likewise flexible residues, 321, 322, 326, and 327, which are about 15−20 Å away from the catalytically active Asp192, were also included. In order to keep the experimental SM and screening effort at a reasonable level, 8 of the residues having the highest B-factors as well as the 4 unresolved residues were grouped into 6 randomization sites A−F, each consisting of two amino acid positions: A (residues 321/322) and B (residues 326/327) as crystallographically unresolved sites and C (residues 220/226), D (residues 229/230), E (residues 26/27), and F (residues 217/350) as sites displaying highest average B-factors (Figure 10).

Figure 10. Location of randomization sites A−F of the epoxide hydrolase from A. niger (ANEH): A (321/322) and B (326/327) as crystallographically unresolved sites and C (220/226), D (229/230), E (26/27), and F (217/350) as sites displaying highest B-factors. Reprinted with permission from ref 320. Copyright 2011 John Wiley and Sons.

Mechanistically, the substrate is bound and activated by residues Tyr251 and Tyr314 via H bonds to the epoxide O atom, while Asp192 is the catalytically active nucleophile which attacks the reactive C atom of the epoxide substrate in the rate-determining step.322 All six 2-residue sites were randomized using NDT codon degeneracy encoding 12 amino acids, many leading to notably improved mutants. Then several of the theoretically possible evolutionary pathways were explored in ISM experiments, leading to the best variant GUY-066 in the trajectory WT → D → E → C → F → A → B with a 21 °C increase in the T5060 value, an 80-fold increase in half-life at 60 °C, and an inactivation energy of 71 kcal/mol (44 kcal/mol increase relative to WT).320 Inactivation energies of the variants were measured by using Arrhenius plots of the rate of thermal inactivation at different temperatures versus the reciprocal of the absolute temperature. Improved variant GUY-066 has the mutations [A321L/S322C]-[G326C/A327G]-[A220C/S226G]-[S229E]-[E26Y/Q27L]-[A217C]. Further ISM exploration did not provide superior enzymes, although some of the new variants showed an increase in T5060 by 10−14 °C, 20−30-fold improvement of half-lives at 60 °C, and 15−20 kcal/mol enhancement in inactivation energy.320

In several pathways, a library at a certain point failed to contain improved variants, pointing to a local minimum on the fitness landscape (“dead end”). Such events are known to occur in directed evolution in general, irrespective of the mutagenesis method.274,276−283,309 A counterintuitive, yet successful strategy for escaping from local minima was developed and applied in the ANEH study.320 This was accomplished by picking a nonimproved or even inferior mutant along a given pathway and using it as a template in the subsequent ISM step. As part of curiosity-driven experiments, a neutral variant showing no improvement and even an inferior variant displaying lower thermostability, identified in an earlier study, were also employed as starting templates in an ISM scheme. In all cases the neutral or inferior templates led to variants displaying improved thermostability (Scheme 4).320

Scheme 4. Arbitrarily Chosen Portion of the Theoretical ISM Scheme of ANEH Mutants As Catalysts in the Hydrolytic Reaction of Epoxide 4 Featuring Four Evolutionary Pathways Characterized by Steep and Flat Local Minima a

a

In all library constructions, NDT codon degeneracy was used, except when randomizing at site D, in which case NNG (G: guanine) codon degeneracy was applied, encoding the 8 amino acids not covered by NDT. Reprinted with permission from ref 320. Copyright 2011 John Wiley and Sons.
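The codon degeneracies named in this section can be checked by expanding them against the standard genetic code. The sketch below is illustrative only and is not taken from the cited studies; the function names are hypothetical, and only the IUPAC symbols needed here (N, D, K, R, and the four bases) are included. It lists how many codons each degeneracy contains and which amino acids it encodes, and it confirms that the 8 amino acids missing from NDT are all reachable with NNG.

```python
from itertools import product

# IUPAC nucleotide symbols used in this section (subset only).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "AG", "K": "GT", "D": "AGT", "N": "ACGT"}

# Standard genetic code in TCAG order; "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def expand(degenerate_codon):
    """All concrete codons described by a degenerate codon such as 'NDT'."""
    return ["".join(p) for p in product(*(IUPAC[s] for s in degenerate_codon))]


def encoded(degenerate_codon):
    """Set of one-letter amino acids (and '*' for stop) that are encoded."""
    return {CODON_TABLE[c] for c in expand(degenerate_codon)}


if __name__ == "__main__":
    all20 = set(AMINO) - {"*"}
    for deg in ("NNK", "NDT", "NNG", "RRK"):
        aas = encoded(deg)
        stop = " + stop" if "*" in aas else ""
        print(deg, len(expand(deg)), "codons:", "".join(sorted(aas - {"*"})) + stop)
    print("missing from NDT:", "".join(sorted(all20 - encoded("NDT"))))
```

Such a check is useful when choosing a reduced alphabet, because a degeneracy may also encode residues or a stop codon beyond those it was chosen for.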

In addition to the B-FIT experiments, alternative strategies were tested for the purpose of comparison. For example, several mutant libraries were created with the aim of introducing salt bridges or removing possible unfavorable electrostatic interactions. Residues Arg36, Ala41, Gln49, Phe54, Ser58, Leu61, Asn82, Glu104, Gln129, Gln175, and Glu292 were chosen for SM using RRK codon degeneracy (R = adenine/guanine), which encodes acidic (Asp, Glu) and basic (Arg, Lys) amino acids in addition to Asn, Ser, and Gly. Unfortunately, improved variants were not found in these “rational” libraries.320 It is known that the N- and C-termini of protein chains often adopt flexible conformations, and indeed, several examples are known in which shortening the termini and/or anchoring them enhances thermostability.328,329 Therefore, in the same study,320 several SM experiments were performed at residues Lys3 and Glu13 (N-terminus) as well as Asp107, Trp396, Val109, and Val392 (C-terminus) using NDT codon degeneracy. However, improvements were not achieved.320,330 A thorough theoretical analysis of the origin of thermostabilization has not been made to date. Parenthetically, the best thermostabilized ANEH mutants do not show any substantial influence on stereoselectivity. When attempting to combine the B-FIT mutations with those of CAST, satisfactory results were not achieved in this


3.3.1. B-FIT Supported by PoPMuSiC: Type A Feruloyl Esterase (AuFaeA) from Aspergillus usamii. In a study with potential industrial application, B-FIT was applied to the mesophilic type A feruloyl esterase (AuFaeA) from A. usamii.361 Background of this study was the need for efficient enzymatic conversion of pectinaceous biomasses of the kind potato and sugar beet pulp at high temperatures which ensures lower substrate viscosity, easier mixing, and increased substrate solubility. Enzymes of this kind catalyze the hydrolysis of ester bonds between polysaccharides and phenolic acid compounds in xylan side chains. The complete degradation of xylan requires several hydrolytic enzymes, feruloyl esterase playing a crucial role. This esterase can also be used in the deconstruction of hemicellulose in biofuel production, pharmaceutical synthesis, and the paper, animal feed, and food industries. Hyperthermostable forms are known, e.g., from Thermoanaerobacter tengcongensis, but activity in the desired transformation proved to be somewhat low.362 Upon applying B-FITTER flanked by application of the PoPMuSiC algorithm348 for predicting hot spots for stability design, residues Ser33 and Asn92 were chosen for SM.361 Since a crystal structure of AuFaeA was not available, a reasonable homology model was generated using the MODELER 9.9 program363 based on the X-ray structure of the homologue AnFaeA (PDB code 1UWC) published by P. C. K. Lau and co-workers.362 Then the B-factors of 10 residues of AnFaeA were identified via B-FITTER292 as shown in Figure 11

particular case due to trade-offs either in stability or in enantioselectivity.330 Combining by shuffling could be more effective, which has yet to be tested. It has also been noted that evolvability is favored when starting with stable enzymes or mutants.261,262,265,331 Consequently, in order to evolve truly practical ANEH variants, it would be logical to start with the thermostabilized variant GUY066 as the template when applying CAST/ISM for selectivity/ activity optimization. Alternatively, performing CAST/ISM and B-FIT/ISM simultaneously as demonstrated in the case of limonene epoxide hydrolase may be the superior strategy (see section 3.3.6).303 In summary, a highly thermostabilized variant of ANEH was evolved by the B-FIT method based on semirational design of mutant libraries in an ISM manner. Moreover, it was demonstrated for the first time that in a sequence of mutagenesis steps, in this case ISM, escaping from a local minimum is possible by resorting to a nonimproved or even inferior mutant for continuing the evolutionary upward climb. This technique was later shown to be effective in other ISM systems, indicating that many pathways lead to improved enzyme variants.332 The somewhat surprising phenomenon of utilizing a nonimproved or even an inferior template to escape from a local minimum may be related (but not identical) to M. Eigen’s hypothesis of quasispecies in natural Darwinian evolution,333 as postulated by B. Mannervik in directed evolution studies of glutathione transferase.334−337 It is also related to the phenomenon of neutral drift.338,339 Although experimentally successful,320 the origin of enhanced robustness of ANEH still needs to be elucidated by a theoretical MD-based analysis. 3.3. Further B-FIT Studies Incorporating Additional Techniques

Since the initial reports of the B-FIT strategy,76,77 a number of enzyme thermostabilization studies based on this approach have appeared, most of them being enriched by further techniques such as the consensus approach which had been applied earlier for thermostabilization without explicitly considering B-factors.340−344 Notable cases of the symbiosis of B-FIT with consensus analysis appeared later.204,345 For example, U. T. Bornscheuer and co-workers, aiming for the thermostabilization of the esterase from Pseudomonas fluorescens, designed a clever sequence of mutagenesis experiments using B-FITTER for identifying residues with high B-factors coupled with consensus data which suggested additional targeting of surface positions.345 Following a final round of ISM, a variant was evolved showing a gain in Tm of 9 °C. The authors stated that all Tm measurements in their study refer to irreversible deactivation. In principle, computational aids such as the Rosetta algorithms29,346,347 can also be used. Some of these techniques can increase the efficacy of B-FIT in directed evolution or the utilization of B-factors in rational design (section 3.1), although exceptions are known (see section 3.3.8). Other computational aids such as PoPMuSiC,348,349 the Damborsky algorithm HotSpot Wizard,350 CUPSAT,351 CAVER,352 and FireProt25 are alternatives, but not all of them have been tested in protein thermostabilization together with B-factor data.25 Sometimes researchers focus SM on presumably flexible sites without considering the actual B-factors of the WT enzyme,353−360 but physically this essentially amounts to B-FIT. In sections 3.3.1−3.3.6, typical studies are presented in which B-FIT was supported by additional computational techniques.

Figure 11. B-factors of residues in homologous enzyme AnFaeA (PDB code 1UWC) as obtained by application of B-FITTER. Reprinted with permission from ref 361. Copyright 2015 Springer Nature.

Table 4. Residues Extracted from Figure 11 (ref 361)a

residue   residue sequence no.   B-factor value   rank
Val       261                    21               1
Gln       241                    20               2
Ser       33                     20               3
Asn       92                     20               4
Asp       93                     19               5
Val       243                    17               6
Asp       124                    17               7
Asp       230                    17               8
Glu       203                    16               9
Pro       32                     16               10

a Reprinted with permission from ref 361. Copyright 2015 Springer Nature.
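The kind of ranking shown in Table 4 can be illustrated with a minimal Python sketch. It assumes that a per-residue value is obtained by averaging the temperature-factor column over all ATOM records of a residue in a PDB-format file; whether B-FITTER uses exactly this averaging is an assumption here, and the helper `atom_line` and the demo B-values are invented for illustration only.

```python
from collections import defaultdict


def atom_line(serial, atom, res_name, chain, res_seq, b_factor):
    """Build a fixed-column PDB ATOM record (hypothetical helper for the demo)."""
    return (
        f"ATOM  {serial:>5} {atom:<4} {res_name:>3} {chain}{res_seq:>4}    "
        f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{b_factor:6.2f}"
    )


def residue_b_factors(pdb_lines):
    """Average the B-factor (columns 61-66) over the atoms of each residue."""
    totals = defaultdict(lambda: [0.0, 0])
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        # Key: chain ID (col 22), residue number (cols 23-26), residue name (cols 18-20)
        key = (line[21], int(line[22:26]), line[17:20].strip())
        totals[key][0] += float(line[60:66])
        totals[key][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


def rank_by_b_factor(averages):
    """Sort residues from highest to lowest averaged B-factor."""
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


# Made-up demo records; real input would be the lines of a PDB file such as 1UWC.
demo = [
    atom_line(1, "N", "SER", "A", 33, 19.0),
    atom_line(2, "CA", "SER", "A", 33, 21.0),
    atom_line(3, "N", "ASN", "A", 92, 18.0),
    atom_line(4, "CA", "ASN", "A", 92, 20.0),
    atom_line(5, "N", "PRO", "A", 32, 16.0),
]
ranking = rank_by_b_factor(residue_b_factors(demo))
for rank, ((chain, number, name), b_value) in enumerate(ranking, start=1):
    print(rank, name, number, chain, f"{b_value:.1f}")
```

For a real structure, the list `demo` would simply be replaced by the lines read from the coordinate file.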


DOI: 10.1021/acs.chemrev.8b00290 Chem. Rev. XXXX, XXX, XXX−XXX


Table 5. Analysis of Codons for Individual SM at Positions Ser33 and Asn92 (ref 361)a

degenerate codon (b)   reverse complement codons (b)   no. of codons   no. of amino acids   no. of stops   amino acids encoded   colonies for 95% coverage (1 position) (c)
GNS/BBG                SNC/CVV                         8/9             5/8                  0/0            ADEGV/ARGLPSWV        22/25
TDS/TDC                SHA/GHA                         6/3             5/3                  1/0            CLFWY/CFY             16/7
AHR/CAS                YDT/STG                         6/2             4/2                  0/0            IKMT/QH               16/4
CVT/AHR                ABG/YDT                         3/6             3/4                  0/0            RHP/IKMT              7/16
MAS/GAS                STK/STC                         4/2             4/2                  0/0            NQHK/DE               10/4

a Reprinted with permission from ref 361. Copyright 2015 Springer Nature. b N = A/C/G/T; B = C/G/T; D = A/G/T; S = G/C; H = A/C/T; R = A/G; V = A/C/G; M = A/C; Y = C/T; K = T/G. c Number of colonies to be screened for 95% coverage (oversampling) when two or three amino acid positions at a given site are randomized using a specific degenerate codon.
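The codon bookkeeping in Table 5 (number of codons, encoded amino acids, stop codons, and reverse complements) can be reproduced with a short Python sketch. It assumes the standard genetic code and the IUPAC ambiguity codes listed in footnote b; the function names are illustrative and are not taken from CASTER or any other tool mentioned in the text.

```python
from itertools import product

# IUPAC nucleotide codes as given in footnote b of Table 5
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "ACGT", "B": "CGT", "D": "AGT", "S": "CG", "H": "ACT",
    "R": "AG", "V": "ACG", "M": "AC", "Y": "CT", "K": "GT",
}
# Complement of each code (e.g. B = C/G/T pairs with V = G/C/A)
COMPLEMENT = {
    "A": "T", "T": "A", "C": "G", "G": "C", "N": "N", "S": "S",
    "B": "V", "V": "B", "D": "H", "H": "D",
    "R": "Y", "Y": "R", "M": "K", "K": "M",
}
# Standard genetic code in TCAG order; "*" marks a stop codon
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def expand(degenerate):
    """List every concrete codon covered by a degenerate codon."""
    return ["".join(c) for c in product(*(IUPAC[base] for base in degenerate))]


def analyze(degenerate):
    """Return (number of codons, encoded amino acids sorted A-Z, number of stops)."""
    encoded = [CODON_TABLE[codon] for codon in expand(degenerate)]
    amino_acids = "".join(sorted(set(encoded) - {"*"}))
    return len(encoded), amino_acids, encoded.count("*")


def reverse_complement(degenerate):
    """Reverse complement of a degenerate codon, e.g. for the antisense primer."""
    return "".join(COMPLEMENT[base] for base in reversed(degenerate))


for codon in ("GNS", "BBG", "TDS", "NNK"):
    print(codon, reverse_complement(codon), analyze(codon))
```

For GNS, BBG, and TDS this reproduces the codon counts, amino acid sets (listed alphabetically), stop counts, and reverse complements of the first two rows of Table 5; NNK is included only as the familiar 32-codon reference case.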

(and numerically in Table 4).361 Residues Val261 and Gln241 proved to have the highest B-factors but were rejected by the researchers for different reasons: residue Gln241 is located within 6 Å of the catalytic triad, possibly adversely influencing activity, while Val261 occurs at the terminus of homologous AnFaeA but does not exist in AuFaeA, the actual enzyme under study. Consequently, the next two most suitable residues for B-FIT are Ser33 and Asn92, which were considered for SM. PoPMuSiC also allowed the prediction of ΔΔG values. The 3D structures of WT and mutants were visualized by the PyMol software.119 Using the CASTER computer aid, 5 groups of degenerate codons were chosen for SM (Table 5). Employing pPIC9K-AufaeA as starting template, randomization was first performed at residue Ser33 using nonoverlapping oligonucleotides in an improved 2-stage SM process, originally developed for recalcitrant difficult-to-amplify genes.364 The yeast Pichia pastoris served as the host organism. Normally the use of yeasts in directed evolution may not be ideal, but in the case of P. pastoris the SM process had been optimized, as previously illustrated in a directed evolution study for another enzyme (lipase).365 Moreover, an advantage of this expression system is the high purity of the recombinant protein, as specified by the Multi-Copy Pichia Expression Kit, commercially available from the biotech company Invitrogen (USA). In an ISM procedure, the best mutant S33E was then used as a template for SM at residue Asn92. As measured by differential scanning calorimetry (DSC), the best variant, S33E/N92R, showed a melting temperature of Tm = 44.5 °C, which corresponds to an increase of about 5 °C relative to WT AuFaeA.361 At 50 °C, a 4-fold improvement in half-life was demonstrated. Several other variants were also identified in the libraries, although the degree of improvement was somewhat lower.
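The selection logic described above (rank by B-factor, then discard residues that are too close to the catalytic triad or that have no counterpart in the target enzyme) can be sketched in Python as follows. The class and field names are hypothetical, and the two boolean flags merely encode the reasons given in the text for rejecting Gln241 and Val261; B-values are those of Table 4.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    b_factor: float
    near_active_site: bool = False   # e.g. within 6 Å of the catalytic triad
    present_in_target: bool = True   # residue exists in the enzyme under study


def pick_hot_spots(candidates, n_sites):
    """Keep eligible residues and return the n_sites with the highest B-factors."""
    eligible = [c for c in candidates
                if c.present_in_target and not c.near_active_site]
    # sorted() is stable, so residues with equal B-factors keep their input order
    eligible = sorted(eligible, key=lambda c: c.b_factor, reverse=True)
    return [c.name for c in eligible[:n_sites]]


candidates = [
    Candidate("Val261", 21, present_in_target=False),
    Candidate("Gln241", 20, near_active_site=True),
    Candidate("Ser33", 20),
    Candidate("Asn92", 20),
    Candidate("Asp93", 19),
]
print(pick_hot_spots(candidates, 2))  # ['Ser33', 'Asn92']
```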
It was not a trivial task to uncover the origin of enhanced thermostability, and in fact, more efforts are necessary for a final conclusion. Suffice it to say that at this point the two residues 33 and 92 occur in a loop which may have been rigidified by the two point mutations. On the practical side, this study constitutes a step forward in developing biocatalysts for the enzymatic degradation of plant biomass at elevated temperatures.361 It should be possible to improve the thermostability even more by an expanded B-FIT/ISM exploration at additional sites not considered thus far. 3.3.2. B-FIT Supported by Consensus Technique and PoPMuSiC: endo-1,4-β-Galactanase from Talaromyces stipitatus (TSGAL). In a related study likewise focusing on enzymatic conversion of pectinaceous biomasses, which again requires high temperatures, B-FIT also proved to be successful.366 In this case, the 353 amino acid (36.5 kDa) endo-1,4-β-galactanase from T. stipitatus (TSGAL) was subjected to mutagenesis. Guided by B-FIT analysis of the well-characterized homologue from Aspergillus aculeatus and by in silico methods including sequence alignment of two thermostable endo-1,4-β-galactanases by the consensus technique and flanked by PoPMuSiC, 9 residues were chosen as “hot spots”. Instead of performing NNK-based SM systematically at each of these positions, fewer than the theoretically possible 19 new mutants were generated in each case by site-specific mutagenesis. Since the optimal amino acid exchange at each of these positions could not be ascertained with certainty by this semirational approach, several tries were made, leading to a collection of mutants. Several variants resulting from mutations according to the B-FIT analysis were identified as showing distinctly improved thermostability. The best variant proved to be G305A, this position having a high B-factor of 54 in the WT. However, this is not the highest B-factor in the B-FIT analysis nor was it predicted by PoPMuSiC. Rather, structural and entropic considerations led to the idea of focusing on position 305. Variant G305A is characterized by a half-life of 114 min at 55 °C and 15 min at 60 °C. This corresponds to an 8.6-fold increase at 55 °C.366 It was found that the genetically optimized variant G305A, when used in P. pastoris, yields 5.3 g of product on a 2 L scale. This study has generated a great deal of information useful in further improvements by directed evolution studies, which are necessary for broad industrial applications in the future. As a possible strategy, the systematic construction of NNK-based SM libraries at the 9 hot spot residues would provide fingerprint information needed for choosing reduced amino acid alphabets in subsequent SM/ISM. Mutability landscaping could be a crucial aid in such an effort.367,368 3.3.3. B-FIT Supported by Optimal Expression Host: Endoglucanase I from Trichoderma reesei (TrEGI). In yet another B-FIT study by D. S. 
Clark and co-workers, likewise aiming for potential biomass utilization, the endoglucanase I from T. reesei (TrEGI) was thermostabilized.369 It is a key enzyme for saccharifying cellulosic biomass in the economically viable production of biofuels. Using B-FITTER, 20 residues in the crystal structure of the WT enzyme showing high B-factors were identified. Of these, 12 characterized by highest B-factors were grouped into seven SM sites, A−G, for individual NNK-based SM using overlap extension PCR: A (positions 284−287), B (201−302), C (113, 115), D (238), E (230), F (323), and G (291) (Figure 12). Residues with high B-factors but spatially close to the N- and C-termini, disulfide bridges, or N-glycosylation sites were not considered. The 7 SM libraries were screened leading to the identification of several hits as single mutants. In some cases mutations were combined. The best TrEGI variant, G230A/D113S/D115T, displays a ∼3 °C increase in Tm and a more than 2-fold enhanced half-life at 60 °C relative to WT.369 The authors clearly state that the apparent Tm values refer to irreversible deactivation. A total


Scheme 5. Mutagenesis Strategy in the Thermostabilization of Maize Endosperm ADP-Glucose Pyrophosphorylase (AGPase).370

then used as templates for ISM at the other individual high B-factor positions I285, Q289, P372, E440, A443, S444, and K445, which in several cases led to notably thermostabilized variants. Scheme 5 illustrates the general strategy. In contrast to WT

which shows no activity at 55 °C due to the denaturing effect, triple-mutant Q96G/D161G/A443R has unusually high activity at this elevated temperature, even higher than that of WT at 37 °C.370 Several single mutants also showed improved characteristics. An unexpected finding of notable significance is the observation that some of the thermostabilized variants influence the allosteric properties of the enzyme which had been identified earlier. It was noted that especially mutation D161G links heat stability with allosteric regulation, similar to variant T142F generated by protein engineering in a previous study. Both thermostabilized single mutants show decreased Ka for positive allosteric activator 3-PGA and significant catalytic activity in the absence of 3-PGA.370 It would be interesting to see how ISM would perform if multiresidue randomization sites were to be defined by appropriate grouping of the 9 previous residues. This would increase the probability of cooperative effects, i.e., more than just additive influences.314 3.3.5. One-Step Combined B-FIT SM and Focused epPCR As Applied to a Cold-Active Xylanase. As already noted, epPCR has been applied to proteins a number of times for enhancing thermostability.200−205 While B-FIT constitutes a more rational and particularly viable alternative approach, a given mutant evolved by this technique can in principle be subjected to epPCR for further improvement by possibly adding additional stabilizing mutations. The reverse procedure, first applying epPCR then B-FIT, is also possible. An alternative, and perhaps more effective, is the recently developed one-step procedure in which B-FIT is combined with focused epPCR,374 this time in a single-mutagenesis event (Figure 13).375 The

Figure 12. B-FIT-guided choice of randomization sites A−G in the endoglucanase I from T. reesei (TrEGI). Reprinted with permission from ref 369. Copyright 2015 Springer Nature.

of ∼11 000 transformants were screened. Amino acid exchange events at G230 in combination with D113S/D115T led to stabilized variants with enhanced activity as an added feature. Interestingly, these properties varied according to the expression system, E. coli and S. cerevisiae failing to provide acceptable results. In contrast, TrEGI variants expressed in N. crassa or T. reesei led to the most stable and active catalysts. This study illustrates that directed evolution alone with creation of a viable enzyme may not solve all practical problems. It should be noted that with the exception of SM at single-residue sites D, E, F, and G, library coverage was very low, which means that superior mutants may have been missed. With the application of more recent improvements of B-FIT/ISM such as the use of rationally chosen reduced amino acid alphabets or computational aids such as PoPMuSiC, even better results may be possible. 3.3.4. B-FIT Supported by Combining Mutations Followed by ISM: Maize Endosperm ADP-Glucose Pyrophosphorylase (AGPase). Intense interest also exists in enhancing heat stability as well as kinetic parameters of the maize endosperm ADP-glucose pyrophosphorylase (AGPase), an essential enzyme in the starch biosynthesis pathway.370 Since heat lability of this AGPase is believed to be linked to grain loss during hot weather, especially in many underdeveloped countries, any notable increase in thermostability can be expected to have a significant agronomic impact.371,372 In order to solve this challenging problem, B-FIT was used in combination with ISM,370 as in the original proof-of-principle study.76,307,308 However, rather than using multiresidue randomization sites, SM/ISM was restricted to 9 individual amino acid positions. 
Since the crystal structure of AGPase was not available, that of the closely related potato homologue was used in the B-FITTER analysis (small subunit homotetramer, PDB code 1YP2).373 Due to the evolutionary relationship and the similar amino acid sequence, it was assumed that this information is adequate for identifying hot spots for the B-FIT/ ISM-based optimization of AGPase. The 9 libraries generated by NNK-based SM were screened for 95% coverage (one 96-well microtiter plate in each case). Eight single mutants displaying higher stability were identified, and some of the best ones were combined with formation of two double variants Q96G/D161G and Q96R/D161G. These were

Figure 13. One-step mutagenesis method combining saturation mutagenesis (SM) with epPCR. Reprinted with permission from ref 375. Copyright 2017 Elsevier.

primary purpose of this proof-of-concept study was to show that the respective molecular biology works well and not so much the degree of thermostabilization of a newly discovered cold-active xylanase used as the model system. Residues with high B-factors were identified on the basis of high RMSD values obtained by MD computations and by B-FITTER; for random epPCR-based


limonene epoxide hydrolase (LEH) was used as the model system (Scheme 6).303 It is the same experimental platform that

mutagenesis, a defined structurally disordered region was chosen. Several combinatorial libraries of this novel type of mutagenesis were generated, the SM part being based on rationally chosen reduced amino acid alphabets. Several improved variants were evolved by this concerted technique, the increase in thermostability being modest (ΔT₅₀¹⁵ from 4.1 to 4.3 °C).375 One of the hits proved to be a triple-mutant N226L/C262S/I264S with point mutation N226L being a consequence of B-FIT based randomization and both C262S and I264S resulting from focused epPCR. It was noted that a sequential technique as opposed to the simultaneous mutagenesis practiced in this study would not have resulted in this particular triple mutant.375 A trade-off in activity was not observed.375 In future work aimed at further improvements, the one-step method needs to be applied in ISM-based exploration. Iterative epPCR in which randomization is restricted to defined regions is another alternative, although labor intensive when screening is the bottleneck. 3.3.6. Simultaneous Multiparameter Directed Evolution for Enhanced Thermostability and Stereoselectivity with Maintained Activity. Protein engineering of robustness toward heat treatment or hostile organic solvents based on rational design or directed evolution generally results in mutations on the enzyme’s surface, remote from the active site. As already noted in section 3.1, a number of studies report little or no trade-off in activity.201−205,226 However, when attempting to optimize two parameters simultaneously such as activity and stability or stereoselectivity and activity, problems generally arise. Three-parameter optimization in directed evolution of enzymes aimed at enhancing thermostability, stereoselectivity, and activity is even more challenging. Point mutations can improve one parameter but exert a deleterious influence on the other one. 
This is an unfortunate but not surprising phenomenon which pervades most of protein engineering.225,274,276−283,309,376−378 In nature, this phenomenon is also known. For example, a notable trade-off between stability and activity was noted in the adaptive evolution of the CO₂-transforming enzyme Rubisco.379 Relevant to the general problem of genetic multiparameter optimization in the laboratory is an often forgotten piece of advice: In the case of stereoselectivity and activity as the two parameters being genetically optimized simultaneously, for example, it is not necessarily optimal to choose the very best mutant showing highest improvement in stereoselectivity (or vice versa activity) in a subsequent mutagenesis experiment. Rather, it is advisable to make compromises regarding both parameters by choosing mutants as templates for the next step which show reasonable but not optimal improvements in both parameters.380 When attempting to optimize thermostability and stereoselectivity, three different strategies are possible: (1) Optimize thermostability first and then employ the best hit as template for addressing stereoselectivity; (2) optimize stereoselectivity first and then thermostability; (3) optimize both parameters simultaneously with the advantage that only one project needs to be carried out. However, acceptable levels of activity must also be ensured by defining a threshold to be used in screening. Whereas the first two strategies have been implemented numerous times, strategy 3 was not tested until recently by performing ISM at B-factor-indicated positions known to be sensitive to thermostabilization and to CAST sites in a simultaneous manner.303 In this study, the hydrolytic desymmetrization of cyclohexene oxide (6) with formation of (R,R)- and (S,S)-7, catalyzed by

Scheme 6. Model Reaction Catalyzed by Limonene Epoxide Hydrolase (LEH)303

had been employed in earlier CAST/ISM studies aimed at enhancing and reversing only enantioselectivity.299,300,302,381,382 WT LEH shows poor enantioselectivity with very slight preference for the formation of (S,S)-7 (2−4% ee). Its crystal structure has been analyzed.383 In an earlier report relevant to this new LEH project, LEH had been optimized for higher thermostability by D. B. Janssen’s computational method dubbed Framework for Rapid Enzyme Stabilization by Computational libraries (FRESCO).384 In what appears to be a general approach to thermostabilization,385 three different computational guides need to be considered when applying FRESCO: Rosetta ddg,386 FoldX,387,388 and a new algorithm called DDD for disulfide-bridge discovery. On the basis of an analysis of a large number of previous FoldX-based studies, it was recently concluded that this computational tool is only reliable in predicting stabilizing mutations when the quality of the crystal structure is distinctly high.388 In the case of an LEH mutant with 12 point mutations resulting from FRESCO, a dramatic increase in thermostability was achieved relative to WT, amounting to ΔTm = 35 °C with very little trade-off in activity and no change in stereoselectivity.384 The ΔTm value refers to apparent Tm values measured under irreversible deactivation conditions. 
In a subsequent X-ray structural study it was shown that essentially all of the 12 stabilizing mutations occur either on or near the surface of LEH, but not all of them correspond to positions of notably high B-factors in the WT; moreover, the use of FoldX alone would have led only to the introduction of aromatic amino acids on the protein surface.389 In the multiparameter study (Scheme 6),303 the plan was to apply SM at two types of LEH residues (Figure 14): Those at which mutations should influence thermostability as reported in the FRESCO thermostabilization study, especially the 8 positions at residues S15, A19, E45, T76, T85, N92, Y96, and E124,386 which includes positions of high B-factors (especially 15 and 92), as well as CAST positions (M32, L74, M78, I80, L114, I116, F139, and L147), which can be expected to influence stereoselectivity as reported by the earlier CAST/ISM study.300,302,381,382 J.-L. Reymond’s adrenaline on-plate assay324−327 was used to identify active hits in the reaction of epoxide 6. Following heat treatment, the best ones were then tested by chiral GC for enantioselectivity. About 30−40 microtiter plates of 96-format could be assayed per day. It should be emphasized that in this study the FRESCO hot spots were used. This means that an equally possible alternative B-FIT/CAST approach was not strictly followed. Indeed, the B-FITTER analysis shows that although some of the residues do in fact have high B-factors, the most flexible residues were actually not targeted (Table 6). The 16 amino acid positions were then grouped into 4 randomization sites, A, B, C, and D, so that in each case “thermostability” and “enantioselectivity” residues were combined. For each combinatorial SM, a rationally chosen triple


favored in order to minimize screening for 95% library coverage (∼770 transformants), as opposed to NNK-based SM (∼3 × 10⁶ transformants). When SM was applied at each of the four randomization sites, two different triple codes were applied simultaneously at different amino acid positions according to strategy 2 in Scheme 1 (section 3.1): One for the “thermostability” residues and one for the “enantioselectivity” residues in simultaneous SM experiments, namely Lys-Pro-Asp (K−P−D) and Val-Phe-Tyr (V−F−Y), respectively. These decisions were made on the basis of structural considerations and previous results in which thermostability384,386 and stereoselectivity303 had been evolved separately by two different research groups. Following the respective SM experiments and expression, the supernatants of the lysates were placed in the wells of 96-format microtiter plates and heated in an oven at 70 °C for 20 min. These conditions led to 90% activity loss of WT LEH, which also means that the less stabilized mutants are eliminated. The eliminated variants could possibly show enhanced enantioselectivity, but this was not of immediate interest and was therefore not checked. A 3-phase screening system was devised: (1) Crude on-plate UV−vis pretest for activity, (2) automated GC analysis for enantioselectivity in the (R,R)- and (S,S)-manifolds separately, and (3) automated GC analysis for conversion of the otherwise best hits (which requires longer elution times). SM at sites C and D did not lead to notably improved variants with respect to all three parameters, in contrast to the initial libraries A and B which harbored several distinctly improved “all-round” variants. Thereafter, several ISM evolutionary pathways were explored in a similar manner, although no effort was made to escape from local minima in relevant cases. A number of variants showing enhanced thermostability, enantioselectivity, and even activity were identified. 
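The screening numbers quoted for 95% library coverage can be checked with a small Python sketch. It assumes the commonly used oversampling estimate T = −V ln(1 − F), where V is the number of distinct codon combinations and F the desired coverage, and it further assumes that each of the four positions of a randomization site offers four choices in TCSM (three encoded amino acids plus WT); both assumptions are not spelled out in the text but reproduce the reported values.

```python
import math


def transformants_for_coverage(variants, coverage=0.95):
    """Estimate the number of transformants to screen: T = -V * ln(1 - F)."""
    return math.ceil(-variants * math.log(1.0 - coverage))


# One position randomized with NNK (32 codons): about one 96-well plate
print(transformants_for_coverage(32))

# Four-residue site, triple code plus WT at every position (assumed 4**4 codon sets)
print(transformants_for_coverage(4 ** 4))

# Four-residue site with NNK at every position (32**4 codon sets)
print(transformants_for_coverage(32 ** 4))
```

Under these assumptions the three calls give roughly 96, 770, and 3 million transformants, in line with the single-plate screening per NNK position and the TCSM versus NNK comparison described in the text.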
Choosing the “universally” best variant(s) depends upon how the researcher judges the relative importance of the three parameters, compromises being necessary as stated above. In this particular study, thermostability and enantioselectivity were considered to be most important provided the trade-off in activity was small, which was in fact found to some extent in all mutants. Enhancement as well as reversal of stereoselectivity in the desymmetrization of epoxide 6 were considered to be essential for practical reasons. The ISM steps reflecting distinctly improved and reversed enantioselectivity are assembled in Figure 15. Variant H-2-H5 (T76K/L114V/I116V/N92K/F139V/L147F/S15D/A19K/L74F/M78F/E45D) with improved (S,S)-selectivity (94% ee) is characterized by a T₅₀³⁰ value of 51 °C, an increase of 10 °C. Variant J-7-A12 (S15P/M78F/N92K/F139V/T76K/T85K/E45D/I80V/E124D) with reversed (R,R)-selectivity (80% ee) has a T₅₀³⁰ value of 46 °C, an increase of only 5 °C. The trade-off in activity of these variants is very small as shown by the kinetic parameters of the isolated enzymes: WT LEH [kcat = 0.82 s⁻¹; Km = 6.7 mM; kcat/Km = 122 s⁻¹ M⁻¹] versus variant H-2-H5 [kcat = 0.93 s⁻¹; Km = 13.1 mM; kcat/Km = 71 s⁻¹ M⁻¹] and variant J-7-A12 [kcat = 0.84 s⁻¹; Km = 7 mM; kcat/Km = 120.7 s⁻¹ M⁻¹].303 In summary, these are remarkable results which indicate that simultaneous multiparameter genetic optimization is indeed possible while keeping trade-off effects at a minimum, made possible by conducting mutagenesis under stringent conditions with concomitant consideration of thermostability, enantioselectivity, and activity.303 Nevertheless, the results are not perfect, which illustrates once more that more research is necessary in the area of multiparameter-directed evolution. This is all the more important due to another interesting aspect of this study.
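The catalytic efficiencies quoted above follow from the reported kcat and Km values once Km is converted from mM to M. The following Python sketch only repeats that arithmetic; the function name is illustrative.

```python
def catalytic_efficiency(kcat_per_s, km_millimolar):
    """Return kcat/Km in s^-1 M^-1, converting Km from mM to M."""
    return kcat_per_s / (km_millimolar * 1e-3)


# (kcat in s^-1, Km in mM) as quoted in the text
variants = {
    "WT": (0.82, 6.7),
    "H-2-H5": (0.93, 13.1),
    "J-7-A12": (0.84, 7.0),
}
for name, (kcat, km) in variants.items():
    print(name, round(catalytic_efficiency(kcat, km), 1))
```

This gives about 122 and 71 s⁻¹ M⁻¹ for WT and H-2-H5, as reported, and about 120 s⁻¹ M⁻¹ for J-7-A12; the small difference from the reported 120.7 presumably reflects rounding of Km to 7 mM in the text.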

Figure 14. LEH residues chosen for SM as marked in the WT crystal structure for simultaneous genetic optimization of thermostability and enantioselectivity/activity. Green residues are positions at which mutational exchanges can be expected to influence thermostability. Red residues mark CAST positions at which mutational changes should influence enantioselectivity and activity. Reprinted with permission from ref 303. Copyright 2016 American Chemical Society.

Table 6. B-Factor Values of 8 Selected Residues As Revealed by B-FITTER on the Basis of the Crystal Structure of WT LEHa

entry   residue   B-value   rank
1       Glu124    27        5
2       Ser15     27        6
3       Asn92     24        13
4       Glu45     23        17
5       Ala19     19        37
6       Thr85     14        83
7       Thr76     12        104
8       Tyr96     10        124

a The eight residues emerged from the FRESCO study,384 in which B-factors were not reported. The authors of the present review derived these B-factors in order to enrich the analysis.

code was chosen, encoding a 3-membered reduced amino acid alphabet (Table 7). Triple-code saturation mutagenesis (TCSM),300,301,390 involving three rationally chosen combinatorial building blocks as the reduced amino acid alphabet was

Table 7. Grouping of the 16 Selected Residues of LEH into 4 Randomization Sites for SM Using Triple-Code Saturation Mutagenesis (TCSM) (ref 303)a

a Green residues mark “thermostability” positions; red ones mark “enantioselectivity” counterparts. Reprinted with permission from ref 303. Copyright 2016 American Chemical Society.


Pectobacterium carotovorum upon exchanging amino acids within 6 Å of the bound substrate individually by alanine is accompanied by an increase in thermostability, an observation that was difficult to explain on a molecular level.391 3.3.7. B-FIT as a Means to Increase Robustness toward Hostile Solvents. In continuation of the Lip A study,76,77 the originally evolved variants were shown to tolerate hostile organic solvents such as acetonitrile, dimethyl sulfoxide, and dimethylformamide more so than the WT (Figure 16).392 In this case a correlation exists between increased thermal robustness and enhanced tolerance to organic solvents, an effect that is often but not always observed.200−204 Additional mutagenesis was not necessary in this case, but such efforts could in principle increase the solvent tolerance of the lipase Lip A even more.

Figure 15. Mapping the ISM-based evolution of enantioselectivity in the multiparameter optimization of LEH as catalyst in the hydrolytic desymmetrization of cyclohexene oxide (6). The two best variants, H-2-H5 and J-7-A12, are characterized by notable increases in thermostability amounting to ΔT₅₀³⁰ = 10 and 5 °C, respectively. Reprinted with permission from ref 303. Copyright 2016 American Chemical Society.

In an alternative and more traditionally oriented strategy performed for comparison, combining point mutations of the most highly thermostabilized variant from the FRESCO study384,386 with those of the most stereoselective variant(s) evolved earlier by CAST/ISM300,303 failed to deliver any significantly improved LEH mutants.303 This was mainly due to notable trade-off effects. MD computations and a crystal structure of variant H-2-H5 formed the basis for shedding light on the origin of improved enzyme characteristics,303 although more detailed theoretical analyses are necessary. Finally, the use of B-FIT at residues displaying the truly highest B-factors (Table 6) in combination with CAST/ISM may provide superior results in future studies. Irrespective of possible future improvements regarding simultaneous multiparameter genetic optimization, this study needs to be compared to another recent investigation in which thermostability, enantioselectivity, and activity were handled in a different manner.306 The authors applied CAST and B-FIT separately using the (+)-γ-lactamase from Microbacterium hydrocarbonoxydans as catalyst in the hydrolytic kinetic resolution of the so-called Vince lactam (2-azabicyclo[2.2.1]hept-5-en-3-one), an important intermediate needed in the production of several chiral pharmaceuticals. Following separate optimization, the respective best point mutations originating from CAST and B-FIT were combined, leading to half a dozen variants with differently improved T₅₀¹⁵ values ranging between 48 and 72 °C (WT 41 °C) and selectivity factors E ranging between 178 and >200 at various activities. This allowed compromises to be made in terms of trade-offs.306 Since different enzymes and substrates were employed in the two studies, it is difficult to compare them directly. 
In contrast to the above programmed optimization of thermostability and stereoselectivity, cases are known in which thermostabilization also results in some degree of activity enhancement,384 although this appears to be a fortuitous unplanned result. Similarly, it has been observed that the programmed increase in activity of the nitrile reductase from

Figure 16. Half-lives of WT Lip A and of previously evolved variants in a mixture of water and acetonitrile (ACN), dimethyl sulfoxide (DMSO), and dimethylformamide (DMF), 50% by v/v, at 30 °C, as measured by the residual activity in the hydrolysis of p-nitrophenyl caprylate. Reprinted with permission from ref 392. Copyright 2010 Royal Society of Chemistry.


In an effort to engineer a methanol-tolerant variant of the lipase from Thermomyces lanuginosus (TLL) for application in biodiesel production, Z. Li and co-workers successfully applied B-FIT/ISM.393 Biodiesel can be produced from renewable sources such as vegetable oils which are trans-esterified with methanol under basic conditions.394 Since these oils are needed as food for human beings, ethical aspects have emerged. In contrast, low-cost waste grease from many different sources (1.5 million tons/year in the United States alone) is unsuitable for human consumption and contains 20−40% fatty acids.393−397 In the B-FIT/ISM study, the crystal structure of TLL (4ZGB), a homodimer of two identical 269 amino acid chains, was used for identifying 8 residues having the highest B-factors (Table 8).393

Table 8. Amino Acid Residues in the Lipase from Thermomyces lanuginosus (TLL) Having the Highest B-Factors As Derived from B-FITTER393 (a)

amino acid | residue no. | B-factor (Å²) (b)
aspartic acid | D27 | 84
valine | V60 | 73
serine | S105 | 71
glycine | G59 | 70
arginine | R118 | 69
aspartic acid | D102 | 68
asparagine | N101 | 67
proline | P29 | 67

Table 9. Composition of Small Focused SM Libraries of the Alcohol Dehydrogenase PedE from Pseudomonas putida KT2440398 (a)

position | WT sequence | B-factor | library composition | degenerate codons | screened clones
91 | R | 39 | A•, D•, Q•, E•, K• | VAG, GSC | 36
307 | D | 63 | E•, I•, K•, T•, V• | RHG, ATC | 30
310 | N | 63 | R•, E•, Q•, K•, T• | VMG, CGC | 30
352 | K | 41 | A•, R•, D•, Q•, W• | GMM, CGC, TGG | 42
408 | Q | 46 | G•, K•, M•, P•, Y• | MHG, KRC | 84
410 | N | 39 | A•, Q•, K•, S•, T• | NCG, RAM | 25

(a) Black dots refer to the most abundant amino acids at the corresponding residue in a subset of the 3DM database that harbored only pyrroloquinoline quinone-dependent alcohol dehydrogenases (PQQ-ADHs). Red dots denote the amino acids that were found to occur frequently in a subset of the 3DM database which contains only sequences of proteins derived from extremophilic organisms. Reprinted with permission from ref 398. Copyright 2017 John Wiley and Sons. Black • = PQQ-ADH data set; red • = extremophile data set.
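To make the degenerate codons of Table 9 easier to read, the following sketch expands an IUPAC degenerate codon into its concrete codons and translates them with the standard genetic code. It is only an illustration of how such reduced amino acid alphabets arise; the helper names `expand_codon` and `encoded_amino_acids` are hypothetical and are not taken from the cited study.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Standard genetic code, codons enumerated in T, C, A, G order ("*" = stop)
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def expand_codon(degenerate):
    """List every concrete codon covered by a degenerate codon such as 'NNK'."""
    choices = (IUPAC[base] for base in degenerate.upper())
    return ["".join(codon) for codon in product(*choices)]


def encoded_amino_acids(degenerate):
    """Sorted one-letter amino acids (and '*' for stop) encoded by the codon."""
    return sorted({CODON_TABLE[codon] for codon in expand_codon(degenerate)})
```

For example, `encoded_amino_acids("VAG")` yields glutamate, lysine, and glutamine, whereas the fully randomizing codon NNK expands to 32 codons covering all 20 amino acids plus one stop codon. This is why reduced alphabets need far fewer clones to be screened.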

As a consequence of this semirational procedure, a total of only 200 clones had to be screened for robustness, which ensures 95% library coverage. Mutations at some, but not all, positions led to stability improvement. Appropriate point mutations were then combined and tested for further improvements. Increase in relative residual activity was used to assess the degree of thermostabilization. The best single variant E408P showed a 2.3-fold thermostability enhancement, while the R91D/E408P double mutant (3.2-fold improvement) and the triple mutant R91D/E408P/N410K (4.0-fold improvement) led to better results, these data being based on a heat treatment for 1 h at 45 °C. It was stated that the triple mutant exhibits a 7 °C increase in stability, which was found to correlate with enhanced resistance to dimethyl sulfoxide (50% in water), namely, a 2-fold increase in residual activity upon incubation in this hostile solvent.398 Kinetics revealed that the triple mutant has only a slight decrease in Vmax relative to WT, while the kcat/Km values are essentially identical.

3.3.8. Rare Case of Comparing Different Protein Engineering Strategies for Stabilizing an Enzyme. In addition to B-FIT/ISM, many different techniques and strategies for engineering enhanced thermostability using B-factor-based mutational design have been developed (Table 2), a fortunate state of affairs but one that makes it difficult for researchers to choose the truly best approach. In some of the previously highlighted studies, a limited yet insightful effort was made to compare strategies.
Recently, a study appeared in which five different protein engineering strategies for stabilizing an α/β-hydrolase fold enzyme were compared by testing experimentally the performance of (1) computational predictions using RosettaDesign and FoldX, (2) B-factors, (3) proline introductions, (4) consensus data, and (5) epPCR-induced random mutagenesis.241 In the screening step, heat treatment and the chemical denaturing agent urea were used. It should be noted that in the mutagenesis experiments only single amino acid exchanges (single substitutions) were considered, which means that a direct comparison is not possible with B-FIT/ISM, in which the use of multiresidue randomization sites

Table 8 footnotes: (a) Reprinted with permission from ref 393. Copyright 2017 Elsevier. (b) These B-factors are the average values in the two subunits of the TLL homodimer.

All 8 residues were subjected to NNK-based SM, in each case two 96-well microtiter plates being screened to ensure >98% library coverage. Then a limited amount of ISM was performed using the best mutant as template, leading to 7 distinctly improved variants, the best one being the double mutant S105C/D27R. It retained 71% of its original activity after being incubated in methanol.393 Upon immobilization, variant S105C/D27R showed excellent performance in four consecutive rounds of application using recycled biocatalyst for converting waste grease. The authors speculate that a conformational change similar to interfacial activation may be involved.393

In yet another study, B-FIT and consensus data were employed in a combined effort to engineer thermostability and solvent tolerance of the pyrroloquinoline quinone-dependent alcohol dehydrogenase PedE from Pseudomonas putida KT2440.398 The authors utilized the so-called 3DM platform developed by R. K. Kuipers and co-workers,399 which automatically builds an entire molecular class-specific information system (MCSIS) based on multiple sequence alignment (MSA) and multiple structure alignment. Since the crystal structure of this enzyme was not available, the positions of high B-values were obtained with the help of B-FITTER, specifically from the X-ray structure of the homologue PQQ-ADH ExaA of P. aeruginosa, which has 88% sequence identity. Residues with the highest B-factor values, identified in a 3DM database of 12 000 sequences, were chosen for SM: D307, N310, E408, K352, N410, and R91. Guided by consensus data, codon degeneracies corresponding to drastically reduced amino acid alphabets were designed for mutagenesis (Table 9).398 It should be noted that this is an example of “strategy 2” (Scheme 1, section 3.1).
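The library coverage figures quoted for the saturation mutagenesis experiments can be rationalized with a short sketch. It assumes the common simplification that every codon variant is equally likely and that clones are sampled independently, so that the chance of seeing a given variant among T clones out of V variants is 1 − (1 − 1/V)^T. The function names are hypothetical, and the well counts ignore any control wells on the plates.

```python
import math


def coverage(n_variants, n_clones):
    """Probability that a given variant appears at least once among the clones.

    Assumes all variants are equally likely and clones are independent.
    """
    return 1.0 - (1.0 - 1.0 / n_variants) ** n_clones


def clones_for_coverage(n_variants, target):
    """Smallest number of clones that reaches at least the target coverage."""
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - 1.0 / n_variants))
```

Under these assumptions, an NNK library at one position (32 codons) needs `clones_for_coverage(32, 0.95)` clones for 95% coverage, and `coverage(32, 192)` gives the coverage expected when two full 96-well plates are screened, which can be checked against the >98% stated above.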


works best. A plant esterase, namely, salicylic acid binding protein 2 (SABP2), was chosen for the comparative study. Only the primary conclusions are reiterated here: All of the approaches led to the discovery of stabilizing point mutations, but the degree of success and the amount of effort varied considerably, as the authors note: (1) The “computational methods performed poorly” (Rosetta/FoldX), several reasons being postulated; (2) SM at residues suggested by the consensus approach gave the best results; (3) B-FIT also ensured notable improvements; (4) the influence of combining single mutations proved to be unpredictable and in many cases did not enhance stability.241 SM at high B-factor single residues led to improvements not as good as those of the consensus strategy. However, the experiments do not constitute a test for B-FIT/ISM because, as already stated, only single amino acid exchanges were considered, whereas combinatorial approaches in which more than one residue is simultaneously randomized have been shown to be superior, especially if ISM is applied.76,77,287,288,298,300,381,382 Despite these constraints, the comparative study is informative because it provides valuable insight into the nature of the respective stabilizing mutations discovered in each approach. The most important results are summarized in Table 10.

3.3.9. Active Center Stabilization (ACS) Guided by Local B-Factors near Binding Pocket. As noted above, enhancement of thermostability using the B-FIT approach led to widespread successful applications. However, in some cases the analysis of larger, more complex enzymes yielded inconclusive results. For example, when Kim et al. mutated the seven residues with the highest B-factors on the surface of C. antarctica lipase B (CalB), which is composed of 317 residues, they obtained a mutant whose t1/2 at 50 °C was increased by only 24%.236 Thus, simply reducing fluctuations of these residues may be insufficient for protecting the enzyme’s active center from heat-induced conformational changes leading to possible loss of activity. Considering that the active site is a crucial structural part for enzyme function, optimizing the rigidity closer to the active site may be an alternative method for enhancing enzyme thermostability. Keeping this in mind, Y. Feng and co-workers developed an approach for efficiently generating smart libraries for enhanced enzyme stability based on B-factor analysis but this time with emphasis on the region near the active site. The authors believe that kinetic stability is involved, but this still needs to be studied more closely before a conclusion can be made. The method called active center stabilization (ACS) for enhancing the stability of enzymes (Figure 17) extends conventional B-FIT.240,402 In a certain sense, the active site can be considered to be a fragile part of the enzyme as a whole, specifically when considering the relationship between enzyme conformation and activity during denaturation.226 Therefore, optimizing the rigidity of the active site may help distinctly in maintaining the enzyme’s correct conformation needed for functional catalysis which would otherwise be lost upon heat treatment. Of course, too much rigidification must be avoided, because sufficient flexibility is generally required for high activity at ambient temperature.403 This approach is related to but not identical to the multiparameter method in which thermostability, stereoselectivity, and activity are simultaneously optimized by B-FIT/CAST (section 3.3.6).303 In several proof-of-concept studies, residues with highest B-factor values near the binding pocket were selected for SM.240,402,404 Following the procedure outlined in Figure 17,

Table 10. Single Amino Acid Substitutions That Stabilize the Esterase SABP2 to Heat Treatment or Urea Denaturation (f)

(a) Consensus indicates replacement by the most common residue found in homologues. Proline theory indicates introduction of a proline in locations found in a stable homologue. Random mutagenesis indicates substitutions identified by epPCR followed by screening for increased stability. Computation indicates predicted stabilizing substitutions using both RosettaDesign and FoldX. B-factor indicates substitutions in highly flexible regions as identified by B-factor data in the crystal structure. Literature indicates the natural stabilizing variant in Manihot esculenta HNL.400
(b) Enzymatic efficiency (kcat) toward hydrolysis of p-nitrophenyl acetate as measured by release of p-nitrophenol at 404 nm (ε = 16 500 M⁻¹ cm⁻¹) in 10 vol % acetonitrile (pH 7.2) in 4.5 mM BES buffer.
(c) Irreversible unfolding calculated by comparison of half-lives at 60 °C, derived from measurement of residual esterase activity after incubation for 15 min at 60 °C, assayed as described above. Mutations that increased half-lives at least 25% were counted as stabilizing.
(d) Unfolding in varied concentrations of urea measured after incubation for 24 h. Incubated in 5 mM BES (pH 7.2) with protein concentrations of 0.1−0.3 mg/mL and urea concentrations of 0−6.2 M. Unfolding detected by loss of inherent tryptophan fluorescence at 329 nm measured with excitation at 278 nm. Wild-type SABP2 showed a [urea]1/2 value of 2.2 M urea. Mutations that increased [urea]1/2 by at least 0.5 M were counted as stabilizing.
(e) Mutations previously identified in ref 401.
(f) Reprinted with permission from ref 241. Copyright 2017 American Chemical Society.
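The half-life criterion in the Table 10 footnotes (residual activity after 15 min at 60 °C, with a 25% increase in half-life counted as stabilizing) can be sketched as follows. The sketch assumes first-order, single-exponential inactivation, which the footnote does not state explicitly, and the function names are hypothetical.

```python
import math


def half_life_from_residual(residual_fraction, incubation_min):
    """Half-life (min) from one residual-activity measurement.

    Assumes first-order inactivation: A(t) = A0 * exp(-k * t).
    """
    if not 0.0 < residual_fraction < 1.0:
        raise ValueError("residual_fraction must lie strictly between 0 and 1")
    k = -math.log(residual_fraction) / incubation_min   # rate constant, 1/min
    return math.log(2.0) / k


def is_stabilizing(variant_half_life, wt_half_life, threshold=0.25):
    """True if the variant half-life exceeds WT by at least the threshold."""
    return variant_half_life >= (1.0 + threshold) * wt_half_life
```

With this model, a variant retaining half of its activity after the 15 min incubation has a half-life of 15 min, and one retaining a quarter has a half-life of 7.5 min. Enzymes that inactivate with more complex kinetics would need a full time course rather than a single point.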

Figure 17. Flowchart of the ACS strategy for improving enzyme stability. Reprinted with permission from ref 402. Copyright 2016 Springer Nature.

two enzymes were engineered as models to demonstrate enhanced stability by the ACS strategy, namely, C. antarctica lipase B (CalB)404 (section 3.3.9.1) and Candida rugosa lipase 1 (Lip1)402 (section 3.3.9.2). These two enzymes were chosen because, as α/β hydrolases, they have a larger size and a more complex structure relative to, e.g., Lip A with only 181 amino acids. This and other comparisons are pictured in Figure 18.
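The site selection step of ACS, choosing the most flexible residues close to the active center, can be sketched in a few lines. This is an illustration only and not the authors' workflow: it assumes that each residue is represented by one coordinate (for example its Cα atom) and a precomputed mean B-factor, and it uses a 10 Å cutoff around the catalytic residue as the default, following the range shown for the catalytic serine in Figure 18. The function name `acs_candidates` and the input layout are hypothetical.

```python
import math


def acs_candidates(residues, catalytic_xyz, radius=10.0, n_sites=6):
    """Pick the most flexible residues within `radius` Å of the catalytic residue.

    residues: iterable of (label, (x, y, z), mean_b) tuples, one per residue.
    Returns up to n_sites (label, mean_b) pairs, highest B-factor first.
    """
    nearby = [
        (label, mean_b)
        for label, xyz, mean_b in residues
        if math.dist(xyz, catalytic_xyz) <= radius
    ]
    nearby.sort(key=lambda item: item[1], reverse=True)
    return nearby[:n_sites]
```

In practice the catalytic residues themselves would be excluded from the input, and the choice of reference atom and cutoff changes which residues qualify, so the output should be treated as a candidate list for inspection rather than a final library design.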

Figure 19. ACS-based thermostabilization of CalB.404 (A) ISM-based thermostability diagram of CalB. (B) Relative B-factor profiles for WT (black) and variant D223G/L278M (magenta). Reprinted with permission from ref 404. Copyright 2014 American Society for Biochemistry and Molecular Biology.

Figure 18. Topological structures of monomeric globular lipases and sites chosen for SM of Lip A (A) in an earlier B-FIT study,76 CalB (B),404 and Lip1 (C).402 α-Helices and β-strands are shown as cylinders and arrows, respectively. Canonical α/β-hydrolase folds are in red. All catalytic serine residues are represented as red sticks. Sites in Lip A previously used for SM are color-labeled (upper right). The red ring represents the range of 10 Å around the catalytic serine residue. In Lip1 (C), two structural models were based on the open state (1CRL) (left) and closed state (1TRH) (right) of the Lip1 X-ray structures, with the lid colored in olive. Reprinted with permission from refs 240 and 402. Copyright 2016 Springer Nature.

network with seven structurally coupled residues within the relatively flexible α10 helix that are primarily involved in forming the active site. It was postulated that enhanced rigidity in the segment around the active site helps the enzyme to maintain its correct conformation at high temperature while preventing exposure of the active site.404

3.3.9.2. ACS Using Lipase 1 from C. rugosa (Lip1). In order to extend the applicability of the ACS strategy, a considerably larger lipase, C. rugosa lipase 1 (Lip1) with 534 residues, was selected.402 In addition to a canonical core with the typical α/β hydrolase fold domain, Lip1 possesses 10 helices and 5 strands around the core structure as well as a lid segment covering its catalytic triad.408 The two different crystal structures of Lip1,409 corresponding to its “open” and “closed” states, respectively, were analyzed by B-factors using B-FITTER.76,77 The 18 residues corresponding to the highest B-factors around the active center were chosen for SM as shown in Figure 18C. Five mutants showed improved thermostability (Figure 20A). To efficiently accumulate optimal mutations, ordered recombination mutagenesis (ORM) was performed, which combined the beneficial point mutations in order, from the best single mutation to those showing progressively lower improvement. Two paths for ORM were initiated for the F344M and F344I mutants, which finally resulted in two additional improved mutants, VarA3 (F344M/F434Y/F133Y/F121Y) and VarB3 (F344I/F434Y/F133Y/F121Y). These two mutants show remarkable increases in T₅₀¹⁵ of 12.6 and 13 °C over WT, respectively (Figure 20A).
Moreover, the optimal operating temperature of VarB3 increased from 45 to 60 °C, and the half-lives were improved 40-fold compared to WT.402 For the purpose of comparison, the relative rigidity of the active center of the WT Lip1 and best mutant VarB3 was assessed by the use of the PredyFlexy web server410 in addition to MD computations and normalized B-factor analysis. Although in most aspects the modeled structure of VarB3 proved to be similar to that of the WT, the intramolecular interactions around the mutated sites exhibited subtle variations.

3.3.9.1. ACS Using Lipase B from C. antarctica (CalB). C. antarctica lipase B (CalB), an efficient catalyst for many reactions including stereoselective transformations and polyester synthesis,405−407 was used as the enzyme in the first ACS study.404 Site selection on the basis of the B-factor profile followed two criteria: location within a 10 Å radius around the catalytically active Ser105 residue and the highest B-factors among the residues around the active site. The six residues located within 10 Å of the catalytically active Ser105 which showed the highest B-factors were chosen for SM as shown in Figure 18B. Mutant L278M with both high activity and improved thermostability at 50 °C was selected as a template for ISM at the other five sites. The best mutant, D223G/L278M, exhibited a 13-fold increase in half-life at 48 °C and a 12 °C higher T₅₀¹⁵ (Figure 19A). For comparison, residues located on the surface of CalB with highest B-factors were also investigated using a similar method. Seven residues with highest B-factors were chosen for SM. However, no obvious thermostable variants were found in any of the libraries examined under the same screening conditions. Further characterization of positive mutants showed that global unfolding resistance against both thermal and chemical denaturation was also improved. Moreover, the catalytic properties of all variants, except D223G, showed an increase in both kcat and Km compared with WT.404 To elucidate the structural basis for improved stability, X-ray crystallography and B-factor analysis were performed (Figure 19B). Compared with WT CalB, no obvious increase in interactions was observed in the mutant structures. The most discernible difference was found in the behavior of the α10 helix. Close inspection of the X-ray structure reveals that the D223G/L278M mutant forms an extra main chain hydrogen-bond


Figure 21. Correlations between the distances of the mutated residues to the catalytic residue serine, relative B-factors, and the thermostabilities T₅₀¹⁵ of the lipases.402 Lip A (181 AA),76 CalB (317 AA),404 and Lip1 (534 AA):402 data analysis is based on their mutational results. Reprinted with permission from refs 240 and 402. Copyright 2016 Springer Nature.

stability is involved. In addition to such directed evolution studies, attempts to enhance protein stability by rational design using site-specific mutagenesis at hot spots near the active center may also be a rewarding approach.

3.3.10. Compilation of Selected B-FIT Studies. In this section, typical B-FIT studies are compiled with informative comments (Table 11). As in the other tables, representative examples were chosen rather than listing all published studies.

3.3.11. Going the Other Way: Directed Evolution of Protein Lability on the Basis of Low B-Factors. Only one study utilizing B-factors for introducing lability into proteins has appeared.416 Whereas such an endeavor seems to be counterproductive, increasing thermolability while maintaining or even enhancing catalytic efficiency may in fact have special applications in biotechnology, in addition to possibly providing interesting mechanistic information. In one type of potential application involving a complex mixture of enzymes required for multistep transformations, one of the components, following completion of its enzymatic function, may need to be “eliminated” by thermal deactivation. In this way, it does not interfere with any useful follow-up transformations. A related scenario concerns heat-labile alkaline phosphatases as catalysts for certain functions in molecular biology, their efficient inactivation by a short and mild heat treatment being necessary to avoid interference with the end labeling by a polynucleotide kinase.418 Yet another potential application concerns certain glycosidases used in the baking industry which generally need to be labile enough so that they are deactivated during the cooking process; any survival of enzyme activity thereafter may adversely alter the quality of the final stored food product.418 In a proof-of-principle study, the thermostability of the lipase from P. aeruginosa (PAL) was lowered in a controlled manner, resulting in a reduction of T₅₀¹⁵ by 30 °C without changing the catalytic profile.416 This was achieved by first applying B-FITTER for the identification of residues with low average B-factors indicating rigidity. In this way, 7 residues were identified (Figure 22): Gly200 (B-factor = 19), Gly106 (20), Gly80 (20), Ser104 (20), Tyr195 (21), Ile79 (21), and Val193 (21). These were assigned to single-residue randomization sites A, B, C, D, E, F, and G, respectively. It is interesting to note that three of these refer to glycine, which is normally considered to be a flexibility-favoring amino acid. Irrespective of this “contradiction”, the authors proceeded in the designed manner. Following extracellular expression and centrifugation, the supernatants were subjected to heat shocks at 60 °C for 15 min in a thermocycler. It was stated that a “good” thermolabile

Figure 20. ACS-based thermostabilization of Lip1.402 (A) Thermostability expressed as T₅₀¹⁵ values of all mutants in the study of Lip1. (B) Prediction of normalized B-factors of WT (black) and the VarB3 mutant (red) of Lip1. Reprinted with permission from refs 240 and 402. Copyright 2016 Springer Nature.

Furthermore, the normalized B-factor predictions showed that the residues around the mutated positions have lower B-factor values compared with WT (Figure 20B), suggesting decreased flexibility. It was found that mutated positions at 121, 133, 344, and 434 form new intramolecular hydrogen-bond networks that rigidify the loop and promote closer packing around them.402 In summary, the case studies of C. antarctica CalB and C. rugosa Lip1 directed toward improving thermostability demonstrate the feasibility of ACS as a notable extension of B-FIT.402,404 Together with Lip A, these lipases share the same canonical α/β-fold core structure but exhibit different size and structural complexities. As a minimal α/β hydrolase with only 181 amino acids, Lip A is the first successful example of achieving thermostability improvement through B-FIT.76 In contrast, it is interesting to note that most of the mutated residues in Lip1 are located within 10 Å of the catalytic residue Ser77. To evaluate whether distances from mutated residues to the catalytic residue serine have any influence on the thermostability, all single mutations in these three cases were carefully analyzed (Figures 19 and 20). This analysis revealed that mutated flexible residues with a relative B-factor of 60−100 and within 10 Å of the catalytic serine may be the most effective hot spots for improving T₅₀¹⁵ values of enzymes, especially when treating large proteins. Mutations beyond ∼14 Å and within ∼6 Å did not show significantly positive effects (Figure 21). However, to obtain a more reliable conclusion regarding active center residues as hot spots for engineering robustness of proteins, extensive ACS-based studies flanked by mechanistic, structural, and theoretical investigations are still needed, especially for clarifying the question whether thermodynamic or kinetic


Table 11. Summary of Additional Typical B-FIT Studies of Enzymes for Enhanced Thermostabilization and/or Robustness

enzyme | E.C. | organism | structures | comment | ref
lipase A | 3.1.1.3 | B. subtilis | X-ray | 10 sites were selected and NNK-based SM was applied | 76
lipase | 3.1.1.3 | Thermomyces lanuginosus | X-ray | also involved ISM technology | 393
lipase | 3.1.1.3 | Rhizomucor miehei | X-ray | improved thermostability based on B-factor SM | 411
cutinase | 3.1.1.74 | Glomerella cingulata | X-ray | improved thermostability using B-FIT | 412
choline oxidase | 1.1.3.17 | Arthrobacter chlorophenolicus | X-ray | also improved thermostability using B-FIT | 413
phospholipase D | 3.1.4.4 | Streptomyces antibioticus | X-ray | single-point mutants and then combinations of double and triple mutants | 414
epoxide hydrolase | 3.3.2.3 | A. niger | X-ray | explored evolutionary pathways and enhanced stereoselectivity with maintained activity | 303,320
type A feruloyl esterase | 3.1.1.73 | A. usamii | homology | combined with PoPMusiC algorithm | 361
endo-1,4-β-galactanase | 3.2.1.89 | T. stipitatus | homology | supported by consensus technique and PoPMusiC | 366
endoglucanase I | 3.2.1.4 | T. reesei | X-ray | increased the activity in parallel | 369
ADP-glucose pyrophosphorylase | 2.7.7.27 | Zea mays | homology | in combination with ISM | 370
xylanase | 3.2.1.8 | Psychrobacter sp. strain 2−17 | homology | combined with focused epPCR | 375
alcohol dehydrogenase PedE | 1.1.1.1 | Pseudomonas putida KT2440 | homology | combination of consensus data and SM | 398
uronate dehydrogenase | 1.1.1.203 | Agrobacterium tumefaciens | homology | consensus not possible; ΔT₅₀¹⁵ = 3.2 °C | 415
xylanase | 3.2.1.8 | A. oryzae | homology | ISM; Tm up to 61 °C | 305
lipase PAL | 3.1.1.3 | P. aeruginosa | X-ray | SM for reducing stability (see section 3.3.11) | 416
lipase C | 3.1.1.3 | P. aeruginosa | homology | 7-fold increased thermal stability; SM was applied | 417
γ-lactamase | 3.5.2.B2 | Microbacterium hydrocarbonoxydans | X-ray | eight flexible residues with high B-factor values were chosen for SM | 306

treatment, an unexplained phenomenon that the authors had already observed in other lipases. The hits were then examined more closely by subjecting them to a heat shock for 15 min using a temperature gradient of 27−67 °C followed by residual activity measurements. The most thermolabile mutant originating from library A proved to be Gly200Ala with a T₅₀¹⁵ value of 51.6 °C (decrease by 20 °C), while the best variant from library C, Gly80Ser, showed a decrease of only 8.6 °C. Thereafter, limited ISM exploration was performed, e.g., using Gly200Ala as a template for NNK-based SM at residue 80 with evolution of the best variant Gly80Ala/Gly200Ala showing a T₅₀¹⁵ value of 35.6 °C (lowering by 30 °C).416 This mutant showed no pronounced change in stereoselectivity of a model kinetic resolution. In summary, this study demonstrates the utility of B-factors for evolving programmed lability of proteins under defined conditions. Relative to WT, the best variant was found to have two glycines exchanged for alanines, a result that most theoretical models would not have predicted. Perhaps the glycine to alanine exchanges cause some kind of conformational strain, but a theoretical study has yet to be performed for elucidating the factors which contribute to such temperature adaptation, ideally supported by a crystal structure of the best mutant. Changes in B-factors would then be revealed.
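The T₅₀¹⁵ values quoted above, i.e., the temperature at which 50% of the activity is lost after a 15 min heat treatment, are read off a residual activity versus temperature profile such as the 27−67 °C gradient just described. A minimal sketch of that readout is given below. It assumes simple linear interpolation between the two measurements that bracket the 50% level, which may differ from the curve fitting used in the original work, and the function name is hypothetical.

```python
def t50_from_profile(temperatures, residual_activities, threshold=0.5):
    """Temperature at which residual activity falls to `threshold`.

    temperatures: heat-shock temperatures (°C) for a fixed incubation time.
    residual_activities: activity after heat shock, as a fraction of the
    untreated control. Uses linear interpolation between bracketing points.
    """
    points = sorted(zip(temperatures, residual_activities))
    for (t_lo, a_lo), (t_hi, a_hi) in zip(points, points[1:]):
        if a_lo >= threshold > a_hi:
            return t_lo + (a_lo - threshold) * (t_hi - t_lo) / (a_lo - a_hi)
    raise ValueError("residual activity never crosses the threshold")
```

The shift reported for a variant is then simply the difference between its value and that of WT measured under the same conditions. A sigmoidal fit would be the more robust choice when the gradient is sampled coarsely or the data are noisy.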

Figure 22. Single-residue SM randomization sites for engineering thermolability in the lipase from P. aeruginosa (PAL): A (Gly200, yellow), B (Gly106, orange), C (Gly80, pink), D (Ser104, red), E (Tyr195, cyan), F (Ile79, green), and G (Val193, blue) illustrated in the WT PAL X-ray structure (PDB code 1ex9). Catalytic triad: Asp/His/Ser. Reprinted with permission from ref 416. Copyright 2009 John Wiley and Sons.

4. CONCLUSIONS AND PERSPECTIVES
As shown in this review, B-factors are being increasingly utilized in protein science for identifying and interpreting flexibility, dynamics, and rigidity of these biomolecules. They serve as useful guides when interpreting a variety of different types of protein characteristics. Moreover, they constitute experimental data which can be used to test the quality of computational methods for predicting flexibility and dynamics of proteins but also when computing B-factor values per se. Future research in this area will hopefully increase the reliability of such predictions and computations.

variant should rapidly lose its activity in a defined temperature window but should retain activity at lower temperatures in certain applications. In the screening step, residual enzyme activity was monitored by a UV−vis plate reader using p-nitrophenyl caprylate as the substrate at room temperature. In the first round of NNK-based SM, only libraries A and C harbored hits showing the desired temperature adaptation, i.e., those that displayed at least 30% loss in activity. Parenthetically, it was noted that WT PAL upon heat shock showed higher activity at room temperature than WT in the absence of such a


This review also reminds potential users of the limitations and possible pitfalls that need to be considered in upcoming studies (see sections 1.2 and 1.3). For example, when interpreting flexibility and dynamics, it must not be forgotten that a B-factor reflects both vibration and static disorder and that ideally multiple temperature measurements for separating the two effects should be made. Moreover, high-resolution X-ray structures are necessary for deriving reliable B-factors. B-factors alone do not suffice for truly innovative investigations, but they help scientists to plan and implement additional structural and mechanistic techniques. Studies that combine the readily available B-factor data with NMR spectroscopic information and/or computational results encompassing MD and QM/MM are particularly revealing. When focusing on enzymes, such a procedure provides novel insights regarding mechanistic intricacies, an area that is far from being mature despite giant strides during the past decades. Finally, experimental and more recently computed B-factors are now being used routinely as a guide for genetically engineering enhanced kinetic and thermodynamic stability of proteins based on rational design or structure-guided directed evolution. Unfortunately, in many published studies the term thermostability is used, but it is not reported whether kinetic or thermodynamic stability is involved. Therefore, in future research the difference between kinetic and thermodynamic stability needs to be investigated more so than in the past. Rational design and directed evolution are beginning to merge, as seen by the recent improvements in the use of the B-FIT method for enhancing protein robustness. Several computational methods exist which allow the prediction of B-factors which could be used to perform B-FIT experiments when X-ray data is not available, but currently it is not clear which one is most reliable. 
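As a reminder of what a B-factor measures, the standard isotropic relation B = 8π²⟨u²⟩ links it to the mean-square displacement ⟨u²⟩ of an atom about its average position. The short sketch below, with hypothetical function names, converts a B-factor in Å² into a root-mean-square displacement in Å. It makes no attempt to separate thermal vibration from static disorder, which is exactly the limitation stressed above.

```python
import math


def mean_square_displacement(b_factor):
    """Mean-square displacement <u^2> (Å^2) from an isotropic B-factor (Å^2)."""
    return b_factor / (8.0 * math.pi ** 2)


def rms_displacement(b_factor):
    """Root-mean-square displacement (Å) from an isotropic B-factor (Å^2)."""
    return math.sqrt(mean_square_displacement(b_factor))
```

For orientation, a B-factor of about 79 Å² (8π² ≈ 78.96) corresponds to an rms displacement of 1 Å, so the highest residue values discussed in this review imply apparent displacements on the order of an ångström.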
As already noted in section 3, B-FIT is certainly not the only approach to induce thermostabilization. Advanced computational techniques are of substantial help in more efficient rational design and as aids in focused semirational directed evolution, FireProt25,26 and Interactive Constraint Network Analysis27,28 being prominent examples. We expect these developments to continue in the future, including the utility of B-factors in the cryo-EM research area. Understanding the details of the relationship between dynamics of enzymes and their activity and selectivity continues to constitute a vital motivation for upcoming research in a challenging area.

2012 and then he moved to Nanyang Technological University in Singapore as a research fellow. One year later he moved to the Max-Planck-Institute for Coal Research (Kohlenforschung) and Philipps-University Marburg in Germany for a postdoctoral stay with Professor Manfred T. Reetz. In 2016 he became a full professor at Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, via the “CAS Pioneer Hundred Talents Program”. His research interests are in the discovery, design, and engineering of biocatalysts as well as cascade reaction design and metabolic engineering.

Qian Liu received her B.Sc. degree in Biotechnology in 2007 and Ph.D. degree in Microbiology in 2012 from Shanghai Jiao Tong University, China, under the supervision of Professor Zixin-Deng. She then worked as a research associate in Professor Zixin-Deng’s group at Wuhan University and has been in Professor Yan Feng’s lab at Shanghai Jiao Tong University since 2015. She is interested in understanding the enzymology of natural products biosynthesis and applying protein engineering strategies for efficient catalysis.

Ge Qu earned his Master’s degree in Biophysics (2010) with Professor Hongyu Zhang at the School of Life Sciences, Shandong University of Technology, China. In 2015 he received his Ph.D. degree in Bioinformatics from Adam Mickiewicz University in Poland with Professor Wojciech Karlowski. Currently, he is working as a research associate at Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, in the group of Professor Zhoutong Sun. His major research interests are the discovery and structure-based engineering of enzymes that have potential as industrial biocatalysts.

Yan Feng is a biochemist. She received her B.Sc. (1986) and M.S. (1989) degrees in Biochemistry from Jilin University and then her M.D. degree at Norman Bethune Medical University (1994). She has worked as a postdoctoral or visiting researcher at the National Institute of Bioscience and Human-Technology (Japan) and National Cancer Institute (United States). She started her faculty career at Jilin University in 1989. In 2009, she joined the Shanghai Jiao Tong University as a distinguished professor. Her research interests focus on molecular enzymology and synthetic biology.

Manfred T. Reetz is a synthetic organic chemist who obtained his doctoral degree at Göttingen University, Germany, with Ulrich Schöllkopf. After a postdoctoral stay with Reinhard W. Hoffmann at Marburg University, he had several positions in Germany before becoming Full Professor at Marburg University in 1980. In 1991 he became Director at the Max-Planck-Institut für Kohlenforschung in Mülheim, where he began to experiment with biocatalysts which led to the concept of directed evolution of stereo- and regioselective enzymes as catalysts in organic chemistry and biotechnology. Following formal retirement in 2011, he accepted an emeritus position at Marburg University and in 2017 became Adjunct Professor at the Tianjin Institute of Industrial Biotechnology (Chinese Academy of Sciences).

AUTHOR INFORMATION Corresponding Authors

*E-mail: [email protected]. *E-mail: [email protected]. *E-mail: [email protected]. ORCID

ACKNOWLEDGMENTS M.T.R. thanks his former postdocs Daniel Carballeira and Andreas Vogel for initiating and performing research aimed at developing the B-FIT method and the Max-Planck-Society for generous financial support. He also acknowledges the support of the Chinese Academy of Sciences (CAS) President’s International Fellowship Initiative for 2018 (2018DB0030) as part of the “Distinguished Scientist” award. Z.S. thanks the CAS Pioneer Hundred Talents Program (No. 2016-053) and the Key Projects in the Tianjin Science & Technology Pillar Program (No. 15PTCYSY00020) for valuable support. Y.F. is thankful for the support in part by grants from the Natural Science

Zhoutong Sun: 0000-0002-9923-0951 Qian Liu: 0000-0002-6235-9065 Ge Qu: 0000-0002-0711-169X Yan Feng: 0000-0002-2522-2115 Manfred T. Reetz: 0000-0001-6819-6116 Notes

The authors declare no competing financial interest. Biographies Zhoutong Sun obtained his Ph.D. degree in Microbiology at Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, in AC

DOI: 10.1021/acs.chemrev.8b00290 Chem. Rev. XXXX, XXX, XXX−XXX

Chemical Reviews

Review

(24) Heinzelman, P.; Snow, C. D.; Smith, M. A.; Yu, X.; Kannan, A.; Boulware, K.; Villalobos, A.; Govindarajan, S.; Minshull, J.; Arnold, F. H. SCHEMA recombination of a fungal cellulase uncovers a single mutation that contributes markedly to stability. J. Biol. Chem. 2009, 284, 26229−26233. (25) Bednar, D.; Beerens, K.; Sebestova, E.; Bendl, J.; Khare, S.; Chaloupkova, R.; Prokop, Z.; Brezovsky, J.; Baker, D.; Damborsky, J. FireProt: Energy- and evolution-based computational design of thermostable multiple-point mutants. PLoS Comput. Biol. 2015, 11, e1004556. (26) Musil, M.; Stourac, J.; Bendl, J.; Brezovsky, J.; Prokop, Z.; Zendulka, J.; Martinek, T.; Bednar, D.; Damborsky, J. FireProt: Web server for automated design of thermostable proteins. Nucleic Acids Res. 2017, 45, W393−W399. (27) Rathi, P. C.; Mulnaes, D.; Gohlke, H. VisualCNA: a GUI for interactive constraint network analysis and protein engineering for improving thermostability. Bioinformatics 2015, 31, 2394−2396. (28) Rathi, P. C.; Fulton, A.; Jaeger, K. E.; Gohlke, H. Application of rigidity theory to the thermostabilization of lipase A from Bacillus subtilis. PLoS Comput. Biol. 2016, 12, No. e1004754. (29) Korkegian, A.; Black, M. E.; Baker, D.; Stoddard, B. L. Computational thermostabilization of an enzyme. Science 2005, 308, 857−860. (30) Disfani, F. M.; Hsu, W.-L.; Mizianty, M. J.; Oldfield, C. J.; Xue, B.; Dunker, A. K.; Uversky, V. N.; Kurgan, L. MoRFpred, a computational tool for sequence-based prediction and characterization of short disorder-to-order transitioning binding regions in proteins. Bioinformatics 2012, 28, I75−I83. (31) Yang, J.; Wang, Y.; Zhang, Y. ResQ: An approach to unified estimation of B-factor and residue-specific error in protein structure prediction. J. Mol. Biol. 2016, 428, 693−701. (32) Bramer, D.; Wei, G.-W. Multiscale weighted colored graphs for protein flexibility and rigidity analysis. J. Chem. Phys. 2018, 148, 054103. (33) Chen, J.; Guo, M.; Wang, X.; Liu, B. 
A comprehensive review and comparison of existing computational methods for intrinsically disordered protein and region prediction. Briefings Bioinf. 2018, 19, 231−244. (34) Sheriff, S.; Hendrickson, W. A.; Stenkamp, R. E.; Sieker, L. C.; Jensen, L. H. Influence of solvent accessibility and intermolecular contacts on atomic mobilities in hemerythrins. Proc. Natl. Acad. Sci. U. S. A. 1985, 82, 1104−1107. (35) Ringe, D.; Petsko, G. A. Study of protein dynamics by X-ray diffraction. Methods Enzymol. 1986, 131, 389−433. (36) Ragone, R.; Facchiano, F.; Facchiano, A.; Facchiano, A. M.; Colonna, G. Flexibility plot of proteins. Protein Eng., Des. Sel. 1989, 2, 497−504. (37) Halle, B. Flexibility and packing in proteins. Proc. Natl. Acad. Sci. U. S. A. 2002, 99, 1274−1279. (38) Matthews, B. W. Structural and genetic analysis of protein stability. Annu. Rev. Biochem. 1993, 62, 139−160. (39) Engh, R.; Lobermann, H.; Schneider, M.; Wiegand, G.; Huber, R.; Laurell, C. B. The S variant of human alpha 1-antitrypsin, structure and implications for function and metabolism. Protein Eng., Des. Sel. 1989, 2, 407−415. (40) Kundhavai Natchiar, S.; Arockia Jeyaprakash, A.; Ramya, T. N.; Thomas, C. J.; Suguna, K.; Surolia, A.; Vijayan, M. Structural plasticity of peanut lectin: an X-ray analysis involving variation in pH, ligand binding and crystal structure. Acta Crystallogr., Sect. D: Biol. Crystallogr. 2004, 60, 211−219. (41) Korndorfer, I.; Steipe, B.; Huber, R.; Tomschy, A.; Jaenicke, R. The crystal structure of holo-glyceraldehyde-3-phosphate dehydrogenase from the hyperthermophilic bacterium Thermotoga maritima at 2.5 A resolution. J. Mol. Biol. 1995, 246, 511−521. (42) Tainer, J. A.; Getzoff, E. D.; Alexander, H.; Houghten, R. A.; Olson, A. J.; Lerner, R. A.; Hendrickson, W. A. The reactivity of antipeptide antibodies is a function of the atomic mobility of sites in a protein. Nature 1984, 312, 127−134.

Foundation of China (No. 31620103901) and Ministry of Science and Technology (2017YFE0103300).





metal binding to a cold-adapted frataxin. JBIC, J. Biol. Inorg. Chem. 2015, 20, 653−664. (149) Said, A. M.; Hangauer, D. G. Binding cooperativity between a ligand carbonyl group and a hydrophobic side chain can be enhanced by additional H-bonds in a distance dependent manner: A case study with thrombin inhibitors. Eur. J. Med. Chem. 2015, 96, 405−424. (150) Ferreira, L. A.; Madeira, P. P.; Uversky, A. V.; Uversky, V. N.; Zaslavsky, B. Y. Responses of proteins to different ionic environment are linearly interrelated. J. Chromatogr. A 2015, 1387, 32−41. (151) Fuchs, J. E.; Waldner, B. J.; Huber, R. G.; von Grafenstein, S.; Kramer, C.; Liedl, K. R. Independent metrics for protein backbone and side-chain flexibility: Time scales and effects of ligand binding. J. Chem. Theory Comput. 2015, 11, 851−860. (152) Singharoy, A.; Teo, I.; McGreevy, R.; Stone, J. E.; Zhao, J.; Schulten, K. Molecular dynamics-based refinement and validation for sub-5 A cryo-electron microscopy maps. eLife 2016, 5, No. e16105. (153) Nguyen, D. D.; Xia, K.; Wei, G. W. Generalized flexibilityrigidity index. J. Chem. Phys. 2016, 144, 234106. (154) Cha, S. S.; An, Y. J. Crystal structure of EstSRT1, a family VIII carboxylesterase displaying hydrolytic activity toward oxyimino cephalosporins. Biochem. Biophys. Res. Commun. 2016, 478, 818−824. (155) Erman, B. Universal features of fluctuations in globular proteins. Proteins: Struct., Funct., Genet. 2016, 84, 721−725. (156) Prabhakar, P. K.; Srivastava, A.; Rao, K. K.; Balaji, P. V. Monomerization alters the dynamics of the lid region in Campylobacter jejuni CstII: an MD simulation study. J. Biomol. Struct. Dyn. 2016, 34, 778−791. (157) Neeb, M.; Hohn, C.; Ehrmann, F. R.; Hartsch, A.; Heine, A.; Diederich, F.; Klebe, G. Occupying a flat subpocket in a tRNAmodifying enzyme with ordered or disordered side chains: Favorable or unfavorable for binding? Bioorg. Med. Chem. 2016, 24, 4900−4910. (158) Kuo, T. H.; Li, K. B. 
Predicting protein-protein interaction sites using sequence descriptors and site propensity of neighboring amino acids. Int. J. Mol. Sci. 2016, 17, No. 1788. (159) Silvaroli, J. A.; Arne, J. M.; Chelstowska, S.; Kiser, P. D.; Banerjee, S.; Golczak, M. Ligand binding induces conformational changes in human cellular retinol-binding protein 1 (CRBP1) revealed by atomic resolution crystal structures. J. Biol. Chem. 2016, 291, 8528− 8540. (160) Kosciolek, T.; Buchan, D. W. A.; Jones, D. T. Predictions of backbone dynamics in intrinsically disordered proteins using de novo fragment-based protein structure predictions. Sci. Rep. 2017, 7, 6999. (161) Suyama, Y.; Muraki, N.; Kusunoki, M.; Miyake, H. Crystal structure of the starch-binding domain of glucoamylase from Aspergillus niger. Acta Crystallogr., Sect. F: Struct. Biol. Commun. 2017, 73, 550−554. (162) Cong, Y.; Li, M.; Feng, G.; Li, Y.; Wang, X.; Duan, L. TrypsinLigand binding affinities calculated using an effective interaction entropy method under polarized force field. Sci. Rep. 2017, 7, 17708. (163) Qiu, Z.; Zhou, B.; Yuan, J. Protein-protein interaction site predictions with minimum covariance determinant and Mahalanobis distance. J. Theor. Biol. 2017, 433, 57−63. (164) Bope, C. D.; Tong, D.; Li, X.; Lu, L. Fluctuation matching approach for elastic network model and structure-based model of biomacromolecules. Prog. Biophys. Mol. Biol. 2017, 128, 100−112. (165) Chang, K. J.; Kuo, Y. H.; Chiang, Y. W. Study of Protein Dynamics under Nanoconfinement by Spin-Label ESR: A Case of T4 Lysozyme Protein. J. Phys. Chem. B 2017, 121, 4355−4363. (166) Menozzi, I.; Vallese, F.; Polverini, E.; Folli, C.; Berni, R.; Zanotti, G. Structural and molecular determinants affecting the interaction of retinol with human CRBP1. J. Struct. Biol. 2017, 197, 330−339. (167) Pearce, N. M.; Krojer, T.; von Delft, F. Proper modelling of ligand binding requires an ensemble of bound and unbound states. Acta Crystallogr. D Struct. Biol. 
2017, 73, 256−266. (168) Wang, J. Experimental charge density from electron microscopic maps. Protein Sci. 2017, 26, 1619−1626.

(129) Chen, C. C.; Luo, H.; Han, X.; Lv, P.; Ko, T. P.; Peng, W.; Huang, C. H.; Wang, K.; Gao, J.; Zheng, Y.; et al. Structural perspectives of an engineered beta-1,4-xylanase with enhanced thermostability. J. Biotechnol. 2014, 189, 175−182. (130) Wakamori, M.; Fujii, Y.; Suka, N.; Shirouzu, M.; Sakamoto, K.; Umehara, T.; Yokoyama, S. Intra- and inter-nucleosomal interactions of the histone H4 tail revealed with a human nucleosome core particle with genetically-incorporated H4 tetra-acetylation. Sci. Rep. 2015, 5, 17204. (131) Liu, Q.; Ren, J.; Song, J.; Li, J. Co-occurring atomic contacts for the characterization of protein binding hot spots. PLoS One 2015, 10, No. e0144486. (132) Sun, H.; Zhao, L.; Peng, S.; Huang, N. Incorporating replacement free energy of binding-site waters in molecular docking. Proteins: Struct., Funct., Genet. 2014, 82, 1765−1776. (133) Gao, K.; He, H.; Yang, M.; Yan, H. Molecular dynamics simulations of the Escherichia coli HPPK apo-enzyme reveal a network of conformational transitions. Biochemistry 2015, 54, 6734−6742. (134) Dong, Q.; Wang, K.; Liu, B.; Liu, X. Characterization and prediction of protein flexibility based on structural alphabets. BioMed Res. Int. 2016, 2016, 4628025. (135) Dror, A.; Kanteev, M.; Kagan, I.; Gihaz, S.; Shahar, A.; Fishman, A. Structural insights into methanol-stable variants of lipase T6 from Geobacillus stearothermophilus. Appl. Microbiol. Biotechnol. 2015, 99, 9449−9461. (136) Avgy-David, H. H.; Senderowitz, H. Toward focusing conformational ensembles on bioactive conformations: A molecular mechanics/quantum mechanics study. J. Chem. Inf. Model. 2015, 55, 2154−2167. (137) Mou, Y.; Huang, P. S.; Thomas, L. M.; Mayo, S. L. Using molecular dynamics simulations as an aid in the prediction of domain swapping of computationally designed protein variants. J. Mol. Biol. 2015, 427, 2697−2706. (138) Xia, K.; Opron, K.; Wei, G. W. 
Multiscale gaussian network model (mGNM) and multiscale anisotropic network model (mANM). J. Chem. Phys. 2015, 143, 204106. (139) Opron, K.; Xia, K.; Wei, G. W. Communication: Capturing protein multiscale thermal fluctuations. J. Chem. Phys. 2015, 142, 211101. (140) Bayden, A. S.; Moustakas, D. T.; Joseph-McCarthy, D.; Lamb, M. L. Evaluating free energies of binding and conservation of crystallographic waters using SZMAP. J. Chem. Inf. Model. 2015, 55, 1552−1565. (141) Kim, M. H.; Lee, B. H.; Kim, M. K. Robust elastic network model: A general modeling for precise understanding of protein dynamics. J. Struct. Biol. 2015, 190, 338−347. (142) Kowalska-Baron, A.; Galecki, K.; Wysocki, S. Room temperature phosphorescence study on the structural flexibility of single tryptophan containing proteins. Spectrochim. Acta, Part A 2015, 134, 380−387. (143) DiMaio, F.; Song, Y.; Li, X.; Brunner, M. J.; Xu, C.; Conticello, V.; Egelman, E.; Marlovits, T.; Cheng, Y.; Baker, D. Atomic-accuracy models from 4.5-A cryo-electron microscopy data with density-guided iterative local refinement. Nat. Methods 2015, 12, 361−365. (144) Spiegel, M.; Duraisamy, A. K.; Schroder, G. F. Improving the visualization of cryo-EM density reconstructions. J. Struct. Biol. 2015, 191, 207−213. (145) Kalaivani, R.; Srinivasan, N. A Gaussian network model study suggests that structural fluctuations are higher for inactive states than active states of protein kinases. Mol. BioSyst. 2015, 11, 1079−1095. (146) Chakravarty, D.; Janin, J.; Robert, C. H.; Chakrabarti, P. Changes in protein structure at the interface accompanying complex formation. IUCrJ 2015, 2, 643−652. (147) Kawato, T.; Mizohata, E.; Meshizuka, T.; Doi, H.; Kawamura, T.; Matsumura, H.; Yumura, K.; Tsumoto, K.; Kodama, T.; Inoue, T.; et al. Crystal structure of streptavidin mutant with low immunogenicity. J. Biosci. Bioeng. 2015, 119, 642−647. (148) Noguera, M. E.; Roman, E. A.; Rigal, J. 
B.; Cousido-Siah, A.; Mitschler, A.; Podjarny, A.; Santos, J. Structural characterization of

DOI: 10.1021/acs.chemrev.8b00290 Chem. Rev. XXXX, XXX, XXX−XXX

Chemical Reviews

Review

(169) Dehouck, Y.; Bastolla, U. The maximum penalty criterion for ridge regression: application to the calibration of the force constant in elastic network models. Integr. Biol. (Camb.) 2017, 9, 627−641. (170) Jia, L.; Sun, Y. Protein asparagine deamidation prediction based on structures with machine learning methods. PLoS One 2017, 12, No. e0181347. (171) Yekwa, E.; Khourieh, J.; Canard, B.; Papageorgiou, N.; Ferron, F. Activity inhibition and crystal polymorphism induced by active-site metal swapping. Acta Crystallogr. D Struct. Biol. 2017, 73, 641−649. (172) Wang, Y.; Virtanen, J.; Xue, Z.; Zhang, Y. I-TASSER-MR: automated molecular replacement for distant-homology proteins using iterative fragment assembly and progressive sequence truncation. Nucleic Acids Res. 2017, 45, W429−W434. (173) Guo, J.; Coker, A. R.; Wood, S. P.; Cooper, J. B.; Chohan, S. M.; Rashid, N.; Akhtar, M. Structure and function of the thermostable L-asparaginase from Thermococcus kodakarensis. Acta Crystallogr. D Struct. Biol. 2017, 73, 889−895. (174) Xia, K. L. Multiscale virtual particle based elastic network model (MVP-ENM) for normal mode analysis of large-sized biomolecules. Phys. Chem. Chem. Phys. 2018, 20, 658−669. (175) Trejo, C. S.; Rock, R. S.; Stark, W. M.; Boocock, M. R.; Rice, P. A. Snapshots of a molecular swivel in action. Nucleic Acids Res. 2018, 46, 5286−5296. (176) Fresco-Taboada, A.; Fernández-Lucas, J.; Acebal, C.; Arroyo, M.; Ramón, F.; de la Mata, I.; Mancheño, J. 2′-Deoxyribosyltransferase from Bacillus psychrosaccharolyticus: A mesophilic-like biocatalyst for the synthesis of modified nucleosides from a psychrotolerant bacterium. Catalysts 2018, 8, 8. (177) Abella, J. R.; Moll, M.; Kavraki, L. E. Maintaining and enhancing diversity of sampled protein conformations in robotics-inspired methods. J. Comput. Biol. 2018, 25, 3−20. (178) Hellerschmied, D.; Roessler, M.; Lehner, A.; Gazda, L.; Stejskal, K.; Imre, R.; Mechtler, K.; Dammermann, A.; Clausen, T.
UFD-2 is an adaptor-assisted E3 ligase targeting unfolded proteins. Nat. Commun. 2018, 9, 484. (179) Blaisse, M. R.; Fu, B.; Chang, M. C. Y. Structural and biochemical studies of substrate selectivity in Ascaris suum Thiolases. Biochemistry 2018, 57, 3155−3166. (180) Dong, Y. W.; Liao, M. L.; Meng, X. L.; Somero, G. N. Structural flexibility and protein adaptation to temperature: Molecular dynamics analysis of malate dehydrogenases of marine molluscs. Proc. Natl. Acad. Sci. U. S. A. 2018, 115, 1274−1279. (181) Campbell, E.; Kaltenbach, M.; Correy, G. J.; Carr, P. D.; Porebski, B. T.; Livingstone, E. K.; Afriat-Jurnou, L.; Buckle, A. M.; Weik, M.; Hollfelder, F.; et al. The role of protein dynamics in the evolution of new enzyme function. Nat. Chem. Biol. 2016, 12, 944−950. (182) Constantinides, A.; Severin, C.; Gumpper, R.; Zheng, X.; Luo, M. Characterization of the PB2 cap binding domain accelerates inhibitor design. Crystals 2018, 8, 62. (183) Brown, K. L.; Banerjee, S.; Feigley, A.; Abe, H.; Blackwell, T. S.; Pozzi, A.; Hudson, B. G.; Zent, R. Salt-bridge modulates differential calcium-mediated ligand binding to integrin alpha1- and alpha2-I domains. Sci. Rep. 2018, 8, 2916. (184) Yu, L. J.; Suga, M.; Wang-Otomo, Z. Y.; Shen, J. R. Structure of photosynthetic LH1-RC supercomplex at 1.9 A resolution. Nature 2018, 556, 209−213. (185) Rashid, M.; Bera, S.; Medvinsky, A. B.; Sun, G.-Q.; Li, B.-L.; Chakraborty, A. Adaptive regulation of nitrate transceptor NRT1.1 in fluctuating soil nitrate conditions. iScience 2018, 2, 41−50. (186) Daskalakis, V. Protein-protein interactions within photosystem II under photoprotection: the synergy between CP29 minor antenna, subunit S (PsbS) and zeaxanthin at all-atom resolution. Phys. Chem. Chem. Phys. 2018, 20, 11843−11855. (187) Petri, J.; Shimaki, Y.; Jiao, W.; Bridges, H. R.; Russell, E. R.; Parker, E. J.; Aragao, D.; Cook, G. M.; Nakatani, Y. 
Structure of the NDH-2 − HQNO inhibited complex provides molecular insight into quinone-binding site inhibitors. Biochim. Biophys. Acta, Bioenerg. 2018, 1859, 482−490.

(188) Earl, C.; Bagneris, C.; Zeman, K.; Cole, A.; Barrett, T.; Savva, R. A structurally conserved motif in gamma-herpesvirus uracil-DNA glycosylases elicits duplex nucleotide-flipping. Nucleic Acids Res. 2018, 46, 4286−4300. (189) Khazina, E.; Weichenrieder, O. Human LINE-1 retrotransposition requires a metastable coiled coil and a positively charged N-terminus in L1ORF1p. eLife 2018, 7, 34960. (190) Linde, M.; Heyn, K.; Merkl, R.; Sterner, R.; Babinger, P. Hexamerization of geranylgeranylglyceryl phosphate synthase ensures structural integrity and catalytic activity at high temperatures. Biochemistry 2018, 57, 2335. (191) Papp-Wallace, K. M.; Nguyen, N. Q.; Jacobs, M. R.; Bethel, C. R.; Barnes, M. D.; Kumar, V.; Bajaksouzian, S.; Rudin, S. D.; Rather, P. N.; Bhavsar, S.; et al. Strategic approaches to overcome resistance against Gram negative pathogens using beta-lactamase inhibitors and beta-lactam enhancers: The activity of three novel diazabicyclooctanes, WCK 5153, zidebactam (WCK 5107), and WCK 4234. J. Med. Chem. 2018, 61, 4067−4086. (192) Khattab, M.; Wang, F.; Clayton, A. H. A. Conformational plasticity in TKI-kinase interactions revealed with fluorescence spectroscopy and theoretical calculations. J. Phys. Chem. B 2018, 122, 4667−4679. (193) Wu, B.; Wijma, H. J.; Song, L.; Rozeboom, H. J.; Poloni, C.; Tian, Y.; Arif, M. I.; Nuijens, T.; Quaedflieg, P. J. L. M.; Szymanski, W.; et al. Versatile peptide C-terminal functionalization via a computationally engineered peptide amidase. ACS Catal. 2016, 6, 5405−5414. (194) Gur, M.; Blackburn, E. A.; Ning, J.; Narayan, V.; Ball, K. L.; Walkinshaw, M. D.; Erman, B. Molecular dynamics simulations of site point mutations in the TPR domain of cyclophilin 40 identify conformational states with distinct dynamic and enzymatic properties. J. Chem. Phys. 2018, 148, 145101. (195) He, Y.; Maisuradze, G. G.; Yin, Y. P.; Kachlishvili, K.; Rackovsky, S.; Scheraga, H. A.
Sequence-, structure-, and dynamics-based comparisons of structurally homologous CheY-like proteins. Proc. Natl. Acad. Sci. U. S. A. 2017, 114, 1578−1583. (196) Scheraga, H. A.; Rackovsky, S. Homolog detection using global sequence properties suggests an alternate view of structural encoding in protein sequences. Proc. Natl. Acad. Sci. U. S. A. 2014, 111, 5225−5229. (197) Durbin, R.; Eddy, S. R.; Krogh, A.; Mitchison, G. Biological sequence analysis: Probabilistic models of proteins and nucleic acids; Cambridge University Press: Cambridge, U.K., 1998. (198) Yang, L. W.; Rader, A. J.; Liu, X.; Jursa, C. J.; Chen, S. C.; Karimi, H. A.; Bahar, I. oGNM: Online computation of structural dynamics using the Gaussian network model. Nucleic Acids Res. 2006, 34, W24−W31. (199) Best, R. B.; Hummer, G.; Eaton, W. A. Native contacts determine protein folding mechanisms in atomistic simulations. Proc. Natl. Acad. Sci. U. S. A. 2013, 110, 17874−17879. (200) Arnold, F. H. Design by directed evolution. Acc. Chem. Res. 1998, 31, 125−131. (201) Eijsink, V. G. H.; Gaseidnes, S.; Borchert, T. V.; van den Burg, B. Directed evolution of enzyme stability. Biomol. Eng. 2005, 22, 21−30. (202) Bommarius, A. S.; Broering, J. M. Established and novel tools to investigate biocatalyst stability. Biocatal. Biotransform. 2005, 23, 125−139. (203) Reetz, M. T. Directed evolution of enzyme robustness. In Directed evolution of selective enzymes: Catalysts for organic chemistry and biotechnology; Wiley-VCH: Weinheim, 2016; pp 205−235. (204) Wijma, H. J.; Floor, R. J.; Janssen, D. B. Structure- and sequence-analysis inspired engineering of proteins for enhanced thermostability. Curr. Opin. Struct. Biol. 2013, 23, 588−594. (205) Bommarius, A. S.; Paye, M. F. Stabilizing biocatalysts. Chem. Soc. Rev. 2013, 42 (15), 6534−6565. (206) Russell, R. J.; Taylor, G. L. Engineering thermostability: lessons from thermophilic proteins. Curr. Opin. Biotechnol. 1995, 6, 370−374. (207) Querol, E.; Perez-Pons, J.
A.; Mozo-Villarias, A. Analysis of protein conformational characteristics related to thermostability. Protein Eng., Des. Sel. 1996, 9, 265−271.

(208) Jaenicke, R.; Bohm, G. The stability of proteins in extreme environments. Curr. Opin. Struct. Biol. 1998, 8, 738−748. (209) Ladenstein, R.; Antranikian, G. Proteins from hyperthermophiles: stability and enzymatic catalysis close to the boiling point of water. Adv. Biochem. Eng./Biotechnol. 1998, 61, 37−85. (210) Nestl, B. M.; Hauer, B. Engineering of flexible loops in enzymes. ACS Catal. 2014, 4, 3201−3211. (211) Sanchez-Ruiz, J. M. Protein kinetic stability. Biophys. Chem. 2010, 148, 1−15. (212) Pross, A.; Khodorkovsky, V. Extending the concept of kinetic stability: toward a paradigm for life. J. Phys. Org. Chem. 2004, 17, 312−316. (213) Colón, W.; Church, J.; Sen, J.; Thibeault, J.; Trasatti, H.; Xia, K. Biological roles of protein kinetic stability. Biochemistry 2017, 56, 6179−6186. (214) Yamada, H.; Ueda, T.; Imoto, T. Thermodynamic and kinetic stabilities of hen-egg lysozyme and its chemically modified derivatives: analysis of the transition state of the protein unfolding. J. Biochem. 1993, 114, 398−403. (215) Broom, A.; Ma, S. M.; Xia, K.; Rafalia, H.; Trainor, K.; Colón, W.; Gosavi, S.; Meiering, E. M. Designed protein reveals structural determinants of extreme kinetic stability. Proc. Natl. Acad. Sci. U. S. A. 2015, 112, 14605−14610. (216) Li, J.; Wang, J.; Zhang, J.; Wang, W. Thermodynamic stability and kinetic foldability of a lattice protein model. J. Chem. Phys. 2004, 120, 6274−6287. (217) Motono, C.; Gromiha, M. M.; Kumar, S. Thermodynamic and kinetic determinants of Thermotoga maritima cold shock protein stability: A structural and dynamic analysis. Proteins: Struct., Funct., Genet. 2008, 71, 655−669. (218) Subbian, E.; Yabuta, Y.; Shinde, U. Positive selection dictates the choice between kinetic and thermodynamic protein folding and stability in subtilases. Biochemistry 2004, 43, 14348−14360. (219) Rodriguez-Larrea, D.; Minning, S.; Borchert, T. V.; Sanchez-Ruiz, J. M. Role of solvation barriers in protein kinetic stability. J. Mol. Biol.
2006, 360, 715−724. (220) Solá, R. J.; Al-Azzam, W.; Griebenow, K. Engineering of protein thermodynamic, kinetic, and colloidal stability: Chemical glycosylation with monofunctionally activated glycans. Biotechnol. Bioeng. 2006, 94, 1072−1079. (221) Quezada, A. G.; Díaz-Salazar, A. J.; Cabrera, N.; Pérez-Montfort, R.; Piñeiro, Á.; Costas, M. Interplay between protein thermal flexibility and kinetic stability. Structure 2017, 25, 167−179. (222) Pleiss, J. Rational Design of Enzymes. In Enzyme catalysis in organic synthesis, 3rd ed.; Drauz, K., Gröger, H., May, O., Eds.; Wiley-VCH: Weinheim, 2012; pp 89−117. (223) Ema, T.; Nakano, Y.; Yoshida, D.; Kamata, S.; Sakai, T. Redesign of enzyme for improving catalytic activity and enantioselectivity toward poor substrates: manipulation of the transition state. Org. Biomol. Chem. 2012, 10, 6299−6308. (224) Steiner, K.; Schwab, H. Recent advances in rational approaches for enzyme engineering. Comput. Struct. Biotechnol. J. 2012, 2, No. e201209010. (225) Oshima, T. Stabilization of proteins by evolutionary molecular engineering techniques. Curr. Opin. Struct. Biol. 1994, 4, 623−628. (226) Eijsink, V. G. H.; Bjørk, A.; Gåseidnes, S.; Sirevåg, R.; Synstad, B.; van den Burg, B.; Vriend, G. Rational engineering of enzyme stability. J. Biotechnol. 2004, 113, 105−120. (227) Betz, S. F. Disulfide bonds and the stability of globular-proteins. Protein Sci. 1993, 2, 1551−1558. (228) Matsumura, M.; Matthews, B. W. Stabilization of functional proteins by introduction of multiple disulfide bonds. In Methods in Enzymology; Academic Press, 1991; pp 336−356. (229) Kanaya, S.; Katsuda, C.; Kimura, S.; Nakai, T.; Kitakuni, E.; Nakamura, H.; Katayanagi, K.; Morikawa, K.; Ikehara, M. Stabilization of Escherichia coli ribonuclease H by introduction of an artificial disulfide bond. J. Biol. Chem. 1991, 266, 6038−6044.

(230) Clarke, J.; Fersht, A. R. Engineered disulfide bonds as probes of the folding pathway of barnase: Increasing the stability of proteins against the rate of denaturation. Biochemistry 1993, 32, 4322−4329. (231) Kim, T.; Joo, J. C.; Yoo, Y. J. Hydrophobic interaction network analysis for thermostabilization of a mesophilic xylanase. J. Biotechnol. 2012, 161, 49−59. (232) Argos, P.; Rossmann, M. G.; Grau, U. M.; Zuber, H.; Frank, G.; Tratschin, J. D. Thermal stability and protein structure. Biochemistry 1979, 18, 5698−5703. (233) Estell, D. A.; Graycar, T. P.; Wells, J. A. Engineering an enzyme by site-directed mutagenesis to be resistant to chemical oxidation. J. Biol. Chem. 1985, 260, 6518−6521. (234) WHAT IF Web Interface; http://swift.cmbi.ru.nl/servers/html/index.html (Accessed Dec 4, 2018). (235) Fei, H.; Xu, G.; Wu, J.-P.; Yang, L.-R. Improving the acetaldehyde tolerance of DERASEP by enhancing the rigidity of its protein structure. J. Mol. Catal. B: Enzym. 2015, 116, 148−152. (236) Kim, H. S.; Le, Q. A. T.; Kim, Y. H. Development of thermostable lipase B from Candida antarctica (CalB) through in silico design employing B-factor and RosettaDesign. Enzyme Microb. Technol. 2010, 47, 1−5. (237) Le, A. T.; Joo, J. C.; Yoo, Y. J.; Kim, Y. H. Development of thermostable Candida antarctica lipase B through novel in silico design of disulfide bridge. Biotechnol. Bioeng. 2012, 109, 867−876. (238) Tanghe, M.; Danneels, B.; Last, M.; Beerens, K.; Stals, I.; Desmet, T. Disulfide bridges as essential elements for the thermostability of lytic polysaccharide monooxygenase LPMO10C from Streptomyces coelicolor. Protein Eng., Des. Sel. 2017, 30, 401−408. (239) Duan, X.; Cheng, S.; Ai, Y.; Wu, J. Enhancing the thermostability of Serratia plymuthica sucrose isomerase using B-factor-directed mutagenesis. PLoS One 2016, 11, No. e0149208. (240) Huang, J.; Xie, D.-F.; Feng, Y.
Engineering thermostable (R)-selective amine transaminase from Aspergillus terreus through in silico design employing B-factor and folding free energy calculations. Biochem. Biophys. Res. Commun. 2017, 483, 397−402. (241) Jones, B. J.; Lim, H. Y.; Huang, J.; Kazlauskas, R. J. Comparison of five protein engineering strategies to stabilize an α/β-hydrolase. Biochemistry 2017, 56, 6521−6532. (242) Chen, A.; Li, Y.; Nie, J.; McNeil, B.; Jeffrey, L.; Yang, Y.; Bai, Z. Protein engineering of Bacillus acidopullulyticus pullulanase for enhanced thermostability using in silico data driven rational design methods. Enzyme Microb. Technol. 2015, 78, 74−83. (243) Kheirollahi, A.; Khajeh, K.; Golestani, A. Rigidifying flexible sites: An approach to improve stability of chondroitinase ABC I. Int. J. Biol. Macromol. 2017, 97, 270−278. (244) Joo, J. C.; Pack, S. P.; Kim, Y. H.; Yoo, Y. J. Thermostabilization of Bacillus circulans xylanase: Computational optimization of unstable residues based on thermal fluctuation analysis. J. Biotechnol. 2011, 151, 56−65. (245) Li, C.; Li, J.; Wang, R.; Li, X.; Li, J.; Deng, C.; Wu, M. Substituting both the N-terminal and ″cord″ regions of a xylanase from Aspergillus oryzae to improve its temperature characteristics. Appl. Biochem. Biotechnol. 2018, 185, 1044−1059. (246) Amini-Bayat, Z.; Hosseinkhani, S.; Jafari, R.; Khajeh, K. Relationship between stability and flexibility in the most flexible region of Photinus pyralis luciferase. Biochim. Biophys. Acta, Proteins Proteomics 2012, 1824, 350−358. (247) Liao, H.; McKenzie, T.; Hageman, R. Isolation of a thermostable enzyme variant by cloning and selection in a thermophile. Proc. Natl. Acad. Sci. U. S. A. 1986, 83, 576−580. (248) Chen, K. Q.; Arnold, F. H. Tuning the activity of an enzyme for unusual environments-sequential random mutagenesis of subtilisin-E for catalysis in dimethylformamide. Proc. Natl. Acad. Sci. U. S. A. 1993, 90, 5618−5622. (249) Miyazaki, K.; Arnold, F. H.
Exploring nonnatural evolutionary pathways by saturation mutagenesis: rapid improvement of protein function. J. Mol. Evol. 1999, 49, 716−720. (250) Jackson, C. J.; Liu, J. W.; Carr, P. D.; Younus, F.; Coppin, C.; Meirelles, T.; Lethier, M.; Pandey, G.; Ollis, D. L.; Russell, R. J.

Structure and function of an insect α-carboxylesterase (αEsterase7) associated with insecticide resistance. Proc. Natl. Acad. Sci. U. S. A. 2013, 110, 10177−10182. (251) Ishikawa, K.; Nakamura, H.; Morikawa, K.; Kanaya, S. Stabilization of Escherichia coli ribonuclease HI by cavity-filling mutations within a hydrophobic core. Biochemistry 1993, 32, 6171−6178. (252) Declerck, N.; Machius, M.; Joyet, P.; Wiegand, G.; Huber, R.; Gaillardin, C. Hyperthermostabilization of Bacillus licheniformis α-amylase and modulation of its stability over a 50 °C temperature range. Protein Eng., Des. Sel. 2003, 16, 287−293. (253) Zhang, S. B.; Pei, X. Q.; Wu, Z. L. Multiple amino acid substitutions significantly improve the thermostability of feruloyl esterase A from Aspergillus niger. Bioresour. Technol. 2012, 117, 140−147. (254) Yamada, R.; Higo, T.; Yoshikawa, C.; China, H.; Ogino, H. Improvement of the stability and activity of the BPO-A1 haloperoxidase from Streptomyces aureofaciens by directed evolution. J. Biotechnol. 2014, 192, 248−254. (255) Ruller, R.; Alponti, J.; Deliberto, L. A.; Zanphorlin, L. M.; Machado, C. B.; Ward, R. J. Concommitant adaptation of a GH11 xylanase by directed evolution to create an alkali-tolerant/thermophilic enzyme. Protein Eng., Des. Sel. 2014, 27, 255−262. (256) Qian, C. L.; Liu, N.; Yan, X.; Wang, Q.; Zhou, Z. H.; Wang, Q. F. Engineering a high-performance, metagenomic-derived novel xylanase with improved soluble protein yield and thermostability. Enzyme Microb. Technol. 2015, 70, 35−41. (257) Yang, M. J.; Lee, H. W.; Kim, H. Enhancement of thermostability of Bacillus subtilis endoglucanase by error-prone PCR and DNA shuffling. Appl. Biol. Chem. 2017, 60, 73−78. (258) He, D.; Luo, W.; Wang, Z. Y.; Lv, P. M.; Yuan, Z. H.; Huang, S. W.; Xv, J. L. Establishment and application of a modified membrane-blot assay for Rhizomucor miehei lipases aimed at improving their methanol tolerance and thermostability. Enzyme Microb. Technol. 2017, 102, 35−40.
(259) Liu, Y. H.; Liu, H.; Huang, L.; Gui, S.; Zheng, D.; Jia, L. B.; Fu, Y.; Lu, F. P. Improvement in thermostability of an alkaline lipase I from Penicillium cyclopium by directed evolution. RSC Adv. 2017, 7, 38538−38548. (260) Abdul Wahab, M. K. H.; Jonet, M. A. b.; Illias, R. M. Thermostability enhancement of xylanase Aspergillus fumigatus RT-1. J. Mol. Catal. B: Enzym. 2016, 134, 154−163. (261) Wagner, A. Robustness, evolvability, and neutrality. FEBS Lett. 2005, 579, 1772−1778. (262) Bloom, J. D.; Labthavikul, S. T.; Otey, C. R.; Arnold, F. H. Protein stability promotes evolvability. Proc. Natl. Acad. Sci. U. S. A. 2006, 103, 5869−5874. (263) O’Loughlin, T. L.; Patrick, W. M.; Matsumura, I. Natural history as a predictor of protein evolvability. Protein Eng., Des. Sel. 2006, 19, 439−442. (264) Lenski, R. E.; Barrick, J. E.; Ofria, C. Balancing robustness and evolvability. PLoS Biol. 2006, 4, No. e428. (265) Wagner, A. Robustness and evolvability: A paradox resolved. Proc. R. Soc. London, Ser. B 2008, 275, 91−100. (266) Tokuriki, N.; Stricher, F.; Serrano, L.; Tawfik, D. S. How protein stability and new functions trade off. PLoS Comput. Biol. 2008, 4, No. e1000002. (267) Bloom, J. D.; Arnold, F. H. In the light of directed evolution: Pathways of adaptive protein evolution. Proc. Natl. Acad. Sci. U. S. A. 2009, 106, 9995−10000. (268) Tokuriki, N.; Tawfik, D. S. Stability effects of mutations and protein evolvability. Curr. Opin. Struct. Biol. 2009, 19, 596−604. (269) Draghi, J. A.; Parsons, T. L.; Wagner, G. P.; Plotkin, J. B. Mutational robustness can facilitate adaptation. Nature 2010, 463, 353−355. (270) Acevedo-Rocha, C. G.; Agudo, R.; Reetz, M. T. Directed evolution of stereoselective enzymes based on genetic selection as opposed to screening systems. J. Biotechnol. 2014, 191, 3−10.

(271) Reymond, J. L. Enzyme assays: high-throughput screening, genetic selection and fingerprinting; Wiley-VCH: Weinheim, 2006. (272) Ye, L.; Yang, C.; Yu, H. From molecular engineering to process engineering: development of high-throughput screening methods in enzyme directed evolution. Appl. Microbiol. Biotechnol. 2018, 102, 559−567. (273) Bunzel, H. A.; Garrabou, X.; Pott, M.; Hilvert, D. Speeding up enzyme discovery and engineering with ultrahigh-throughput methods. Curr. Opin. Struct. Biol. 2018, 48, 149−156. (274) Reetz, M. T. Directed evolution of selective enzymes: Catalysts for organic chemistry and biotechnology; Wiley-VCH: Weinheim, 2016. (275) Bommarius, A. S. Biocatalysis: a status report. Annu. Rev. Chem. Biomol. Eng. 2015, 6, 319−345. (276) Currin, A.; Swainston, N.; Day, P. J.; Kell, D. B. Synthetic biology for the directed evolution of protein biocatalysts: Navigating sequence space intelligently. Chem. Soc. Rev. 2015, 44, 1172−1239. (277) Denard, C. A.; Ren, H. Q.; Zhao, H. M. Improving and repurposing biocatalysts via directed evolution. Curr. Opin. Chem. Biol. 2015, 25, 55−64. (278) In Directed evolution library creation, 2nd ed.; Gillam, E., Copp, J. N., Ackerley, D. F., Eds.; Springer-Verlag: New York, 2014. (279) Goldsmith, M.; Tawfik, D. S. Enzyme engineering by targeted libraries. In Methods in Protein Design; Keating, A. E., Ed.; Elsevier/Academic Press, 2013; pp 257−283. (280) Brustad, E. M.; Arnold, F. H. Optimizing non-natural protein function with directed evolution. Curr. Opin. Chem. Biol. 2011, 15, 201−210. (281) Jäckel, C.; Hilvert, D. Biocatalysts by evolution. Curr. Opin. Biotechnol. 2010, 21, 753−759. (282) Turner, N. J. Directed evolution drives the next generation of biocatalysts. Nat. Chem. Biol. 2009, 5, 567−573. (283) Lutz, S.; Bornscheuer, U. T. Protein engineering handbook; Wiley-VCH: Weinheim, 2009. (284) Reetz, M. T.; Kahakeaw, D.; Sanchis, J.
Shedding light on the efficacy of laboratory evolution based on iterative saturation mutagenesis. Mol. BioSyst. 2009, 5, 115−122. (285) Paramesvaran, J.; Hibbert, E. G.; Russell, A. J.; Dalby, P. A. Distributions of enzyme residues yielding mutants with improved substrate specificities from two different directed evolution strategies. Protein Eng., Des. Sel. 2009, 22, 401−411. (286) Reetz, M. T.; Prasad, S.; Carballeira, J. D.; Gumulya, Y.; Bocola, M. Iterative saturation mutagenesis accelerates laboratory evolution of enzyme stereoselectivity: Rigorous comparison with traditional methods. J. Am. Chem. Soc. 2010, 132, 9144−9152. (287) Reetz, M. T. Laboratory evolution of stereoselective enzymes: A prolific source of catalysts for asymmetric reactions. Angew. Chem., Int. Ed. 2011, 50, 138−174. (288) Reetz, M. T. Recent advances in directed evolution of stereoselective enzymes. In Directed enzyme evolution: Advances and applications; Alcalde, M., Ed.; Springer: Stuttgart, 2017; pp 69−99. (289) Reetz, M. T.; Wilensek, S.; Zha, D. X.; Jaeger, K. E. Directed evolution of an enantioselective enzyme through combinatorial multiple-cassette mutagenesis. Angew. Chem., Int. Ed. 2001, 40, 3589−3591. (290) Reetz, M. T. Biocatalysis in organic chemistry and biotechnology: Past, present, and future. J. Am. Chem. Soc. 2013, 135, 12480−12496. (291) Reetz, M. T. Strategies for applying gene mutagenesis methods. In Directed evolution of selective enzymes: Catalysts for organic chemistry and biotechnology; Wiley-VCH: Weinheim, 2016; pp 115−165. (292) Evolution Tools CASTER and B-FITTER; http://www.kofo.mpg.de/en/research/biocatalysis (Accessed Dec 4, 2018). (293) Patrick, W. M.; Firth, A. E. Strategies and computational tools for improving randomized protein libraries. Biomol. Eng. 2005, 22, 105−112. (294) Denault, M.; Pelletier, J. N. In Protein engineering protocols; Arndt, K. M., Müller, K. M., Eds.; Humana Press: Totowa, 2007; pp 127−154.

evolution of a thermostable enzyme. Angew. Chem., Int. Ed. 2009, 48, 8268−8272. (314) Reetz, M. T. The importance of additive and non-additive mutational effects in protein engineering. Angew. Chem., Int. Ed. 2013, 52, 2658−2666. (315) Bradbrook, G. M.; Gleichmann, T.; Harrop, S. J.; Habash, J.; Raftery, J.; Kalb, J.; Yariv, J.; Hillier, I. H.; Helliwell, J. R. X-Ray and molecular dynamics studies of concanavalin-A glucoside and mannoside complexes Relating structure to thermodynamics of binding. J. Chem. Soc., Faraday Trans. 1998, 94, 1603−1611. (316) Augustyniak, W.; Brzezinska, A. A.; Pijning, T.; Wienk, H.; Boelens, R.; Dijkstra, B. W.; Reetz, M. T. Biophysical characterization of mutants of Bacillus subtilis lipase evolved for thermostability: Factors contributing to increased activity retention. Protein Sci. 2012, 21, 487−497. (317) Acharya, P.; Rajakumara, E.; Sankaranarayanan, R.; Rao, N. M. Structural basis of selection and thermostability of laboratory evolved Bacillus subtilis lipase. J. Mol. Biol. 2004, 341, 1271−1281. (318) Ahmad, S.; Rao, N. M. Thermally denatured state determines refolding in lipase: Mutational analysis. Protein Sci. 2009, 18, 1183−1196. (319) Perchiacca, J. M.; Lee, C. C.; Tessier, P. M. Optimal charged mutations in the complementarity-determining regions that prevent domain antibody aggregation are dependent on the antibody scaffold. Protein Eng., Des. Sel. 2014, 27, 29−39. (320) Gumulya, Y.; Reetz, M. T. Enhancing the thermal robustness of an enzyme by directed evolution: Least favorable starting points and inferior mutants can map superior evolutionary pathways. ChemBioChem 2011, 12, 2502−2510. (321) Reetz, M. T.; Wang, L.-W.; Bocola, M. Directed evolution of enantioselective enzymes: Iterative cycles of CASTing for probing protein-sequence space. Angew. Chem., Int. Ed. 2006, 45, 1236−1241; Erratum, 2494. (322) Zou, J. Y.; Hallberg, B. M.; Bergfors, T.; Oesch, F.; Arand, M.; Mowbray, S. L.; Jones, T. A.
Structure of Aspergillus niger epoxide hydrolase at 1.8 angstrom resolution: implications for the structure and function of the mammalian microsomal class of epoxide hydrolases. Structure 2000, 8, 111−122. (323) Structure of Aspergillus niger epoxide hydrolase. Protein Data Bank; http://www.rcsb.org/structure/1QO7 (Accessed Dec 4, 2018). (324) Wahler, D.; Reymond, J.-L. The adrenaline test for enzymes. Angew. Chem., Int. Ed. 2002, 41, 1229−1232. (325) Wahler, D.; Boujard, O.; Lefevre, F.; Reymond, J.-L. Adrenaline profiling of lipases and esterases with 1,2-diol and carbohydrate acetates. Tetrahedron 2004, 60, 703−710. (326) Cedrone, F.; Bhatnagar, T.; Baratti, J. C. Colorimetric assays for quantitative analysis and screening of epoxide hydrolase activity. Biotechnol. Lett. 2005, 27, 1921−1927. (327) Kahakeaw, D.; Reetz, M. T. A cell-based adrenaline assay for automated high-throughput activity screening of epoxide hydrolases. Chem. - Asian J. 2008, 3, 233−238. (328) Grinberg, A.; Bernhardt, R. Structural and functional consequences of substitutions at the Pro108-Arg14 hydrogen bond in bovine adrenodoxin. Biochem. Biophys. Res. Commun. 1998, 249, 933− 937. (329) Almog, O.; Gallagher, D. T.; Ladner, J. E.; Strausberg, S.; Alexander, P.; Bryan, P.; Gilliland, G. L. Structural basis of thermostability. Analysis of stabilizing mutations in subtilisin BPN. J. Biol. Chem. 2002, 277, 27553−27558. (330) Gumulya, Y. Doctoral Thesis, Ruhr-Universität Bochum, Germany, 2010. (331) Dellus-Gur, E.; Toth-Petroczy, A.; Elias, M.; Tawfik, D. S. What makes a protein fold amenable to functional innovation? Fold polarity and stability trade-offs. J. Mol. Biol. 2013, 425, 2609−2621. (332) Gumulya, Y.; Sanchis, J.; Reetz, M. T. Many pathways in laboratory evolution can lead to improved enzymes: How to escape from local minima. ChemBioChem 2012, 13, 1060−1066. (333) Eigen, M.; McCaskill, J.; Schuster, P. Molecular quasi-species. J. Phys. Chem. 1988, 92, 6881−6891.

(295) Li, A.; Acevedo-Rocha, C. G.; Sun, Z.; Cox, T.; Xu, J. L.; Reetz, M. T. Beating bias in the directed evolution of proteins: Combining high-fidelity on-chip solid-phase gene synthesis with efficient gene assembly for combinatorial library construction. ChemBioChem 2018, 19, 221−228. (296) Reetz, M. T. What are the limitations of enzymes in synthetic organic chemistry? Chem. Rec. 2016, 16, 2449−2459. (297) Clouthier, C. M.; Kayser, M. M.; Reetz, M. T. Designing new Baeyer-Villiger monooxygenases using restricted CASTing. J. Org. Chem. 2006, 71, 8431−8437. (298) Sun, Z.; Wikmark, Y.; Bäckvall, J.-E.; Reetz, M. T. New concepts for increasing the efficiency in directed evolution of stereoselective enzymes. Chem. - Eur. J. 2016, 22, 5046−5054. (299) Sun, Z.; Lonsdale, R.; Kong, X.-D.; Xu, J.-H.; Zhou, J.; Reetz, M. T. Reshaping an enzyme binding pocket for enhanced and inverted stereoselectivity: Use of smallest amino acid alphabets in directed evolution. Angew. Chem., Int. Ed. 2015, 54, 12410−12415. (300) Sun, Z.; Lonsdale, R.; Wu, L.; Li, G.; Li, A.; Wang, J.; Zhou, J.; Reetz, M. T. Structure-guided triple-code saturation mutagenesis: Efficient tuning of the stereoselectivity of an epoxide hydrolase. ACS Catal. 2016, 6, 1590−1597. (301) Sun, Z.; Lonsdale, R.; Ilie, A.; Li, G.; Zhou, J.; Reetz, M. T. Catalytic Asymmetric Reduction of Difficult-to-Reduce Ketones: Triple Code Saturation Mutagenesis of an Alcohol Dehydrogenase. ACS Catal. 2016, 6, 1598−1605. (302) Sun, Z.; Lonsdale, R.; Li, G. Y.; Reetz, M. T. Comparing different strategies in directed evolution of enzyme stereoselectivity: Single- versus double-code saturation mutagenesis. ChemBioChem 2016, 17, 1865−1872. (303) Li, G.; Zhang, H.; Sun, Z.; Liu, X.; Reetz, M. T. Multiparameter optimization in directed evolution: Engineering thermostability, enantioselectivity, and activity of an epoxide hydrolase. ACS Catal. 2016, 6, 3679−3687. (304) Wang, X.; Lin, H.; Zheng, Y.; Feng, J.; Yang, Z.; Tang, L. 
MDCAnalyzer-facilitated combinatorial strategy for improving the activity and stability of halohydrin dehalogenase from Agrobacterium radiobacter AD1. J. Biotechnol. 2015, 206, 1−7.
(305) Li, X.-Q.; Wu, Q.; Hu, D.; Wang, R.; Liu, Y.; Wu, M.-C.; Li, J. F. Improving the temperature characteristics and catalytic efficiency of a mesophilic xylanase from Aspergillus oryzae, AoXyn11A, by iterative mutagenesis based on in silico design. AMB Express 2017, 7, 97−108.
(306) Gao, S. H.; Zhu, S. Z.; Huang, R.; Li, H. X.; Wang, H.; Zheng, G. J. Engineering the enantioselectivity and thermostability of a (+)-gamma-lactamase from Microbacterium hydrocarbonoxydans for kinetic resolution of Vince lactam (2-azabicyclo[2.2.1]hept-5-en-3-one). Appl. Environ. Microbiol. 2018, 84, e01780.
(307) Parra, L. P.; Agudo, R.; Reetz, M. T. Directed evolution using iterative saturation mutagenesis based on multi-residue sites. ChemBioChem 2013, 14, 2301−2309.
(308) Acevedo-Rocha, C. G.; Kille, S.; Reetz, M. T. Iterative saturation mutagenesis: A powerful approach to engineer proteins by simulating Darwinian evolution. In Directed evolution library creation: Methods and protocols, 2nd ed.; Ackerley, D., Copp, J., Gillam, E., Eds.; Humana Press: Totowa, 2014; pp 103−128.
(309) Bommarius, A. S. Biocatalysis: A Status Report. Annu. Rev. Chem. Biomol. Eng. 2015, 6, 319−345.
(310) Zhang, X. J.; Baase, W. A.; Shoichet, B. K.; Wilson, K. P.; Matthews, B. W. Enhancement of protein stability by the combination of point mutations in T4 lysozyme is additive. Protein Eng., Des. Sel. 1995, 8, 1017−1022.
(311) Wunderlich, M.; Martin, A.; Staab, C. A.; Schmid, F. X. Evolutionary protein stabilization in comparison with computational design. J. Mol. Biol. 2005, 351, 1160−1168.
(312) van Pouderoyen, G.; Eggert, T.; Jaeger, K. E.; Dijkstra, B. W. The crystal structure of Bacillus subtilis lipase: A minimal alpha/beta hydrolase fold enzyme. J. Mol. Biol. 2001, 309, 215−226.
(313) Reetz, M. T.; Soni, P.; Acevedo, J. P.; Sanchis, J. Creation of an amino acid network of structurally coupled residues in the directed

DOI: 10.1021/acs.chemrev.8b00290 Chem. Rev. XXXX, XXX, XXX−XXX


(334) Emren, L. O.; Kurtovic, S.; Runarsdottir, A.; Larsson, A. K.; Mannervik, B. Functionally diverging molecular quasi-species evolve by crossing two enzymes. Proc. Natl. Acad. Sci. U. S. A. 2006, 103, 10866−10870.
(335) Kurtovic, S.; Mannervik, B. Identification of emerging quasi-species in directed enzyme evolution. Biochemistry 2009, 48, 9330−9339.
(336) Mannervik, B.; Runarsdottir, A. The quest for molecular quasi-species in ligand-activity space and its application to directed enzyme evolution. FEBS Lett. 2010, 584, 2565−2571.
(337) Runarsdottir, A.; Mannervik, B. A novel quasi-species of glutathione transferase with high activity towards naturally occurring isothiocyanates evolves from promiscuous low-activity variants. J. Mol. Biol. 2010, 401, 451−464.
(338) Gupta, R. D.; Tawfik, D. S. Directed enzyme evolution via small and effective neutral drift libraries. Nat. Methods 2008, 5, 939−942.
(339) Smith, W. S.; Hale, J. R.; Neylon, C. Applying neutral drift to the directed molecular evolution of a beta-glucuronidase into a beta-galactosidase: Two different evolutionary pathways lead to the same variant. BMC Res. Notes 2011, 4, 138.
(340) Steipe, B.; Schiller, B.; Pluckthun, A.; Steinbacher, S. Sequence statistics reliably predict stabilizing mutations in a protein domain. J. Mol. Biol. 1994, 240, 188−192.
(341) Steipe, B. Consensus-based engineering of protein stability: From intrabodies to thermostable enzymes. Methods Enzymol. 2004, 388, 176−186.
(342) Lehmann, M.; Loch, C.; Middendorf, A.; Studer, D.; Lassen, S. F.; Pasamontes, L.; van Loon, A.; Wyss, M. The consensus concept for thermostability engineering of proteins: Further proof of concept. Protein Eng., Des. Sel. 2002, 15, 403−411.
(343) Polizzi, K. M.; Chaparro-Riggers, J. F.; Vazquez-Figueroa, E.; Bommarius, A. S. Structure-guided consensus approach to create a more thermostable penicillin G acylase. Biotechnol. J. 2006, 1, 531−536.
(344) Amin, N.; Liu, A. D.; Ramer, S.; Aehle, W.; Meijer, D.; Metin, M.; Wong, S.; Gualfetti, P.; Schellenberger, V. Construction of stabilized proteins by combinatorial consensus mutagenesis. Protein Eng., Des. Sel. 2004, 17, 787−793.
(345) Jochens, H.; Aerts, D.; Bornscheuer, U. T. Thermostabilization of an esterase by alignment-guided focussed directed evolution. Protein Eng., Des. Sel. 2010, 23, 903−909.
(346) Huang, P. S.; Ban, Y. E. A.; Richter, F.; Andre, I.; Vernon, R.; Schief, W. R.; Baker, D. RosettaRemodel: A generalized framework for flexible backbone protein design. PLoS One 2011, 6, No. e24109.
(347) Kiss, G.; Celebi-Olcum, N.; Moretti, R.; Baker, D.; Houk, K. N. Computational enzyme design. Angew. Chem., Int. Ed. 2013, 52, 5700−5725.
(348) Dehouck, Y.; Kwasigroch, J. M.; Gilis, D.; Rooman, M. PoPMuSiC 2.1: a web server for the estimation of protein stability changes upon mutation and sequence optimality. BMC Bioinf. 2011, 12, 151.
(349) Zhang, S. B.; Wu, Z. L. Identification of amino acid residues responsible for increased thermostability of feruloyl esterase A from Aspergillus niger using the PoPMuSiC algorithm. Bioresour. Technol. 2011, 102, 2093−2096.
(350) Damborsky, J.; Brezovsky, J. Computational tools for designing and engineering biocatalysts. Curr. Opin. Chem. Biol. 2009, 13, 26−34.
(351) Parthiban, V.; Gromiha, M. M.; Schomburg, D. CUPSAT: Prediction of protein stability upon point mutations. Nucleic Acids Res. 2006, 34, W239−W242.
(352) Chovancova, E.; Pavelka, A.; Benes, P.; Strnad, O.; Brezovsky, J.; Kozlikova, B.; Gora, A.; Sustr, V.; Klvana, M.; Medek, P. CAVER 3.0: A tool for the analysis of transport pathways in dynamic protein structures. PLoS Comput. Biol. 2012, 8, No. e1002708.
(353) Yang, J.; Li, L. Z.; Xiao, Y. Z.; Li, J.; Long, L. J.; Wang, F. Z.; Zhang, S. Identification and thermoadaptation engineering of thermostability conferring residue of deep sea bacterial alpha-amylase AMY121. J. Mol. Catal. B: Enzym. 2016, 126, 56−63.
(354) Song, L.; Tsang, A.; Sylvestre, M. Engineering a thermostable fungal GH10 xylanase, importance of N-terminal amino acids. Biotechnol. Bioeng. 2015, 112, 1081−1091.
(355) Wang, X.; Han, S.; Yang, Z.; Tang, L. Improvement of the thermostability and activity of halohydrin dehalogenase from Agrobacterium radiobacter AD1 by engineering C-terminal amino acids. J. Biotechnol. 2015, 212, 92−98.
(356) Martin, A.; Sieber, V.; Schmid, F. X. In-vitro selection of highly stabilized protein variants with optimized surface. J. Mol. Biol. 2001, 309, 717−726.
(357) Dumon, C.; Varvak, A.; Wall, M. A.; Flint, J. E.; Lewis, R. J.; Lakey, J. H.; Morland, C.; Luginbühl, P.; Healey, S.; Todaro, T.; et al. Engineering hyperthermostability into a GH11 xylanase is mediated by subtle changes to protein structure. J. Biol. Chem. 2008, 283, 22557−22564.
(358) Palackal, N.; Brennan, Y.; Callen, W. N.; Dupree, P.; Frey, G.; Goubet, F.; Hazlewood, G. P.; Healey, S.; Kang, Y. E.; Kretz, K. A.; et al. An evolutionary route to xylanase process fitness. Protein Sci. 2004, 13, 494−503.
(359) Garrett, J. B.; Kretz, K. A.; O’Donoghue, E.; Kerovuo, J.; Kim, W.; Barton, N. R.; Hazlewood, G. P.; Short, J. M.; Robertson, D. E.; Gray, K. A. Enhancing the thermal tolerance and gastric performance of a microbial phytase for use as a phosphate-mobilizing monogastric-feed supplement. Appl. Environ. Microbiol. 2004, 70, 3041−3046.
(360) Gray, K. A.; Richardson, T. H.; Kretz, K.; Short, J. M.; Bartnek, F.; Knowles, R.; Kan, L.; Swanson, P. E.; Robertson, D. E. Rapid evolution of reversible denaturation and elevated melting temperature in a microbial haloalkane dehalogenase. Adv. Synth. Catal. 2001, 343, 607−617.
(361) Yin, X.; Li, J. F.; Wang, C. J.; Hu, D.; Wu, Q.; Gu, Y.; Wu, M. C. Improvement in the thermostability of a type A feruloyl esterase, AuFaeA, from Aspergillus usamii by iterative saturation mutagenesis. Appl. Microbiol. Biotechnol. 2015, 99, 10047−10056.
(362) Abokitse, K.; Wu, M. Q.; Bergeron, H.; Grosse, S.; Lau, P. C. K. Thermostable feruloyl esterase for the bioproduction of ferulic acid from triticale bran. Appl. Microbiol. Biotechnol. 2010, 87, 195−203.
(363) MODELLER Website; http://salilab.org.modeller (Accessed Dec 4, 2018).
(364) Sanchis, J.; Fernández, L.; Carballeira, J. D.; Drone, J.; Gumulya, Y.; Höbenreich, H.; Kahakeaw, D.; Kille, S.; Lohmer, R.; Peyralans, J.P.; et al. Improved PCR method for the creation of saturation mutagenesis libraries in directed evolution: application to difficult-to-amplify templates. Appl. Microbiol. Biotechnol. 2008, 81, 387−397.
(365) Fernández, L.; Jiao, N.; Soni, P.; Gumulya, Y.; de Oliveira, L. G.; Reetz, M. T. An efficient method for mutant library creation in Pichia pastoris useful in directed evolution. Biocatal. Biotransform. 2010, 28, 122−129.
(366) Larsen, D. M.; Nyffenegger, C.; Swiniarska, M. M.; Thygesen, A.; Strube, M. L.; Meyer, A. S.; Mikkelsen, J. D. Thermostability enhancement of an endo-1,4-beta-galactanase from Talaromyces stipitatus by site-directed mutagenesis. Appl. Microbiol. Biotechnol. 2015, 99, 4245−4253.
(367) van der Meer, J. Y.; Biewenga, L.; Poelarends, G. J. The generation and exploitation of protein mutability landscapes for enzyme engineering. ChemBioChem 2016, 17, 1792−1799.
(368) Acevedo-Rocha, C. G.; Ferla, M.; Reetz, M. T. Directed evolution of proteins based on mutational scanning. Methods Mol. Biol. 2018, 1685, 87−128.
(369) Chokhawala, H. A.; Roche, C. M.; Kim, T. W.; Atreya, M. E.; Vegesna, N.; Dana, C. M.; Blanch, H. W.; Clark, D. S. Mutagenesis of Trichoderma reesei endoglucanase I: impact of expression host on activity and stability at elevated temperatures. BMC Biotechnol. 2015, 15, 11.
(370) Boehlein, S. K.; Shaw, J. R.; Stewart, J. D.; Sullivan, B.; Hannah, L. C. Enhancing the heat stability and kinetic parameters of the maize endosperm ADP-glucose pyrophosphorylase using iterative saturation mutagenesis. Arch. Biochem. Biophys. 2015, 568, 28−37.


(371) Greene, T. W.; Hannah, L. C. Maize endosperm ADP-glucose pyrophosphorylase SHRUNKEN2 and BRITTLE2 subunit interactions. Plant Cell 1998, 10, 1295−1306.
(372) Hannah, L. C.; Futch, B.; Bing, J.; Shaw, J. R.; Boehlein, S.; Stewart, J. D.; Beiriger, R.; Georgelis, N.; Greene, T. A shrunken-2 transgene increases maize yield by acting in maternal tissues to increase the frequency of seed development. Plant Cell 2012, 24, 2352−2363.
(373) Jin, X. S.; Ballicora, M. A.; Preiss, J.; Geiger, J. H. Crystal structure of potato tuber ADP-glucose pyrophosphorylase. EMBO J. 2005, 24, 694−704.
(374) Gratz, A.; Jose, J. Protein domain library generation by overlap extension (PDLGO): A tool for enzyme engineering. Anal. Biochem. 2008, 378, 171−176.
(375) Acevedo, J. P.; Reetz, M. T.; Asenjo, J. A.; Parra, L. P. One-step combined focused epPCR and saturation mutagenesis for thermostability evolution of a new cold-active xylanase. Enzyme Microb. Technol. 2017, 100, 60−70.
(376) Tokuriki, N.; Jackson, C. J.; Afriat-Jurnou, L.; Wyganowski, K. T.; Tang, R. M.; Tawfik, D. S. Diminishing returns and tradeoffs constrain the laboratory optimization of an enzyme. Nat. Commun. 2012, 3, 1257.
(377) Tawfik, D. S. Accuracy-rate tradeoffs: How do enzymes meet demands of selectivity and catalytic efficiency? Curr. Opin. Chem. Biol. 2014, 21, 73−80.
(378) Miton, C. M.; Tokuriki, N. How mutational epistasis impairs predictability in protein evolution and design. Protein Sci. 2016, 25, 1260−1272.
(379) Studer, R. A.; Christin, P. A.; Williams, M. A.; Orengo, C. A. Stability-activity tradeoffs constrain the adaptive evolution of RubisCO. Proc. Natl. Acad. Sci. U. S. A. 2014, 111, 2223−2228.
(380) Bougioukou, J. D.; Kille, S.; Taglieber, A.; Reetz, M. Directed evolution of an enantioselective enoate-reductase: testing the utility of iterative saturation mutagenesis. Adv. Synth. Catal. 2009, 351, 3287−3305.
(381) Sun, Z.; Wu, L.; Bocola, M.; Chan, H.; Lonsdale, R.; Kong, X. D.; Yuan, S.; Zhou, J.; Reetz, M. T. Structural and computational insight into the catalytic mechanism of limonene epoxide hydrolase mutants in stereoselective transformations. J. Am. Chem. Soc. 2018, 140, 310−318.
(382) Sun, Z.; Salas, P. T.; Siirola, E.; Lonsdale, R.; Reetz, M. T. Exploring productive sequence space in directed evolution using binary patterning versus conventional mutagenesis strategies. Bioresour. Bioprocess. 2016, 3, 44.
(383) Arand, M.; Hallberg, B. M.; Zou, J.; Bergfors, T.; Oesch, F.; Werf, M. J. V. D.; Bont, J. A. M. D.; Jones, T. A.; Mowbray, S. L. Structure of Rhodococcus erythropolis limonene-1,2-epoxide hydrolase reveals a novel active site. EMBO J. 2003, 22, 2583−2592.
(384) Wijma, H. J.; Floor, R. J.; Jekel, P. A.; Baker, D.; Marrink, S. J.; Janssen, D. B. Computationally designed libraries for rapid enzyme stabilization. Protein Eng., Des. Sel. 2014, 27, 49−58.
(385) Floor, R. J.; Wijma, H. J.; Colpa, D. I.; Ramos-Silva, A.; Jekel, P. A.; Szymański, W.; Feringa, B. L.; Marrink, S. J.; Janssen, D. B. Computational library design for increasing haloalkane dehalogenase stability. ChemBioChem 2014, 15, 1660−1672.
(386) Kellogg, E. H.; Leaver-Fay, A.; Baker, D. Role of conformational sampling in computing mutation-induced changes in protein structure and stability. Proteins: Struct., Funct., Genet. 2011, 79, 830−838.
(387) Guerois, R.; Nielsen, J. E.; Serrano, L. Predicting changes in the stability of proteins and protein complexes: a study of more than 1000 mutations. J. Mol. Biol. 2002, 320, 369−387.
(388) Buss, O.; Rudat, J.; Ochsenreither, K. FoldX as protein engineering tool: Better than random based approaches? Comput. Struct. Biotechnol. J. 2018, 16, 25−33.
(389) Floor, R. J.; Wijma, H. J.; Jekel, P. A.; Terwisscha van Scheltinga, A. C.; Dijkstra, B. W.; Janssen, D. B. X-ray crystallographic validation of structure predictions used in computational design for protein stabilization. Proteins: Struct., Funct., Genet. 2015, 83, 940−951.
(390) Qu, G.; Lonsdale, R.; Yao, P.; Li, G.; Liu, B.; Reetz, M. T.; Sun, Z. Methodology development in directed evolution: exploring options when applying triple-code saturation mutagenesis. ChemBioChem 2018, 19, 239−246.
(391) Zhou, Z.; Li, M.; Xu, J.-H.; Zhang, Z.-J. A single mutation increases the activity and stability of Pectobacterium carotovorum nitrile reductase. ChemBioChem 2018, 19, 521−526.
(392) Reetz, M. T.; Soni, P.; Fernández, L.; Gumulya, Y.; Carballeira, J. D. Increasing the stability of an enzyme toward hostile organic solvents by directed evolution based on iterative saturation mutagenesis using the B-FIT method. Chem. Commun. 2010, 46, 8657−8658.
(393) Tian, K.; Tai, K.; Chua, B. J. W.; Li, Z. Directed evolution of Thermomyces lanuginosus lipase to enhance methanol tolerance for efficient production of biodiesel from waste grease. Bioresour. Technol. 2017, 245, 1491−1497.
(394) Xue, J.; Grift, T. E.; Hansen, A. C. Effect of biodiesel on engine performances and emissions. Renewable Sustainable Energy Rev. 2011, 15, 1098−1116.
(395) Canakci, M.; Sanli, H. Biodiesel production from various feedstocks and their effects on the fuel properties. J. Ind. Microbiol. Biotechnol. 2008, 35, 431−441.
(396) Vahidi, A. K.; Yang, Y.; Ngo, T. P. N.; Li, Z. Simple and efficient immobilization of extracellular his-tagged enzyme directly from cell culture supernatant as active and recyclable nanobiocatalyst: High-performance production of biodiesel from waste grease. ACS Catal. 2015, 5, 3157−3161.
(397) Tian, K.; Li, Z. High-yielding, one-pot, and green production of biodiesel from waste grease using wet cells of a recombinant Escherichia coli strain as catalyst. Biochem. Eng. J. 2016, 115, 30−37.
(398) Wehrmann, M.; Klebensberger, J. Engineering thermal stability and solvent tolerance of the soluble quinoprotein PedE from Pseudomonas putida KT2440 with a heterologous whole-cell screening approach. Microb. Biotechnol. 2018, 11, 399−408.
(399) Kuipers, R. K.; Joosten, H. J.; van Berkel, W. J.; Leferink, N. G.; Rooijen, E.; Ittmann, E.; van Zimmeren, F.; Jochens, H.; Bornscheuer, U.; Vriend, G.; et al. 3DM: systematic analysis of heterogeneous superfamily data to discover protein functionalities. Proteins: Struct., Funct., Genet. 2010, 78, 2101−2113.
(400) Yan, G.; Cheng, S.; Zhao, G.; Wu, S.; Liu, Y.; Sun, W. A single residual replacement improves the folding and stability of recombinant cassava hydroxynitrile lyase in E. coli. Biotechnol. Lett. 2003, 25, 1041−1047.
(401) Huang, J.; Jones, B.; Kazlauskas, R. J. Stabilization of an α/β-hydrolase by introducing proline residues: Salicylic binding protein 2 from tobacco. Biochemistry 2015, 54, 4330−4341.
(402) Zhang, X.-F.; Yang, G.-Y.; Zhang, Y.; Xie, Y.; Withers, S. G.; Feng, Y. A general and efficient strategy for generating the stable enzymes. Sci. Rep. 2016, 6, 33797−33798.
(403) Li, G.; Maria-Solano, M. A.; Romero-Rivera, A.; Osuna, S.; Reetz, M. T. Inducing high activity of a thermophilic enzyme at ambient temperatures by directed evolution. Chem. Commun. 2017, 53, 9454−9457.
(404) Xie, Y.; An, J.; Yang, G.; Wu, G.; Zhang, Y.; Cui, L.; Feng, Y. Enhanced enzyme kinetic stability by increasing rigidity within the active site. J. Biol. Chem. 2014, 289, 7994−8006.
(405) Engel, S.; Hock, H.; Bocola, M.; Keul, H.; Schwaneberg, U.; Moller, M. CaLB catalyzed conversion of epsilon-caprolactone in aqueous medium. Part 1: immobilization of CaLB to microgels. Polymers 2016, 8, 372.
(406) Sen, S.; Puskas, J. E. Green polymer chemistry: Enzyme catalysis for polymer functionalization. Molecules 2015, 20, 9358−9379.
(407) Gross, R. A.; Ganesh, M.; Lu, W. Enzyme-catalysis breathes new life into polyester condensation polymerizations. Trends Biotechnol. 2010, 28, 435−443.
(408) Grochulski, P.; Li, Y.; Schrag, J. D.; Bouthillier, F.; Smith, P.; Harrison, D.; Rubin, B.; Cygler, M. Insights into interfacial activation from an open structure of Candida rugosa lipase. J. Biol. Chem. 1993, 268, 12843−12847.
(409) Grochulski, P.; Li, Y.; Schrag, J. D.; Cygler, M. Two conformational states of Candida rugosa lipase. Protein Sci. 1994, 3, 82−91.


(410) de Brevern, A. G.; Bornot, A.; Craveur, P.; Etchebest, C.; Gelly, J. C. PredyFlexy: Flexibility and local structure prediction from sequence. Nucleic Acids Res. 2012, 40, W317−W322.
(411) Zhang, J. H.; Lin, Y.; Sun, Y. F.; Ye, Y. R.; Zheng, S. P.; Han, S. Y. High-throughput screening of B factor saturation mutated Rhizomucor miehei lipase thermostability based on synthetic reaction. Enzyme Microb. Technol. 2012, 50, 325−330.
(412) Hanapi, W. N. W.; Iuan-Sheau, C.; Mahadi, N. M.; Murad, A. M. A.; Bakar, F. D. A. Site-saturation mutagenesis of Glomerella cingulata cutinase gene for enhanced enzyme thermostability. AIP Conf. Proc. 2015, 1678, 030021.
(413) Heath, R. S.; Birmingham, W. R.; Thompson, M. P.; Taglieber, A.; Daviet, L.; Turner, N. J. An engineered alcohol oxidase for the oxidation of primary alcohols. ChemBioChem 2018, DOI: 10.1002/cbic.201800556.
(414) Damnjanović, J.; Takahashi, R.; Suzuki, A.; Nakano, H.; Iwasaki, Y. Improving thermostability of phosphatidylinositol-synthesizing Streptomyces phospholipase D. Protein Eng., Des. Sel. 2012, 25, 415−424.
(415) Roth, T.; Beer, B.; Pick, A.; Sieber, V. Thermostabilization of the uronate dehydrogenase from Agrobacterium tumefaciens by semirational design. AMB Express 2017, 7, 103.
(416) Reetz, M. T.; Soni, P.; Fernandez, L. Knowledge-guided laboratory evolution of protein thermolability. Biotechnol. Bioeng. 2009, 102, 1712−1717.
(417) Cesarini, S.; Bofill, C.; Pastor, F. I. J.; Reetz, M. T.; Diaz, P. A thermostable variant of P. aeruginosa cold-adapted LipC obtained by rational design and saturation mutagenesis. Process Biochem. 2012, 47, 2064−2071.
(418) Feller, G.; Gerday, C. Psychrophilic enzymes: Hot topics in cold adaptation. Nat. Rev. Microbiol. 2003, 1, 200−208.
