Review pubs.acs.org/CR
Transparent Window Vibrational Probes for the Characterization of Proteins With High Structural and Temporal Resolution
Ramkrishna Adhikary, Jörg Zimmermann, and Floyd E. Romesberg*
Department of Chemistry, The Scripps Research Institute, La Jolla, California 92037, United States
ABSTRACT: Vibrational spectroscopy provides a direct route to the physicochemical characterization of molecules. While both IR and Raman spectroscopy have been used for decades to provide detailed characterizations of small molecules, similar studies with proteins are largely precluded due to spectral congestion. However, the vibrational spectra of proteins do include a “transparent window”, between ∼1800 and ∼2500 cm−1, and progress is now being made to develop site-specifically incorporated carbon−deuterium (C−D), cyano (CN), thiocyanate (SCN), and azide (N3) “transparent window vibrational probes” that absorb within this window and report on their environment to facilitate the characterization of proteins with small molecule-like detail. This Review opens with a brief discussion of the advantages and limitations of conventional vibrational spectroscopy and then discusses the strengths and weaknesses of the different transparent window vibrational probes, methods by which they may be site-specifically incorporated into peptides and proteins, and the physicochemical properties they may be used to study, including electrostatics, stability and folding, hydrogen bonding, protonation, solvation, dynamics, and interactions with inhibitors. The use of the probes to vibrationally image proteins and other biomolecules within cells is also discussed. We then present four case studies, focused on ketosteroid isomerase, the SH3 domain, dihydrofolate reductase, and cytochrome c, where the transparent window vibrational probes have already been used to elucidate important aspects of protein structure and function. The Review concludes by highlighting the current challenges and future potential of using transparent window vibrational probes to understand the evolution and function of proteins and other biomolecules.
CONTENTS
1. Introduction
2. Utility of Bond- and Site-Specific Vibrational Spectroscopy
3. Overview of Transparent Window Vibrational Probes
  3.1. C−D Probes
  3.2. CN and SCN Probes
  3.3. N3 Probes
  3.4. Other Transparent Window Vibrational Probes
4. Incorporation of Probes into Peptides and Proteins
  4.1. Synthetic and Semisynthetic Incorporation
    4.1.1. Solid-Phase Peptide Synthesis
    4.1.2. Native Chemical Ligation
    4.1.3. Expressed Protein Ligation
    4.1.4. Noncovalent Association and Conformationally Assisted Ligation
  4.2. Recombinant Expression
    4.2.1. Chemically Defined Growth Medium
    4.2.2. Amber Suppression
  4.3. Post-translational Modification
5. Applications of Infrared Probes
  5.1. Electrostatics
  5.2. Stability and Folding
  5.3. Structure and Conformational Heterogeneity
  5.4. H-bonding, Protonation, and Solvation
  5.5. Nonlinear IR Spectroscopy and Protein Dynamics
  5.6. Protein−Inhibitor Interactions
  5.7. Vibrational Imaging
6. Case Studies
  6.1. Ketosteroid Isomerase
  6.2. SH3 Domains
  6.3. Dihydrofolate Reductase
  6.4. Cytochrome c
7. Concluding Remarks
Author Information
  Corresponding Author
  ORCID
  Notes
  Biographies
Acknowledgments
References
© XXXX American Chemical Society
Received: September 11, 2016
DOI: 10.1021/acs.chemrev.6b00625 Chem. Rev. XXXX, XXX, XXX−XXX
Chemical Reviews
Review
1. INTRODUCTION Chemical bonds between constituent atoms are the fundamental units of all molecules, and their nature largely determines the molecule’s static and dynamic properties. Thus, in a very real sense, the ultimate goal of any effort to understand a molecule is to understand the detailed nature of its constituent bonds. Moreover, the nature of an individual bond is sensitive to its local environment, and thus its characterization provides a window into that environment. The most direct approach to the study of bonds within molecules is vibrational spectroscopy, which reports directly on the bonds themselves. Moreover, molecules are not static and often interconvert between different states on a wide range of timescales, and the inherently fast timescale of vibrational spectroscopy allows for the resolution and characterization of even the most rapidly interconverting states. Indeed, both infrared (IR) and Raman vibrational spectroscopies have been used for decades with small molecules to generate a detailed understanding of molecular and electronic structure, solvation, dynamics, and interactions with other molecules.1 These small molecule studies are possible because the limited number of bonds within a small molecule gives rise to well-dispersed spectral features, which may thus be individually observed and characterized. While the characterization of proteins is of particular interest, because they are the products of evolution and the mediators of biological activity, they are generally not amenable to traditional vibrational spectroscopy due to their large number of similar bonds that give rise to overlapping spectral features and which thus preclude the observation or characterization of any single vibration. Protein IR and Raman spectra, however, do possess a spectral region, between ∼1800 and ∼2500 cm−1, that is free of native signals, which we have referred to as the “transparent window”2,3 (Figure 1). This has led to interest in developing
acid residues and are extrinsic probes. After a brief review of the advantages and limitations of conventional vibrational spectroscopy, this Review will discuss the transparent window vibrational probes that enable the application of the technique to the study of proteins, the methods used to incorporate them into peptides and proteins, their utility for the characterization of various properties of peptides and proteins, and, finally, case studies of specific proteins for which interesting stories have emerged. The majority of the Review focuses on the use of IR spectroscopy, reflecting the majority of the studies that have been reported to date; however, there are an increasing number of reports using Raman spectroscopy, and these studies are included where appropriate.
2. UTILITY OF BOND- AND SITE-SPECIFIC VIBRATIONAL SPECTROSCOPY
Vibrational spectroscopy is an invaluable tool for the characterization of a molecule because the frequency, line shape, intensity, and number of vibrational absorptions of a given bond can be related to its local structure, environment(s), and dynamics. The frequency of a single stretch vibration, while mainly determined by the nature of the bond, also depends on the local electric field and specific noncovalent interactions such as hydrogen bonding (H-bonding). Therefore, the vibrational frequency of a given bond is different in different solvent environments (Figure 2), an effect referred to as solvatochromism.4,5 Solvatochromic frequency shifts are usually small (on the scale of a few wavenumbers), but well within the resolution of IR or Raman spectrometers, and are thus easily detected. The line shape of a vibrational absorption band is determined by homogeneous and inhomogeneous contributions, resulting from relaxation and static inhomogeneity (local heterogeneity) of the bond’s microenvironment, respectively. To a good approximation, the line shape may be described by the so-called pseudo-Voigt function, which is the sum of a Lorentzian function, representing homogeneous broadening, and a Gaussian function, representing inhomogeneous broadening:

$$A(\tilde{\nu}) = \frac{m}{\pi}\,\frac{\sigma}{(\tilde{\nu}-\tilde{\nu}_0)^2+\sigma^2} + \frac{1-m}{\sqrt{2\pi}\,\sigma}\,e^{-(\tilde{\nu}-\tilde{\nu}_0)^2/2\sigma^2} \tag{1}$$

where $\tilde{\nu}_0$ is the center frequency, $\sigma^2$ is the variance of the Gaussian contribution, and m is a parameter describing the degree of Lorentzian character of the absorption band. If m = 0, the absorption band is purely Gaussian, i.e., line broadening is dominated by static inhomogeneity; if m = 1, the absorption band is purely Lorentzian, i.e., homogeneous line broadening determines the line shape. The presence of multiple absorptions resulting from a single bond is evidence that the bond experiences multiple, distinct environments and is thus evidence that the molecule simultaneously populates multiple, distinct states. Correspondingly, the frequencies and line widths of the individual absorptions provide information about the nature of the individual states. While the intensity of a vibrational absorption is dependent on the polarity (for absorption spectroscopy) or polarizability (for Raman spectroscopy) of the bond, these factors typically depend only minimally on the environment, and thus when multiple absorptions are observed for a single bond, the intensities are at least roughly proportional to the relative population of the corresponding states (however, there are notable exceptions, especially if H-bonding is involved,
Figure 1. Typical protein IR absorption spectrum indicating the position of the “transparent window”.
“transparent window vibrational probes”, i.e., any protein modification that provides signals within this spectral region. To date, the probes most extensively examined, and which are thus the primary focus of this Review, include carbon−deuterium (C−D) bonds, which when used to replace C−H bonds may be considered intrinsic probes as they are not expected to alter the potential energy surface or the properties of the protein, as well as cyano (CN), thiocyanate (SCN), and azide (N3) moieties, which may be attached to individual amino
vibrations are at least in part normal modes that are delocalized over multiple residues. Protein side chains have characteristic absorptions between ∼1000 cm−1 (e.g., the C−C stretches of Pro, Trp, or Ser) and ∼1800 cm−1 (e.g., C=O stretches in Asp, Glu, and Asn).10 The only characteristic side-chain absorptions above 1800 cm−1 are the S−H stretches of Cys, which occur around 2550 cm−1.10,11 Thus, the majority of side-chain and backbone vibrations fall into the same crowded region of the IR or Raman spectrum. Isotope labeling may be used to at least partially alleviate these challenges. For instance, site-specific labeling of backbone carbonyls (e.g., 13C=18O) shifts the given C=O vibrations by up to 100 cm−1 into a less crowded spectral area at the onset of the amide I band. Fourier transform infrared (FTIR) difference spectroscopy can then be used to detect the vibration, but this usually requires the subtraction of two large numbers and is thus inherently experimentally challenging. Nonetheless, several groups have successfully used this approach to study protein structure and dynamics.12−18 However, the use of transparent window vibrational probes better circumvents these issues by allowing for the largely background-free characterization of vibrations at specific sites within a protein.
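As an illustration, the pseudo-Voigt line shape of eq 1 may be evaluated numerically. The following Python sketch is not taken from any of the studies reviewed here; it assumes that both components are area-normalized and share the single width parameter σ, and the function name `pseudo_voigt` and the example band parameters are hypothetical.

```python
import math


def pseudo_voigt(nu, nu0, sigma, m):
    """Pseudo-Voigt line shape: m * Lorentzian + (1 - m) * Gaussian.

    nu, nu0, and sigma are in cm^-1; m (0 to 1) is the Lorentzian fraction.
    Assumption of this sketch: both components are area-normalized and
    share the same width parameter sigma.
    """
    if not 0.0 <= m <= 1.0:
        raise ValueError("m must lie between 0 and 1")
    d = nu - nu0
    lorentzian = sigma / (math.pi * (d * d + sigma * sigma))
    gaussian = math.exp(-d * d / (2.0 * sigma * sigma)) / (sigma * math.sqrt(2.0 * math.pi))
    return m * lorentzian + (1.0 - m) * gaussian


# Hypothetical band centered at 2100 cm^-1 with sigma = 5 cm^-1.
for m in (0.0, 0.5, 1.0):
    peak = pseudo_voigt(2100.0, 2100.0, 5.0, m)
    wing = pseudo_voigt(2120.0, 2100.0, 5.0, m)
    print(f"m = {m:.1f}: peak = {peak:.4f}, wing/peak = {wing / peak:.4f}")
```

Setting m to 0 or 1 recovers the purely Gaussian or purely Lorentzian limits; the slowly decaying wings of the Lorentzian component are what distinguish homogeneous from inhomogeneous broadening in a fit.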
3. OVERVIEW OF TRANSPARENT WINDOW VIBRATIONAL PROBES Every probe, vibrational or otherwise, has its strengths and weaknesses. For the transparent window vibrational probes, there are several important aspects to consider. (i) The probe must provide a suitable absorption in the transparent window. Even the smallest proteins have many overlapping absorptions, and the probe must absorb in the transparent region of the protein vibrational spectrum (∼1800−2500 cm−1) to permit largely background-free characterization (while features associated with water transitions are present, these are straightforward to subtract due to their broad nature). Moreover, the probe absorption should be suitably strong to allow for the characterization of peptides and proteins at micro- to lower millimolar concentrations. (ii) The probe must be stable. Once incorporated into the protein, the probe must be stable under conditions of biological interest, for example, at elevated temperatures if thermophilic proteins or thermal unfolding are of interest. (iii) The spectral features must be sensitive to their environments. The IR or Raman signal must be sensitive to the structure and/or electrostatics of the native protein, and not dominated by non-native interactions introduced by the probe itself. (iv) The probe must be nonperturbative. The probe must not significantly perturb the structure or stability of the protein. Even if the structure is not perturbed, significant destabilization may be problematic, because changes in stability may cause changes in dynamics as the destabilized protein explores more or different regions of its potential energy surface, which in turn may result in changes in the protein’s properties of interest. 
The extent to which the C−D, CN, SCN, and N3 probes generally satisfy these four requirements is discussed in detail below, and the amino acid modifications through which they are commonly introduced into peptides and proteins, and which are referred to throughout this Review, are shown in Figure 3.
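Requirement (i) can be made concrete with the Beer−Lambert law, A = ε·c·l. The sketch below is only an order-of-magnitude estimate: the extinction coefficient, the protein concentrations, and the 75 μm path length are illustrative assumptions and are not values taken from a particular study.

```python
def absorbance(epsilon, conc_molar, path_cm):
    """Beer-Lambert law: A = epsilon * c * l (epsilon in M^-1 cm^-1)."""
    return epsilon * conc_molar * path_cm


# Illustrative assumptions: a probe with epsilon = 100 M^-1 cm^-1
# measured in a cell with a 75 micrometer (0.0075 cm) path length.
EPSILON = 100.0
PATH_CM = 0.0075

for conc_mM in (0.1, 1.0, 5.0):
    a = absorbance(EPSILON, conc_mM * 1e-3, PATH_CM)
    print(f"{conc_mM:4.1f} mM -> A = {a * 1e3:.3f} mOD")
```

Under these assumptions the signals are at or below the milli-absorbance level, which is why the intrinsic absorption strength of a probe weighs so heavily in its selection.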
Figure 2. Solvatochromic IR frequency shifts of representative transparent window vibrational probes. (A) Symmetric stretch of Boc-protected (d3)Met in mixtures of water and isopropanol, as well as in methanol, ethanol, t-butanol, benzyl alcohol, tetrahydrofuran, p-dioxane, and toluene. (B) p-Tolunitrile and (C) MeSCN in different solvents with various polarities and H-bonding abilities, including cyclohexane, toluene, isopropyl alcohol, dimethylsulfoxide (DMSO), water, and formamide. Panels A−C are reproduced (with modification) with permission from refs 6 and 7. Copyright 2008 and 2015 American Chemical Society, respectively.
where the environment can significantly alter the bond dipole and thus the absorption intensity).8 The detection and quantification of multiple states is a particular advantage of vibrational spectroscopy over other techniques that have intrinsically slower timescales as it allows for the resolution of even the fastest interconverting species. Because proteins are composed of a limited range of atoms (C, H, N, O, and S) and types of bonding, a major challenge in their characterization by vibrational spectroscopy is limited spectral dispersion (Figure 1). For example, protein backbone vibrations give rise to several characteristic absorption bands below 1700 cm−1 (amide I to VII) and around 3500 cm−1 (amide A and B). The amide I band, which occurs between 1600 and 1700 cm−1, arises mainly from a combination of C=O and C−N stretch vibrations, is sensitive to the backbone conformation and H-bonding, and has thus been used extensively to analyze protein conformation. Typically, the amide I band is deconvoluted into components that are then assigned to secondary structure motifs (β-sheet, α-helix, and random coil).9 The resulting structural information is thus averaged over the entire protein, and in fact, the underlying
3.1. C−D Probes
C−D bonds were the first transparent window probes incorporated into proteins,2,19 and they may be used to replace C−H bonds at either side-chain or backbone locations.
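Why isotopic substitution moves a stretch vibration can be rationalized with a simple harmonic-oscillator estimate, in which the frequency scales with the inverse square root of the reduced mass. The Python sketch below assumes an isolated diatomic oscillator with an unchanged force constant, which is a crude approximation because real C−D and carbonyl modes are coupled and anharmonic; the starting frequencies of 2900 and 1650 cm−1 and the function names are illustrative assumptions.

```python
import math

# Rounded atomic masses in unified atomic mass units.
MASS = {"H": 1.008, "D": 2.014, "12C": 12.000, "13C": 13.003,
        "16O": 15.995, "18O": 17.999}


def reduced_mass(m1, m2):
    """Reduced mass of a two-body oscillator."""
    return m1 * m2 / (m1 + m2)


def isotope_shifted_frequency(nu, light_pair, heavy_pair):
    """Scale a stretch frequency (cm^-1) for isotopic substitution.

    Assumes a harmonic diatomic oscillator with an unchanged force
    constant, so that the frequency is proportional to 1/sqrt(mu).
    """
    mu_light = reduced_mass(MASS[light_pair[0]], MASS[light_pair[1]])
    mu_heavy = reduced_mass(MASS[heavy_pair[0]], MASS[heavy_pair[1]])
    return nu * math.sqrt(mu_light / mu_heavy)


# Illustrative C-H stretch near 2900 cm^-1 replaced by C-D.
nu_cd = isotope_shifted_frequency(2900.0, ("12C", "H"), ("12C", "D"))
print(f"Estimated C-D stretch: {nu_cd:.0f} cm^-1")

# Illustrative carbonyl stretch near 1650 cm^-1 labeled as 13C=18O.
nu_co = isotope_shifted_frequency(1650.0, ("12C", "16O"), ("13C", "18O"))
print(f"Estimated 13C=18O shift: {1650.0 - nu_co:.0f} cm^-1")
```

Even this rough estimate places the C−D stretch inside the ∼1800−2500 cm−1 transparent window, whereas heavy-atom labeling of a carbonyl moves the band by tens of wavenumbers only.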
Figure 3. Chemical structure of C−D, (S)CN, and N3 transparent window vibrational probes discussed in this Review.
with a molar extinction coefficient of ∼100 M−1·cm−1. This band has been assigned as the overlapping asymmetric stretch vibrations of the two intensity-enhanced CδD3 methyl stretches, making it a strong enough IR chromophore to be used in nonlinear IR spectroscopy.21,34 While the presence of coupling makes absorptions more difficult to interpret, it may also increase their information content, because the nature and magnitude of the coupling is itself of great interest. For example, selective coupling between vibrations has been shown to underlie specific pathways of energy flow in small molecules.34−36 The absorption frequency of a C−D bond is highly sensitive to its local environment. For example, shifts of up to 9 cm−1 have been observed in different solvents (Figure 2A),6 shifts of up to 19 cm−1 have been observed in different proteins, shifts of up to 16 cm−1 have been observed at different positions of the same protein, and shifts of up to 9 cm−1 have been observed for the same site in different states of a protein.20,25 In general, the physical origins of such shifts depend on the nature of the carbon center. C−D bonds that are isolated from heteroatoms are predominantly sensitive to their local electrostatic environment. This is due to selective stabilization of C−D charge density in the ground or excited vibrational state, as well as the absence of other contributions. This would appear to result from dominant dipolar interactions as the absorption frequencies of other similar heavy−light atom vibrations (O−H and O−D) have been shown to be linearly proportional to solvent electric field,37,38 although this has not been explicitly
Substitution of aliphatic C−H bonds, such as in the side chains of Leu, Val, etc., provides stretching vibrations that absorb between 2050 and 2260 cm−1.20,21 Substitutions of aromatic C−H bonds, such as in the side chains of Phe and Trp, provide vibrations that absorb around 2250−2350 cm−1.22,23 Finally, substitution at backbone Cα−H bonds results in vibrations that absorb around 2100−2300 cm−1.24−32 All such substitutions are stable, no new interactions are introduced, and no existing interactions are perturbed (at least within the Born−Oppenheimer approximation, and ignoring small changes in bond length). The fact that C−D bonds may be introduced at essentially any location within a protein with virtually no concern of perturbation makes them the most versatile of the transparent window probes. In addition, the natural dispersion of the absorption frequencies of aliphatic, aromatic, and Cα C−D bonds within the transparent window suggests that multiple C−D bonds can be incorporated and characterized simultaneously at different sites within a protein, although this has yet to be taken advantage of experimentally. The major challenge associated with the use of C−D probes is their relatively low signal strength. The molar extinction coefficient of a single C−D bond is often 5 MV·cm−1.245 In a recent report, Slocum and Webb compared changes in the fluorescence maximum of GFP, which is known to depend on the local electric field, with changes in the absorptions of nearby (CN)Phe probes.219 The contribution of H-bonding was included via the FTLS method, and a
5.2. Stability and Folding
To perform their diverse functions, proteins have evolved to adopt folded states with specific secondary and tertiary structures (Figure 12). Many small proteins, as well as
Figure 12. Potential energy landscape as a function of protein folding, showing potential energy minima corresponding to folding intermediates and those accessible to the folded state, with more minima becoming accessible with destabilization.
individual domains of larger proteins, fold spontaneously, revealing that all of the information required is encoded in the sequence of amino acids. Moreover, chaperone proteins have evolved to facilitate the folding of proteins within a cell. Nonetheless, protein misfolding underlies many pathologies, such as Alzheimer’s, Parkinson’s, and Huntington’s diseases,246 and the forces underlying the folding process remain poorly understood. In particular, it has remained unclear if the factors responsible are localized, for example, to individual secondary elements such as β-sheets or α-helices, or dispersed over the entire protein. These forces are important not only for proper folding but also for the evolution of a new function, as most adaptive mutations are destabilizing and compensatory mutations that restore stability are likely required.247−251 The characterization of protein folding under equilibrium conditions252 provides a measure of protein stability and also allows for the identification of thermodynamic traps that might result in the formation of natural or pathological (mis)folding intermediates. Studies of the equilibrium (un)folding process (note that, under equilibrium conditions, the folding and unfolding mechanisms are identical, and we will generally refer to the
parameters (midpoints, ΔG°(0), and m for denaturant-induced unfolding and ΔG°, ΔH°, and ΔS° for thermal unfolding) determined at different sites is evidence of a single, cooperative unfolding event, and differences are evidence of stepwise unfolding. We have shown that cyt c folds via a stepwise mechanism in urea and via a more concerted transition in GdnHCl20,188 (see section 6.4). We have also examined the thermal unfolding of the N-terminal SH3 domain of the crk-II protein (nSH3) and shown that it occurs via a single, cooperative process25 (Figure 13A). However, when the same unfolding was examined with
process as “folding”) rely on experimental observables that are sensitive to secondary or tertiary structure, such as circular dichroism or the absorption or emission of native or non-native chromophores (tryptophan fluorescence, heme absorption or fluorescence, amide I and amide II absorption, etc.).253 Interestingly, it is commonly found that, during folding transitions induced by chemical denaturants or changes in temperature, the signal at any point along the transition can be described as a superposition of subpopulations of folded and unfolded protein.254 This suggests that the equilibrium transition between the folded and unfolded states of many proteins is highly cooperative and, even in the transition region, many proteins exist either in their folded or unfolded forms, as well as that folding intermediates are rarely populated. However, most studies have focused on small proteins (or even peptides), which may not be representative of all proteins, and any conclusions need to be considered in the context of the experimental observation that is used to characterize the folding transition. For example, circular dichroism or amide-absorption data are averaged over the entire protein and thus may not be sensitive to smaller-scale, local conformational changes. Conversely, native local probes such as tryptophan fluorescence only monitor their specific local environment. In principle, transparent window vibrational probes site-specifically incorporated at different positions throughout a protein may bridge these two limiting cases. Equilibrium (un)folding is commonly induced via the addition (dilution) of chaotropic denaturants, such as guanidine hydrochloride (GdnHCl) or urea. 
The most commonly used model for denaturant-induced unfolding is the so-called linear extrapolation method (LEM).255 Because denaturant-induced unfolding transitions are usually sigmoidal with denaturant concentration, a good approximation, at least for the transition region, is that the free energy of unfolding is linearly dependent on denaturant concentration,

$$\Delta G^{\circ}_{\mathrm{F \to U}}(c) = \Delta G^{\circ}_{\mathrm{F \to U}}(0) - m \cdot c \tag{8}$$
where ΔG°F→U(c) is the denaturant concentration-dependent free energy of unfolding, ΔG°F→U(0) is the free energy of unfolding at zero denaturant concentration, and m is the sensitivity of the unfolding event to the concentration of added denaturant (commonly thought to be related to the change of accessible surface area upon unfolding).256 The model makes it possible to extrapolate the free energy of unfolding from the transition region back to zero denaturant concentration, i.e., the free energy of unfolding of the protein under unperturbed conditions. Equilibrium (un)folding may also be induced thermally. The data is typically analyzed using a heat capacity corrected van’t Hoff equation,
Figure 13. Overlay of the thermally induced (un)folding titration curves at different labeled residues of nSH3 fit to two-state transitions. (A) C−D probes. (B) (S)CN and N3 probes. See text for details. Reproduced with permission from ref 63. Copyright 2014 Wiley.
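The two-state models of eqs 8 and 9 can be sketched in a few lines of Python. The function names and parameter values below are hypothetical, and the conversion of the free energy into a fraction of unfolded protein via K = exp(−ΔG°/RT) is an assumption of this illustration that follows from treating (un)folding as a two-state equilibrium; in eq 9 the free energy is treated as a function of temperature.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def dg_denaturant(c, dg0, m):
    """Eq 8: unfolding free energy at denaturant concentration c (M)."""
    return dg0 - m * c


def dg_thermal(t, dh, ds, dcp, tm):
    """Eq 9: unfolding free energy at temperature t (K)."""
    return dh - t * ds + dcp * (t - tm - t * math.log(t / tm))


def fraction_unfolded(dg, t):
    """Two-state population of the unfolded state for free energy dg."""
    return 1.0 / (1.0 + math.exp(dg / (R_KCAL * t)))


# Hypothetical parameters, chosen only for illustration.
dg0, m_value = 5.0, 2.0       # kcal/mol and kcal/(mol M)
midpoint = dg0 / m_value      # concentration at which dG = 0
for c in (0.0, midpoint, 5.0):
    dg = dg_denaturant(c, dg0, m_value)
    f_u = fraction_unfolded(dg, 298.15)
    print(f"c = {c:.1f} M  dG = {dg:+.1f} kcal/mol  f_U = {f_u:.3f}")
```

Fitting such a curve separately to the signal of each site-specific probe yields one set of parameters per site, and it is the coincidence or divergence of these parameters that distinguishes cooperative from stepwise unfolding.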
(S)CN or N3 probes, it appeared decidedly heterogeneous due to site-specific perturbations (which artificially suggested that different parts of the protein (un)fold at different temperatures; Figure 13B).63 Transparent window vibrational probes may also be used in time-resolved (kinetic) studies of protein folding. For example, under mildly denaturing conditions, CO-bound cyt c is unfolded, but laser-induced photodissociation of the CO from the iron center triggers folding, and we have followed the process with both CN and C−D probes.191,192 Both probes showed transient signals that were well-correlated with the much stronger and easily observable CO transient signal, thus proving the utility of such probes in time-resolved experiments. Dyer and co-workers introduced Aha into NTL9 via recombinant expression in an auxotrophic strain of E. coli.258 The observed refolding kinetics after a laser-induced temperature jump revealed that the Aha probe refolds on a slower timescale than the peptide backbone, clearly indicating a stepwise kinetic mechanism. A particularly exciting future direction of transparent window vibrational probes is their use to characterize specific protein motions during not only folding
$$\Delta G^{\circ}_{\mathrm{F \to U}}(T) = \Delta H^{\circ}_{\mathrm{F \to U}} - T\,\Delta S^{\circ}_{\mathrm{F \to U}} + \Delta C_p\left(T - T_m - T\ln(T/T_m)\right) \tag{9}$$

where ΔH°F→U, ΔS°F→U, and ΔCp are the changes in enthalpy, entropy, and heat capacity upon unfolding, respectively, T is the temperature, and Tm is the temperature midpoint for (un)folding. However, the heat capacity term is often neglected when fitting the transition region, because it is thought to be small for temperatures near the transition midpoint.257 We have used site-specifically incorporated transparent window vibrational probes to study the equilibrium (un)folding of different proteins. In general, the coincidence of unfolding
Figure 14. IR absorption spectra of (CN)Phe-labeled cyt P450cam variants for the free state, CO complex, camphor complex, and camphor/CO complex. Shaded regions correspond to the Gaussian functions required for adequate fitting. Reproduced with permission from ref 220. Copyright 2016 American Chemical Society.
Discrete structural transitions involving fluctuations between two or more unique states result in multiple, discrete IR absorptions, where the relative intensities of the absorptions reflect the relative populations of the interconverting states, whereas increased (decreased) fluctuations about a single average protein structure give rise to absorptions with broader (narrower) widths. A classic example is the conformational heterogeneity of the active site of myoglobin, which was first demonstrated by the observation of three distinct CO ligand absorptions.84 The peptide backbone of a protein can undergo a range of fluctuations, primarily about the torsion angles around the N−Cα and Cα−C bonds (referred to as ϕ and ψ, respectively),269 and Mirkin and Krimm predicted that the Cα−D bond of Ala is sensitive to its local conformation.29−32 For example, relative to a β-sheet conformation (ϕ = −134°, ψ = 145°), the predicted frequency of the Cα−D stretch is shifted 8 cm−1 for a right-handed helix conformation (ϕ = −60°, ψ = 40°) and 9 cm−1 for a polyproline II conformation (ϕ = −75°, ψ = 145°).29 Corcelli and co-workers predicted that the Cα−D stretch frequencies of model Ala and Gly dipeptides are remarkably sensitive to backbone conformations, shifting up to ∼40 cm−1.28 Interestingly, in the case of the Gly dipeptide, the splitting between the symmetric and asymmetric CαD2 absorptions was also exceptionally sensitive to its conformation, varying from 40.7 to 81.3 cm−1, likely due to conformation-dependent hyperconjugation between the amide electron density and the C−D σ* orbital. The observation of more absorptions than probe bonds is strong evidence of conformational heterogeneity. We have observed this at several positions within cyt c, DHFR, and nSH3.
For example, the single Cα−D bond of (d1)Leu159 in nSH3 showed two absorptions.25 In addition, C−D bonds incorporated throughout the D-loop and 70s helix of oxidized cyt c revealed multiple absorptions corresponding to the
but also other processes such as molecular recognition or catalysis.
5.3. Structure and Conformational Heterogeneity
While the folded states of proteins have traditionally been thought of as being well-defined, it is now understood that they consist of an ensemble of different conformations and conformational substates, and this structural heterogeneity may be important for function (Figure 12).259−262 While transitions between states separated by relatively large barriers, typically occurring on the μs or longer timescale, can clearly contribute to function, faster motions are also likely to be important as they facilitate the induced-fit mode of molecular recognition263,264 and also make important contributions to the entropy of binding.265 These motions are difficult or impossible to characterize by NMR spectroscopy because of the interconversion-mediated signal coalescence.266 In contrast, conformations that interconvert on even the fastest timescales may be distinguished by vibrational spectroscopy because of the method’s inherently high time resolution. For example, a vibration of ∼2200 cm−1 has a transition time of ∼15 fs, which is much faster than the timescale of even the most rapidly interconverting protein species.267 Furthermore, two different conformations giving rise to two IR absorption bands separated by 5 cm−1 would have to interconvert on the picosecond timescale to not be resolvable because of signal coalescence.268 While fluorescence probes in principle may be used to detect rapidly interconverting species, they are often disposed into solvent, where they are less sensitive to changes within the protein, and if not, they are likely to be perturbative due to their size. As discussed above, IR (and Raman) spectroscopy provides high temporal resolution, and the transparent window vibrational probes provide high structural resolution and are likely to be less perturbative even when incorporated within the interior of a protein; thus, they are ideal probes for the characterization of structure and conformational heterogeneity.
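The two timescales quoted above can be checked with a short calculation. The Python sketch below converts a wavenumber into an oscillation period and estimates the exchange lifetime at which two bands would merge; the coalescence condition k = πΔν/√2 is an assumption borrowed from the standard treatment of two-site exchange between equally populated states, and the function names are hypothetical.

```python
import math

C_CM_PER_S = 2.99792458e10  # speed of light in cm/s


def vibrational_period_fs(wavenumber):
    """Oscillation period (fs) of a vibration given in cm^-1."""
    return 1e15 / (C_CM_PER_S * wavenumber)


def coalescence_lifetime_ps(splitting):
    """Exchange lifetime (ps) at which two bands split by `splitting`
    cm^-1 merge, assuming two-site exchange between equally populated
    states with rate k = pi * delta_nu / sqrt(2)."""
    delta_nu_hz = C_CM_PER_S * splitting
    rate = math.pi * delta_nu_hz / math.sqrt(2.0)
    return 1e12 / rate


print(f"Period at 2200 cm^-1: {vibrational_period_fs(2200.0):.1f} fs")
print(f"Coalescence lifetime for 5 cm^-1: {coalescence_lifetime_ps(5.0):.1f} ps")
```

Both estimates are consistent with the femtosecond and picosecond timescales given in the text, and they illustrate why bands separated by only a few wavenumbers remain resolved unless the underlying states exchange within picoseconds.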
which was thought to participate in such an H-bond with the carbonyl group of Ile7626,272 (Figure 15). In a lipid bilayer, the
simultaneous population of folded and unfolded states, which revealed an oxidation-dependent partial unfolding that had previously been incorrectly attributed to increased dynamics of the folded state20 and which was later confirmed by others using NMR relaxation experiments.270 In the DHFR complex with folate and NADP+, deuteration of the active site Tyr ((d4)Tyr100) resulted in two absorption bands that were similar to those of the more distal (d4)Tyr111, plus an additional high-frequency band that was assigned to a unique conformation.22 Recently, Thielges and co-workers observed doubled C−D absorptions of a deuterated PXXP peptide when it was bound to its cognate SH3 domain.271 Hochstrasser and co-workers detected the presence of conformational heterogeneity in the (CN)Phe-labeled C-terminal subdomain of the 35-aa HP35.127 When (CN)Phe was incorporated at position 58, which is positioned within one of three short helices that form a compact hydrophobic core, the linear FTIR absorption spectrum showed an absorption band centered at 2233.7 cm−1 with a line width of 13.6 cm−1. The absorption band was slightly asymmetric, and data fitting required two Voigt profiles with peaks separated by 5 cm−1. Further characterization by 2D IR spectroscopy revealed the presence of two absorption bands at 2228.7 ± 1.3 and 2234.5 ± 0.7 cm−1, which showed different dynamics and, based on model studies, were assigned to species where the probe was positioned in a hydrophobic or more hydrophilic environment with populations of 0.44 and 0.56, respectively. However, only a single absorption was observed when the same probe was incorporated at the same position by Fayer and co-workers,128,129 leaving the potential heterogeneity at this site of HP35 ambiguous.
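How relative populations follow from a multiband fit can be sketched as follows. The Python example assumes Gaussian band shapes and equal extinction coefficients for the two states, so that populations are simply fractions of the integrated area; the amplitudes and line widths are hypothetical and are not the fitted values of any study discussed here.

```python
import math


def gaussian_area(amplitude, fwhm):
    """Integrated area of a Gaussian band from its peak height and
    full width at half-maximum (cm^-1)."""
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return amplitude * sigma * math.sqrt(2.0 * math.pi)


def populations(bands):
    """Relative populations from (amplitude, fwhm) pairs, assuming that
    every state has the same extinction coefficient."""
    areas = [gaussian_area(a, w) for a, w in bands]
    total = sum(areas)
    return [area / total for area in areas]


# Hypothetical two-band fit: equal widths, unequal peak heights.
fit = [(0.44, 10.0), (0.56, 10.0)]
for (amp, width), pop in zip(fit, populations(fit)):
    print(f"amplitude {amp:.2f}, FWHM {width:.1f} cm^-1 -> population {pop:.2f}")
```

If H-bonding or another interaction changes the transition dipole of one state, the assumption of equal extinction coefficients fails and the area fractions must be corrected before they are read as populations.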
Boxer, Herschlag, and co-workers observed that at low temperatures the absorption of an SCN probe incorporated at position 116 in KSI split into two absorptions, which were assigned to two different protonation states of a nearby network of Tyr residues243 (see section 6.1). Recently, Thielges and co-workers site-specifically incorporated (CN)Phe at different locations throughout cyt P450cam and characterized its local environment in the free state, in the complex with its substrate camphor, and in the complex with substrate and ligand (CO), to study the different states populated by the enzyme during its catalytic cycle.220 The IR spectra of (CN)Phe incorporated at active site-proximal positions 87 or 96 showed two significant absorptions in both the free and CO-bound states (Figure 14). In both cases, the low-frequency absorption was assigned to a free CN species and the high-frequency absorption was assigned to an H-bonded species. Interestingly, despite the absence of potential H-bond donors, fitting of the IR spectra of (CN)Phe98 and (CN)Phe201, which are more distal to the active site, each required two Gaussians, suggesting the presence of conformational heterogeneity in the protein itself. (CN)Phe305 is distal from the binding site and showed little change in the different states of the enzyme.
Figure 15. Representative Cα−H···O H-bonds thought to contribute to GPA dimerization (PDB ID 1AFO). Reproduced with permission from ref 273. Copyright 2001 National Academy of Sciences.
wild-type peptide showed an asymmetric CαD2 stretching absorption that was red-shifted by 6 cm−1 relative to that of the mutant G83I GPA peptide, which is not capable of dimerization. The red-shift was attributed to a Cα−D···O H-bond, and its 6 cm−1 magnitude was associated with an H-bond strength of 0.88 kcal/mol. While C−D···O H-bonding is generally expected to cause a blue-shift,274 and blue-shifts have been presented by Krimm and Mirkin as evidence of Cα−D···O H-bonding in model dipeptides,275,276 the opposite may have been observed with the GPA peptide due to coupling or the detailed nature of the H-bond (perhaps similar to shifts induced by H-bonding with (S)CN probes5). Controlling the pKa of an amino acid side chain is critical in many biological processes, such as acid−base and nucleophilic catalysis, and C−D bonds again provide nonperturbative probes to study the process. It has been known for decades that the IR absorptions of C−H/D bonds are sensitive to the state of protonation of adjacent heteroatoms due to associated changes in hyperconjugation.277−281 This phenomenon has been most intensively characterized with amines, for which negative hyperconjugation between the nitrogen lone pair of electrons and the C−H/D σ* orbital induces a significant red-shift (the shifted bands are commonly referred to as “Bohlmann bands”) and an increase in intensity due to increased electron density and polarity of the bond (Figure 4).43,282,283 Correspondingly, H-bonding, protonation, or solvation of the heteroatom, which compete for the lone pair, may reduce hyperconjugation and induce a blue-shift and partial quenching of the absorption. Such shifts with site-specifically incorporated C−D bonds have been used to characterize the protonation/deprotonation of several protein side chains, for example, several Lys residues of cyt c.
Miller and Corcelli also investigated the sensitivity to protonation of CD2 stretching frequencies of Arg, Lys, Asp, and Glu using density functional theory (DFT).284 In each case, deprotonation (yielding the neutral species for Arg and Lys and the negatively charged species for Asp and Glu) caused both the symmetric and asymmetric stretches of a CD2 probe to shift to lower
5.4. H-bonding, Protonation, and Solvation
H-bonding is ubiquitous in biology. While most C−H bonds are inert, in some cases they can participate in H-bonding, and the use of C−D probes is ideal for the detection and characterization of such interactions. For example, dimerization of the transmembrane domains of glycophorin A (GPA) is thought to be favored by C−H···O H-bonds. To explore this possibility, Arbely and Arkin synthesized a peptide fragment corresponding to residues 70 to 101 of GPA with (d2)Gly79,
Figure 16. (A) Illustration of an Ni+1−H···Ni H-bond. (B) Ramachandran plot with the bridge regions, δR and δL, where ϕ and ψ are centered around ±90° and 0°, respectively, as indicated; also indicated are allowed or partially allowed regions: αL and αR, left- and right-handed α-helices; βS and βP, β-sheets and polyproline-like β-sheets; γ and γ′, γ- and inverse γ-turns; ε, ε′, and ε″, extended regions with ϕ > 0°, ψ ≈ ±180°. (C) Spectral evidence of functional Ni+1−H···Ni H-bonds in nSH3 at residues Pro165 and Pro185 (only spectra of d7-labeled variants shown) but not at Pro152 nor Pro183 (spectra of both d3- and d7-labeled variants shown in red and black, respectively; red-shifted absorptions assigned as CδD2 Bohlmann bands are indicated with stars). The symmetric and asymmetric absorptions are labeled s and as, respectively. Reproduced with permission from ref 40. Copyright 2014 American Chemical Society.
frequencies. While these shifts were not interpreted in terms of changes in hyperconjugation, they are consistent with it. The use of C−D bonds as vibrational probes of protonation/deprotonation of His side chains is another interesting application, because with a pKa near seven, His plays a particularly important role in acid−base catalysis at physiological pH. Miller and Corcelli examined a model His dipeptide in the gas phase using DFT as well as in aqueous solution using two-layered integrated molecular orbital and molecular mechanics (ONIOM) calculations.285 In both the gas and condensed phases, deprotonation of the ring nitrogen is predicted to induce a significant red-shift of both the Cδ−D and Cε−D absorptions (again consistent with hyperconjugation effects). Londergan and co-workers have experimentally characterized the Cε−D absorptions of model compounds and of a His residue in hen egg white lysozyme286 and demonstrated that deprotonation induces a significant red-shift (35 cm−1 in the protein). The symmetric CD3 stretching frequencies of (d3)Met in water−isopropanol mixtures appear to reflect the extent of solvation of the adjacent sulfur atom, again consistent with the primacy of hyperconjugative effects. Similarly, the spectra of (d4)Tyr100 in DHFR appear to reflect deprotonation and/or H-bonding-mediated changes in hyperconjugation, although in this case the effect is presumably mediated by conjugation.22
The sensitivity of C−D bonds to the electron density at an adjacent heteroatom has also been used to detect evidence of unusual Ni+1−H···Ni H-bonds in the backbone of a protein (i.e., an H-bond between an amide N and the N−H of the following residue; Figure 16A) that might be a common but overlooked contributor to protein structure and stability. In general, the contribution of individual residues to the main-chain structure of a protein is described by a Ramachandran plot, which displays the distribution of amino acid residues according to their ϕ and ψ angles269 (Figure 16B). The vast majority of residues fall within the “allowed” regions of this plot; however, residues are also commonly found to lie within the “bridge region”, where ϕ and ψ are centered around ±90° and 0°, respectively, which is traditionally considered unfavorable due to a steric clash between the Ni and Ni+1 amide nitrogens.269,287 Regardless of unfavorable sterics, the distance between the Ni and the Ni+1 hydrogen atom is close to the normal contact distance,269 and the Ni+1 proton appears positioned to interact with the Ni electron density. Moreover, the specific ϕ, ψ, and N−Cα−C torsion angles appear to be correlated in a manner that preserves this interaction.288,289 While the geometry of the Ni+1−H···Ni interaction appears to be typical of a bifurcated H-bond, its potential contribution to protein structure and stability has largely been ignored, likely due to the assumption that the amide nitrogen is a poor H-bond acceptor.290 Interestingly, there are four Pro residues in
Figure 17. (A) FTIR spectra of (CN)Phe in THF−H2O solutions (vol % of H2O indicated). These spectra were modeled (solid lines) globally by three Lorentzian functions, corresponding to water-like (a), THF-like (b), and water−THF (c). The bands that make up the fit for the 50% H2O solution are shown. (B) The CN stretching vibrational bands of (CN)Phe in water at different temperatures (3.2 to 87.2 °C in steps of 7 °C). Spectra were globally modeled by two Lorentzian functions (solid lines). Reproduced with permission from refs 51 and 292. Copyright 2003 American Chemical Society and Elsevier, respectively.
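The two-band modeling described for Figure 17B can be illustrated with a small sketch that sums two Lorentzian bands at the approximate positions quoted in the text (∼2233 cm−1 for the H-bonded species and ∼2230 cm−1 for the free species). The common line width of 10 cm−1 and the population fractions of 0.7 and 0.3 are hypothetical values chosen only to show the qualitative behavior; they are not taken from the cited fits, and the function names are illustrative.

```python
import math


def lorentzian(x, center, fwhm, area):
    """Area-normalized Lorentzian band evaluated at wavenumber x (cm^-1)."""
    half = fwhm / 2.0
    return area * (half / math.pi) / ((x - center) ** 2 + half ** 2)


def two_state_spectrum(grid, frac_hbonded, fwhm=10.0):
    """Sum of an H-bonded band (~2233) and a free band (~2230); fwhm is assumed."""
    return [
        lorentzian(x, 2233.0, fwhm, frac_hbonded)
        + lorentzian(x, 2230.0, fwhm, 1.0 - frac_hbonded)
        for x in grid
    ]


def count_maxima(values):
    """Number of interior local maxima in a sampled spectrum."""
    return sum(
        1 for i in range(1, len(values) - 1)
        if values[i - 1] < values[i] > values[i + 1]
    )


def peak_position(grid, values):
    """Wavenumber at which the sampled spectrum is largest."""
    return grid[max(range(len(values)), key=values.__getitem__)]


grid = [2200.0 + 0.1 * i for i in range(601)]  # 2200 to 2260 cm^-1
cold = two_state_spectrum(grid, frac_hbonded=0.7)  # hypothetical low-T mixture
hot = two_state_spectrum(grid, frac_hbonded=0.3)   # hypothetical high-T mixture
print(count_maxima(cold), peak_position(grid, cold))
print(count_maxima(hot), peak_position(grid, hot))
```

With these assumed widths the two components are not resolved as separate maxima; the composite band instead shifts as the populations change, which is why a global fit or correlation analysis across temperatures is needed to separate them.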
solvent-induced changes in the CN stretching absorption of (CN)Phe in water/tetrahydrofuran mixtures and found that the absorption broadened and blue-shifted by as much as ∼10 cm−1 with increasing percentage of water (Figure 17A).51 In addition, the CN stretching absorption in an aqueous environment was found to depend on temperature, and a two-dimensional correlation analysis and global deconvolution of the temperature-dependent spectra revealed the presence of two absorption bands.292 The high-frequency band (∼2233 cm−1) was favored at low temperature and was assigned to the H-bonded CN species, whereas the low-frequency band (∼2230 cm−1) was favored at higher temperature and was assigned to the free CN moiety (Figure 17B). Similarly, the IR spectra of the aromatic nitriles cinnamonitrile and benzonitrile in methanol are composed of two overlapping absorption bands: a low-frequency band similar to that observed in acetone or tetrahydrofuran and a high-frequency band similar to the band observed in water. In this case, 2D IR experiments identified cross peaks and showed that exchange occurred on the 4−5 ps timescale, which thus defined the timescale of H-bond formation and rupture.293 Londergan and co-workers68 conducted a systematic analysis of IR absorption line widths and peak frequencies of methyl thiocyanate (MeSCN) in several common solvents, including water and fluorinated alcohols, to investigate the dependence of the SCN stretching absorption band on the solvent environment. Each SCN absorption band was fit with a pseudo-Voigt function, and the resulting line-shape parameters (full width at half maximum (fwhm) and m factor) were compared to the previously reported average solvation time of the solvent.294 In most solvents, the absorptions were symmetric and the fwhm/m values were observed to decrease with decreases in the average solvation time.
Thus, the authors concluded that fast solvation dynamics plays an important role in determining the shape and width of the SCN stretching absorption. In contrast, the line shapes in trifluoroethanol or hexafluoroisopropanol were markedly asymmetric and required two and three Gaussians, respectively, to fit. The bands also showed a strong temperature dependence and became both broader and more
nSH3, two of which adopt bridge-region structures (Pro165 and Pro185) and two of which do not (Pro152 and Pro183).291 To test whether the conformation of the bridge-region residues is favored by Ni+1−H···Ni interactions, each Pro was replaced with its fully deuterated counterpart ((d7)Pro), and the resulting spectra were compared with those of the d3-labeled protein ((d3)Pro), where only the Cα methine and Cδ methylene groups were deuterated.40 Both the symmetric and asymmetric stretches of (d7)Pro152 and (d7)Pro183 showed multiple absorptions, and comparison with (d3)Pro152 and (d3)Pro183, as well as with the free amino acids, enabled unambiguous assignment of the lower-frequency absorption bands to the CδD2 absorptions (Figure 16C). In contrast, the IR spectra of both (d7)Pro165 and (d7)Pro185 showed only single absorption bands that were well-fit with single Gaussian functions, demonstrating that the CδD2 absorptions are undifferentiated from those of the other pyrrolidine ring CD2 absorptions, which was also confirmed by the (d3)Pro165 and (d3)Pro185 spectra. There are no consistent differences between these two groups of Pro residues other than the presence or absence of the putative Ni+1−H···Ni interaction. Thus, it was concluded that the CδD2 groups of Pro152 and Pro183 absorb at lower frequency due to hyperconjugation with the amide π orbital electron density, as well as that this electron density is not available at Pro165 and Pro185 due to engagement of their amide nitrogens by the proton of the preceding residue. This conclusion was further supported by DFT calculations and natural bond orbital (NBO) analysis of dipeptide mimics. Taken together, the experimental and computational work constitutes strong evidence that the Ni+1−H···Ni interactions observed in protein structures do indeed constitute functional H-bonds that likely contribute to protein structure and stability.
The proclivity of (S)CN moieties to engage in H-bonds that are not native to the protein complicates the simple interpretation of their spectra in terms of the native protein (see sections 5.1 and 5.2). However, when positioned at the surface of a protein, it also makes them potentially useful probes of hydration. Indeed, Gai and co-workers studied
showed peak frequencies ranging from 2228.5 to 2231.0 cm−1, which are significantly red-shifted compared to the CN absorption of ∼2237 cm−1 observed with fully hydrated peptides. On the basis of this data, the authors suggested that MPx binds to a hydrophobic portion of the interfacial region of the reverse micelle where there are relatively few water molecules available with which to interact. Increasing the water pool size (w0 = 20) did not change the nitrile peak frequencies, which further supported the hypothesis that the (CN)Phe side chains in MPx are generally shielded from solvation. Nonetheless, based on the extent of the red-shift relative to the free peptide in solution, the authors suggested that the side chains at positions 6 and 9, while not solvated, are directed toward the pool of water. Gai and co-workers also explored the use of (CN)Trp as a mimic of Trp by using it to replace Trp9 and Trp11 in indolicidin, a 13-residue membrane-binding cationic antimicrobial peptide,71 which commonly functions by selectively disrupting the more anionic bacterial membrane. Indolicidin is thought to be unstructured in aqueous solution but to adopt a wedge-shaped conformation upon interacting with model membranes. Consistent with the assumed structures, the absorption bands observed with the labeled peptides were very broad but narrowed in the presence of the model membrane, which the authors attributed to desolvation. Recently, Gai, DeGrado, and co-workers site-specifically examined the hydration status and dynamics of the “tryptophan gate” (Trp41) of the pH-gated influenza A M2 proton channel using (CN)Trp.124 The modified 25-residue transmembrane domain of the M2 protein (M2TM) was synthesized and examined in a model membrane and investigated with linear and nonlinear IR spectroscopy. 
Absorptions observed at 2119.8 and 2118.8 cm−1, at pH 7.4 and pH 5.0, respectively, and a 3 ps decay determined by 2D IR spectroscopy suggested that the environment is mostly free of water and that it is largely insensitive to pH. Finally, Gai and co-workers utilized (CN)Phe to study the effects of trimethylamine N-oxide (TMAO),295 a naturally occurring osmolyte that stabilizes the folded states of proteins and counteracts the denaturing effects of urea.296 Even though the molecular mechanisms of its stabilizing effects are debated, changes in water structure induced by TMAO are well-documented.297 First, when incorporated into a model pentapeptide, the authors observed that the addition of TMAO caused the absorption of (CN)Phe to red-shift, which was ascribed to a weakening of the CN−water H-bond.295 Next, the authors used 2D IR spectroscopy to examine site-specific hydration dynamics of (CN)Phe incorporated into a tripeptide or the HP35 peptide67 (Figure 18). The CN probe was used to replace Phe58, in the core of HP35, and Phe76, which is solvent-exposed. The center line slope (CLS) method developed by Fayer and co-workers298,299 was used to extract decay dynamics of the tripeptide on the 2 ps timescale with zero static offset, which indicated that the CN probe rapidly sampled all possible environments in the aqueous solution. In contrast, the addition of TMAO into the tripeptide caused the local H-bonding environment to fluctuate faster, broadened the frequency distribution of the CN absorption, and restricted the number of motions responsible for the nitrile’s frequency fluctuations. Interestingly, a significant static offset was observed at the buried position of HP35 but not at the surface position, and the magnitude of the offset increased with
symmetric at higher temperature. The spectrum observed in trifluoroethanol at each temperature was fit with two Gaussians with peak frequencies at 2172.9 and 2164.9 cm−1 with respective fwhm’s of 14.5 and 18.7 cm−1. The amplitude of the low-frequency band increased at the expense of the high-frequency band with increasing temperature. With additional support from a two-dimensional correlation analysis of the temperature-dependent IR spectra, the authors assigned the low-frequency absorption to non-H-bonded SCN and the high-frequency band to the H-bonded SCN. Although the changes were more complicated with hexafluoroisopropanol, detailed analysis again suggested the presence of a high-frequency band favored at low temperature and assigned to the H-bonded species and a low-frequency band favored at high temperature and assigned as the free CN species. On the basis of these observations, the authors suggested that SCN probes may be used to site-specifically characterize the dehydration of peptides and proteins induced by fluorinated alcohols. The effects of temperature on the IR spectra of MeSCN in several protic and aprotic solvents were examined by Boxer and co-workers.55 The observed stretching frequency shifted to the blue in H-bond-donating solvents with decreasing temperature. Interestingly, MeSCN in ethanol showed two overlapping absorption bands, which became more resolved at low temperature. In addition, with decreasing temperature the high-frequency band shifted to the blue, whereas the low-frequency band shifted to the red. The stretching frequency of MeSCN in non-H-bond-donating solvents such as chloroform, acetone, and 2-methyltetrahydrofuran shifted to the red with decreasing temperature.
Consequently, the blue-shift of the high-frequency band of MeSCN in ethanol was attributed to the increased strength of the H-bond at low temperature, whereas the red-shift of the low-frequency band was attributed to increases in the magnitude of attractive potentials of solvent dipoles at low temperature due to decreased thermal disorder and increased material density. To explore the use of CN probes as site-specific reporters of hydration, Gai and co-workers used (CN)Phe to replace Trp581 or Lys583 in the 17-residue calmodulin (CaM) binding peptide from skeletal muscle myosin light chain kinase, which binds CaM in the presence of Ca2+.51 The IR spectrum of (CN)Phe583 in the free peptide was similar to (CN)Phe in water and did not change upon binding CaM, where it is predicted to remain solvent-exposed. In contrast, while the CN stretching frequency of (CN)Phe581 in the free peptide was again similar to that of (CN)Phe in water, it was red-shifted by 7 cm−1 when complexed to CaM, where it is predicted to be protected from solvent. The same group then synthesized seven variants of the 14-residue amphipathic membrane-binding peptide mastoparan x (MPx) with (CN)Phe incorporated at positions 5 to 11.70 Each free (CN)Phe-labeled MPx variant in water exhibited an absorption peak frequency at ∼2235 cm−1 with a line width of ∼13 cm−1, which indicated that the probes are fully hydrated in the unbound MPx; however, in the presence of a phospholipid bilayer, the absorptions were red-shifted and narrowed. Thus, the authors concluded that MPx is bound in the hydrophobic region of the model membrane and dehydrated. The same authors further investigated the site-specific hydration status of MPx entrapped in a reverse micelle at a low water pool size (w0 = 6) as a mimic of a membrane−water interface.69 In this case, (CN)Phe was incorporated at positions 5, 6, 7, or 9, or appended to the C-terminus. The CN stretching absorption of each MPx variant in the reverse micelle
Figure 18. Effect of TMAO and urea on the site-specific hydration dynamics of (CN)Phe incorporated into the HP35 peptide at the positions indicated. Reproduced with permission from ref 67. Copyright 2014 National Academy of Sciences.
addition of TMAO, consistent with crowding effects and slower fluctuations in an increasingly rigid protein interior. Charkoudian and co-workers cyanylated the terminal thiol of the Ppant arm of two different acyl carrier proteins (ACPs), one from the Streptomyces coelicolor actinorhodin polyketide synthase (ACT ACP) and one from the Geobacter metallireducens lipopolysaccharide biosynthetic machinery (GmACP3).66 These ACPs differ in the proposed conformation of their Ppant arms, which, based on NMR studies, were thought to be solvent-exposed in the case of ACT ACP300 and buried in a hydrophobic cavity in the case of GmACP3.301 In agreement with the NMR data, the SCN probe of ACT ACP displayed a comparatively broad absorption spectrum with a peak frequency that was blue-shifted 6−7 cm−1 compared to the narrower absorption band of the probe in GmACP3 (Figure 19). In addition, on the basis of the difference in the line widths, the authors concluded that the SCN probe on the Ppant arm of ACT ACP experiences a heterogeneous solvent environment, whereas that of GmACP3 experiences a relatively homogeneous environment. Finally, the authors cyanylated the Ppant arm of an uncharacterized ACP from 6-deoxyerythronolide B polyketide synthase (DEBS ACP2) and showed that its environment was similar to that of the ACT ACP. The N3 transparent window vibrational probe has also attracted attention as a probe of solvation. Londergan and co-workers comprehensively investigated the solvatochromism of the N3 stretching absorption band of two model compounds, the aliphatic 5-azidopentanoic acid and the aromatic 3-(p-azidophenyl)propanoic acid, in a large variety of solvents to examine the sensitivity of the probe to local electrostatics, H-bonding, and, in particular, the presence of local water molecules.81 They found that the aliphatic N3 stretch does not change with solvent polarity, whereas the aromatic N3 stretch shows a weak but measurable sensitivity to polarity.
However, both N3 stretches were blue-shifted in H-bond-donating solvents and, interestingly, showed greater shifts with water than with the stronger H-bond-donating solvents trifluoroethanol and hexafluoro-2-propanol. Further analysis suggested that the large shifts induced by water were due to the presence of a high density of local H-bond donors. This fascinating observation suggests that these probes may be useful for the characterization of dynamic water networks, which are commonly invoked as functionally important.302−304 Additionally, from investigations of the sensitivity of the N3 stretch in
Figure 19. Structure and IR absorption spectra of acyl carrier proteins GmACP3, ACT ACP, or DEBS ACP2 with cyanylated Ppant arms (PDB IDs 2LML, 2K0X, and 2JU1, respectively); DEBS ACP2 Ppant arm has been added for reference. Reproduced with permission from ref 66. Copyright 2014 American Chemical Society.
several mixed solvents, the authors concluded that the N3 moiety is a specific sensor of hydration even in the presence of higher concentrations of other H-bond donors. Brewer and co-workers incorporated 4-azidomethyl-L-phenylalanine ((p-N3CH2)Phe; see Figure 3) at two positions of sfGFP, replacing Tyr75 or Asp134, and showed that the N3 stretching frequency was sensitive to its environment.215 The probe at Tyr75 showed a symmetrical IR absorption band at 2094.3 cm−1, while at Asp134 it showed a slightly asymmetric absorption band at 2109.8 cm−1. The peak frequency at position 75, which based on the structure is expected to be buried in a hydrophobic environment, was similar to that of free (p-N3CH2)Phe in DMSO, whereas the peak frequency at position 134, which is expected to be solvent-exposed, was similar to that of free (p-N3CH2)Phe in water. In a second study, Tookmanian, Fenlon, and Brewer replaced another residue on the surface of sfGFP, Asn150, with (p-N3CH2)Phe and found a single absorption band with a peak maximum at 2107.3 cm−1.216 Thus, the N3 stretching frequency at position 150 is 2.5 cm−1 red-shifted compared to that at position 134. Because both sites are on the surface of the protein, the differences in their peak frequencies were attributed to differences in hydration states. Finally, Tookmanian, Phillips-Piro, Fenlon, and Brewer incorporated 4-(2-azidoethoxy)-L-phenylalanine (AePhe; see Figure 3) at positions 133 and 149 of sfGFP via amber suppression.305 Both positions are again on the surface of the protein, and as expected, the absorption at position 133 was similar to that of the free amino acid (2118.0 cm−1). However, at position 149, the N3 stretching absorption was observed at 2115.5 cm−1, which prompted the authors to suggest that the probe is not fully solvated. To test this hypothesis, the authors solved the crystal structure of the protein with the probe at position 149, which showed that the
ics.128−130 Both the single- and double-labeled variants showed the same initial picosecond decay and a significant static offset, with the latter indicating the presence of slower dynamics. In addition, they examined a (CN)Phe-labeled double norleucine (Nle) mutant that is more stable than wild-type HP35 (Figure 21). Interestingly, the more-stable variant showed substantially slower dynamics, suggesting a more tightly packed hydrophobic core.
probe is disposed into a surface crevice that protects it from solvation (Figure 20B). This study solidifies the previous
Figure 20. (A) Structure of sfGFP with AePhe incorporated at the surface position 149 (side chain shown). (B) Detailed image of AePhe conformation where the protein provides an environment that shields N3 from solvation, despite position 149 being located on the surface of the protein (PDB ID 5EHU).
suggestion that, even if the N3 probe is incorporated at a position expected to be solvent-exposed, it may eschew solvation if it is attached to a flexible side chain and a suitable environment is provided by the surface of the protein;7 it also emphasizes that either the tandem NMR-IR or FTLS method should be used to demonstrate that an (S)CN (or N3) does indeed H-bond with water before it can be used as a probe of solvation.
5.5. Nonlinear IR Spectroscopy and Protein Dynamics
When combined with nonlinear IR spectroscopy, transparent window vibrational probes provide a powerful tool for the characterization of protein motions. In a three-pulse vibrational echo (3PVE) experiment, a phase grating is inscribed in the sample that subsequently decays due to interconversion of the distinct microenvironments that lead to inhomogeneous broadening of a given vibrational absorption peak. Probing the phase grating after a waiting time T thus yields information about the timescale of dynamics. The more information-rich 2D IR experiment measures the frequency−frequency correlation function (FFCF) of the system before and after a variable waiting time T, which reveals couplings and population transfer between different vibrational states, as well as the time evolution of the inhomogeneous broadening mechanisms of a given vibrational state, which is conveniently analyzed using CLS analysis.298,299 The lower and upper limit of timescales of both techniques is dictated by the pulse duration of the laser and the vibrational lifetime of the probe, respectively. Thus, protein motions that occur on timescales longer than the lifetime of the IR probes appear as a static offset in the FFCF. The Fayer lab has used transparent window vibrational probes in a variety of nonlinear IR experiments. They replaced Phe43 in myoglobin with (N3)Phe via amber suppression to examine dynamics near the heme redox center.213 The CLS data of the free protein or the protein bound to CO (a mimic of the natural O2 ligand) revealed similar decay constants of ∼1.5 ps, but the binding of CO was shown to reduce the static offset, suggesting that it restricted active site motions, with potentially important implications for the mechanism of allostery in the structurally related hemoglobin tetramer. 
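The relationship between a fast CLS decay and a static offset can be made concrete with a simple model. The sketch below represents the CLS as a single exponential plus a constant, generates noiseless synthetic decays, and recovers the time constant from a log-linear fit. The 1.5 ps decay constant is taken from the text; the amplitude (0.4) and the two static offsets (0.30 and 0.15) are hypothetical values chosen only to illustrate a reduction in offset, and the function names are illustrative.

```python
import math


def cls_decay(t_ps, amplitude, tau_ps, static_offset):
    """Single-exponential CLS model; the offset stands in for motions slower
    than the experimentally accessible time window."""
    return amplitude * math.exp(-t_ps / tau_ps) + static_offset


def fit_tau(times, values, static_offset):
    """Recover tau from the slope of ln(CLS - offset) versus waiting time."""
    pts = [(t, math.log(v - static_offset))
           for t, v in zip(times, values) if v > static_offset]
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    slope = (sum((t - mean_t) * (y - mean_y) for t, y in pts)
             / sum((t - mean_t) ** 2 for t, _ in pts))
    return -1.0 / slope


times = [0.25 * i for i in range(25)]  # waiting times from 0 to 6 ps
free = [cls_decay(t, 0.4, 1.5, 0.30) for t in times]   # hypothetical offset
bound = [cls_decay(t, 0.4, 1.5, 0.15) for t in times]  # hypothetical, reduced offset

tau_free = fit_tau(times, free, 0.30)
tau_bound = fit_tau(times, bound, 0.15)
print(f"tau (free) = {tau_free:.2f} ps, tau (bound) = {tau_bound:.2f} ps")
print(f"long-time CLS: free = {free[-1]:.2f}, bound = {bound[-1]:.2f}")
```

In this picture, two samples can share the same fast decay constant while differing in the value the CLS approaches at long waiting times. Real data would require a fit that treats the offset as a free parameter and accounts for noise.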
They also incorporated one or two (CN)Phe probes into the HP35 peptide using SPPS to investigate whether the nitrile transparent window probe is suitable for characterizing fast structural dynam-
Figure 21. Structure and dynamics of HP35 peptide. (A) Structure of HP35 peptide with (CN)Phe incorporated at position 58 (side chain shown). (B) CLS data for the CN stretching mode of wild-type and Nle double mutant. Reproduced with permission from ref 128. Copyright 2012 American Chemical Society.
SCN transparent window probes have also been used in conjunction with nonlinear IR techniques, albeit less frequently than N3 or CN. For example, Cheatum and co-workers prepared a variant of E. coli DHFR by cyanylation of the two native cysteine residues and reported the 2D IR spectra of the protein at a waiting time of 1 ps, which displayed significant inhomogeneity.228 Bredenbeck and co-workers used 2D IR spectroscopy to investigate hemoglobin that had been cyanylated at its two cysteine residues via reaction with DTNB and potassium cyanide and found that the CLS decay was complete within 5 ps;306 however, the origins of this fast dynamics remain to be investigated. To increase the lifetime of the probe so that longer timescale dynamics can be observed, SeCN probes have also been explored, but only with small molecules.307−309 To date, 2D IR experiments using C−D probes have been performed only with small-molecule model systems.310−312 In addition, Hamm and co-workers used C−D labels to selectively deposit vibrational energy in a short helical peptide and followed its evolution along the peptide backbone with site-selectively incorporated CN probes.34
5.6. Protein−Inhibitor Interactions
Interestingly, a variety of protein inhibitors possess CN groups, which thus may be used as intrinsic transparent window probes
protected from water molecules and in close proximity to the side chain of residue Thr113. Interestingly, the CN group gave rise to two IR absorption bands separated by ∼14 cm−1, and as with the solvent studies described above, the high-frequency band was assigned to an H-bonded species (involving the side-chain H-bond donor of Thr113) and the low-frequency band was assigned to an H-bond-free species. Consistent with these assignments, mutation of Thr113 to Ser resulted in little change to the spectrum while mutation to Ala resulted in the ablation of the high-frequency absorption. MD simulations predicted that the two peaks arise from dynamic switching between two different conformations of Thr113, one that engages in the H-bond with CN and one that does not (and instead is engaged in an H-bond with its own amide carbonyl). The CN group of bosutinib (Figure 23A), a BCR-Abl and Src-family kinase inhibitor, was also recently exploited to study
without any risk of perturbation. In several cases, such CN groups have been used to probe the electric field of a protein.313−315 In other cases, the propensity of CN groups to engage in H-bonding is thought to actually contribute to inhibitor binding, and their spectroscopic characterization is an ideal means to test and quantify the contribution. In a first example, the dynamics of HIV1-RT complexed with rilpivirine, a non-nucleoside reverse transcriptase inhibitor that possesses a benzonitrile-like CN group as well as a cinnamonitrile-like CN group (Figure 22A), was investigated by Hochstrasser and co-
Figure 22. Structure of the transparent probe containing inhibitors rilpivirine (A) and (5-chloro-2-{[(4-cyano-3-nitrobenzyl)amino]carbonyl}phenoxy)acetic acid (B).
workers using 2D IR spectroscopy.316 Free in solution, rilpivirine shows a single absorption band, but when complexed to HIV1-RT, it shows two well-resolved absorption bands, one at ∼2215 cm−1 and one at ∼2227 cm−1, which based on small molecule studies were assigned to the cinnamonitrile-like and benzonitrile-like CN groups, respectively. 2D IR analysis of the higher-intensity, low-frequency band yielded a frequency correlation function that consisted of 130 fs and 7 ps decay components. On the basis of a 1.8 Å resolution crystal structure, which revealed that the cinnamonitrile arm occupies a hydrophobic tunnel of the enzyme with a CN group that is Hbonded neither to water nor to the protein, the authors assigned the fast decay to fluctuations of polar side chains at the protein−water interface and the slower decay to motions of the inhibitor. In a follow-up study, the same authors identified a ∼1 ps correlation time for the lower-intensity, high-frequency band.241 While this decay was consistent with H-bond dynamics, analysis of the previously reported crystal structure revealed no candidate water molecules. This prompted the authors to determine the structure of the complex at higher resolution (1.5 Å), which interestingly revealed a water molecule within H-bonding distance (2.8 Å) to the cinnamonitrile-like CN and, along with MD simulations, supported the reassignment of the high-frequency band with its 1 ps dynamics to this CN group, leaving the low-frequency band to be associated with the benzonitrile-like CN, with its 130 fs and 7 ps decays associated with polar side-chain motions and with more global fluctuations of its protein environment, respectively. The authors suggested that the interaction between the cinnamonitrile-like CN and an active-site water may be important for inhibition and, interestingly, that this interaction may contribute to the retention of activity against mutant reverse transcriptases. 
Boxer and co-workers similarly characterized the CN group of (5-chloro-2-{[(4-cyano-3-nitrobenzyl)amino]carbonyl}phenoxy)acetic acid (Figure 22B), which inhibits human aldose reductase (hALR2).242 Structural characterization of the ternary complex with inhibitor and NADP+ indicated that the CN group is buried in the hydrophobic binding pocket, well-
Figure 23. (A) Structure of the Src-family kinase inhibitor bosutinib. (B) Detailed view of bound bosutinib with the water-mediated Hbonding network indicated (PDB ID 4MXO).
binding specificity.317 Structural data revealed that, when bound to the Src kinase, the nitrile group, which is attached to bosutinib’s quinolone ring, is oriented into the ATP binding site where it H-bonds to one of two ordered water molecules (Figure 23B). To address whether this interaction contributes to specificity, the authors collected IR spectra of bosutinib bound to different Src kinases with mutations in the ATP binding site selected based on the residues present at the corresponding positions of other human kinases and found that the CN absorption frequency varied by 13 cm−1. Interestingly, mutation of the gatekeeper residue Thr338 and residue Ala403 produced the largest shifts, which were to the red and up to 11 cm−1 (Thr338Met/Ala403Thr), which suggested that these mutations result in CN group dehydration. To further confirm that the red-shift is indeed caused by loss of H-bonding, and that the H-bond is functionally important, the authors obtained the crystal structure of the bosutinib complexed with either the Thr338Met/Met313Leu double mutant (mutation of Met313 to Leu is not expected to affect binding but was found to facilitate crystallization) or the Ala403Thr single mutant of Src kinase. In both structures, the bosutinib nitrile group was not engaged in a H-bond and, importantly, the inhibitor bound with much less affinity, leading the authors to conclude that the CN does indeed interact with an active-site water molecule in V
DOI: 10.1021/acs.chemrev.6b00625 Chem. Rev. XXXX, XXX, XXX−XXX
Chemical Reviews
Review
the wild-type kinase and that this interaction is important for inhibitor specificity. The azide ion itself binds to the metal center of a variety of metalloproteins. Carbonic anhydrase II is one such protein, and a 3PVE characterization of the bovine variant revealed fast and slow components of 0.2 and 17 ps, respectively, with the former assigned to fluctuations of the side chain of the proximal and highly conserved Thr199.318 Cheatum and co-workers analyzed the wild-type as well as the Thr199Ala and Leu198Phe mutants of the human protein.319 The wild-type and the Thr199Ala mutants, which remove a side-chain-mediated H-bond with Glu106 as well as a potential interaction with the bound azide but which preserve an amide backbone-mediated H-bond with the azide, both yielded similar correlation functions composed of a fast subpicosecond decay component and a static component. However, mutation of Leu198 to Phe resulted in a correlation function that retained the subps decay but gained a 2.5 ps component and lost the static component. On the basis of this data, the authors concluded that the Thr199 side chain does not strongly interact with the azide, and they assigned the subps and static components of the wild-type and Thr199Ala enzymes to H-bond fluctuations and conformational heterogeneity, respectively. Moreover, the absence of the static component in the Leu198Phe variant was attributed to increased restriction of the azide by the Phe side chain and a stronger azide H-bond with the Thr199 backbone amide. The presence of a stronger H-bond was suggested to be consistent with the 6 cm−1 red-shift of the azide absorption when bound to the Leu198Phe mutant relative to the wild-type enzyme. 3PVE spectroscopy has also been used to investigate the dynamics of the ternary complex of formate dehydrogenase (FDH), azide, and its NAD+ cofactor321 (Figure 24). The observed correlation function decayed with 250 fs and 3 ps components, and no static component was observed. 
The absence of longer timescale dynamics led the authors to propose that the complex is relatively rigid. To further explore this hypothesis, the authors expanded their study to include the binary complex with N3 and the ternary complex with N3 and NADH, using 2D IR spectroscopy. Interestingly, the FFCF of the binary complex, and only the binary complex, showed a significant static component. A separate 2D IR study of the binary complex with the azide-bearing NAD+ analogue, 3-picolyl azide adenine dinucleotide, similarly revealed a large static offset, as well as picosecond fluctuations.322 As the ternary complex with azide is often taken as a mimic of the natural transition state, the authors concluded that the transition state is rigid and that the catalytically incompetent binary complexes are less rigid. Recently, an underdamped oscillation in the FFCF on the picosecond timescale was observed in the ternary complex with azide and NAD+.323 Such oscillations may result from either coherent excitation of coupled vibrational modes or fluctuations of H-bonds formed with protein-based H-bond donors. Interestingly, the oscillations were not observed in the complex with NADPH, suggesting that they may be associated with motions of the charged nicotinamide ring.
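The correlation functions discussed in this section are typically summarized as a sum of exponential decays, sometimes with a static offset. The short Python sketch below illustrates that functional form only. The 250 fs and 3 ps time constants are those quoted above for the FDH ternary complex; the amplitudes and the size of the static offset are not given in the text and are illustrative placeholders, and the function name `ffcf` is hypothetical rather than taken from any of the cited studies.

```python
import math

def ffcf(t_ps, components, static=0.0):
    """Model correlation function: sum of exponential decays plus a static offset.

    t_ps       -- waiting time in picoseconds
    components -- list of (amplitude, time_constant_ps) pairs
    static     -- amplitude of the non-decaying (static) contribution
    """
    return sum(a * math.exp(-t_ps / tau) for a, tau in components) + static

# Time constants from the text (250 fs and 3 ps); the equal amplitudes are
# placeholders chosen only for illustration, not reported values.
ternary_like = [(0.5, 0.25), (0.5, 3.0)]

# Hypothetical static offset, used to mimic a complex that retains
# inhomogeneity on the experimental timescale.
ASSUMED_STATIC = 0.3

for t in (0.0, 1.0, 5.0, 20.0):
    no_offset = ffcf(t, ternary_like)
    with_offset = ffcf(t, ternary_like, static=ASSUMED_STATIC)
    print(f"t = {t:5.1f} ps  no static: {no_offset:.4f}  with static: {with_offset:.4f}")
```

Under these assumptions, the curve without a static term decays essentially to zero within a few multiples of the slowest time constant, whereas the curve with a static term levels off at the offset, which is the qualitative distinction drawn above between the ternary and binary complexes.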
Figure 24. Characterization of FDH complexed with azide and NAD+ or NADH, or complexed with azide alone, using 2D IR spectroscopy. (PDB ID 2NAD) Reproduced with permission from ref 320. Copyright 2010 National Academy of Sciences.
5.7. Vibrational Imaging
Vibrational imaging techniques, such as those based on coherent anti-Stokes Raman scattering (CARS),324 stimulated Raman scattering (SRS),325 and surface-enhanced Raman scattering (SERS),326 have become important tools in cell biology for the visualization of cellular components with submicrometer resolution.327,328 However, these studies traditionally suffer from an inability to visualize specific molecules of interest, and the use of transparent window vibrational probes has attracted attention in this field as well. In addition to C−D, CN, and N3 probes, alkyne moieties, which provide a stretching absorption around 2125 cm−1, have also been employed. The use of the C−D probe is attractive for microscopy studies because of its strictly nonperturbative nature, and it was again the first probe employed, in this case with spontaneous Raman microscopy to visualize cellular lipids.329,330 Soon after, Otto and co-workers grew HeLa cells in media containing (d5)Phe, (d4)Tyr, or (d3)Met and then used nonresonant Raman microspectroscopy to detect the locally concentrated C−D signals associated with incorporation into a protein.331 As these amino acids are hydrophobic, they are generally expected to be incorporated within the hydrophobic core of proteins, and correspondingly, the authors observed red-shifts of 5−9 cm−1 relative to the corresponding amino acids in solution. The authors also demonstrated that C−D-labeled proteins could be visualized in the nucleolus of the cells following an 8 h incubation with (d4)Tyr and further noted that, because the Raman band for (d3)Met (2128 cm−1) is well-separated from those of (d5)Phe (2292 cm−1) or (d4)Tyr (2282 cm−1), multiplexing Raman experiments should be possible.
For imaging applications, the limitations of the low cross section of the C−D probe may be at least partially overcome with the use of SRS microscopy.325 In SRS, pump and Stokes laser beams are spatially and temporally focused on the sample, and amplification of the Raman signal occurs by stimulated excitation of the vibrational transition when the difference in Stokes and pump frequencies matches a specific absorption. Min and co-workers utilized SRS microscopy to visualize nascent proteins in HeLa cells after 20 h incubation with (d10)Leu.332 To further enhance the C−D Raman signal, cells were incubated with a set of all 20 labeled amino acids, which increased the C−D signal 5-fold compared to incubation with the single amino acid alone (Figure 25). To demonstrate generality, the authors imaged nascent proteins in another human cell line (HEK293T), in a mouse cell line (N2A), and in the brain tissues of live zebrafish and mice.333 Finally, the authors explored the imaging of two sets of temporally differentiated proteins based on the Raman peaks of C−D probes using two-color pulse-chase SRS imaging.333 To do so, they divided the labeled amino acids into two groups: the first with low-frequency absorptions (down to 2067 cm−1) included perdeuterated Leu, Ile, and Val; the remainder were combined into a second group with higher-frequency absorptions (∼2133 cm−1). The aggregation-prone mutant huntingtin protein was expressed in HeLa cells in unlabeled media for 4 h, then in media containing the second group of deuterated amino acids for 22 h, and finally in media containing the first group for 20 h. Aggregates imaged with SRS at 2067 and 2133 cm−1 revealed that proteins specifically labeled with the second group of amino acids were concentrated in the core, while proteins harboring the first group of amino acids were distributed throughout the entire volume of the aggregate. The authors interpreted this pattern as evidence that the core of the aggregate formed earlier during growth and that proteins produced later permeated throughout the entire aggregate.
Figure 25. SRS image of HeLa cells analyzed using C−D (2133 cm−1) probes; arrows indicate regions of nascent proteins. Relative to amide I (1655 cm−1), CH2 (2845 cm−1), and CH3 (2940 cm−1) vibrations, the C−D stretching absorptions provided maximum image contrast of nascent proteins. See text for details. Reproduced with permission from ref 332. Copyright 2013 National Academy of Sciences.
SERS microscopy has emerged as a powerful surface-sensitive technique that dramatically enhances Raman signals when the excitation laser frequency and the emitted Raman photons are resonant with the localized surface plasmon resonance of a metallic nanostructure.326 Because SERS enhancement drops significantly with the probe-to-surface distance, the technique is well-suited for probing cell surface-bound biomolecules. Recently, Chen, Tian, and co-workers exploited this strategy for the detection of cell surface glycans,334 which mediate, among other things, cell−cell communication, the immune response, and host−pathogen interactions, but which are difficult to detect by spontaneous Raman microscopy due to their low local concentration inside the focal volume of the instrument. HeLa cells were treated with peracylated N-azidoacetylmannosamine, peracylated N-(3-cyanopropanoyl)mannosamine, or peracylated N-trideuteroacetylmannosamine to incorporate probe-bearing glycans, and then were incubated with glycan-binding 4-mercaptophenylboronic acid-functionalized gold plasmonic nanoparticles. Strong N3, CN, or C−D probe SERS absorptions were observed at ∼2130, ∼2220, and ∼2125 cm−1, respectively. The same group then demonstrated the successful imaging of HeLa cell-surface glycans that were metabolically labeled with 9-azido sialic acid or N-trideuteroacetylneuraminic acid on Au nanoparticle arrays or on an Ag nanoparticle film.334 Newly synthesized proteins in HeLa cells incubated with Aha were also imaged on SERS-active silicon wafers. It was also demonstrated that newly synthesized Aha-labeled bacterial cell-surface proteins can be visualized by SERS microscopy on an Ag nanoparticle film.
While the alkyne moiety has not yet been used as a transparent window vibrational probe for the study of a purified protein, it has been used for live cell Raman imaging.112−115,335 As with the C−D, (S)CN, and N3 probes, an alkyne provides an absorption in the transparent window (around 2125 cm−1), and the absorption has a relatively large Raman cross section due to its high polarizability, making it suitable for even spontaneous Raman studies. For example, Sodeoka and co-workers used slit-scanning Raman microscopy for real-time imaging of HeLa cells incubated with the thymidine nucleotide analogue, 5-ethynyl-2′-deoxyuridine (EdU), which is incorporated into DNA by cellular polymerases.112 Later, Min and co-workers reported a general strategy to employ SRS microscopy to visualize newly synthesized DNA, RNA, and protein in live cells by addition of EdU, 5-ethynyl uridine (EU), and L-homopropargylglycine, respectively.113 To further expand the utility of the alkyne transparent window vibrational probe, they performed live cell, three-color SRS imaging of DNA using three distinct forms of 13C-labeled and alkyne-tagged EdU.114 Specifically, incorporation of one or two 13C-labels into EdU gave rise to DNA with Raman peaks at 2048 and 2077 cm−1, respectively, which when combined with the use of unlabeled EdU provided three resolvable Raman peaks that facilitate multiplex Raman imaging. Finally, Chen, Huang, and co-workers demonstrated SRS imaging of biomolecules including DNA, protein, and glycan in live cells using alkyne probes.115
DOI: 10.1021/acs.chemrev.6b00625 Chem. Rev. XXXX, XXX, XXX−XXX
Chemical Reviews
Review
6. CASE STUDIES
6.1. Ketosteroid Isomerase
KSI is a homodimeric enzyme with each ∼28 kDa monomer including three α-helices and a six-stranded β-sheet. Its hydrophobic active site catalyzes the conversion of 3-oxo-Δ5 ketosteroids to their Δ4-conjugated isomers (Figure 26A). Remarkably, KSI accelerates this isomerization by 11 orders of magnitude, making it one of the most efficient enzymes known.336 The substrate binds with its carbonyl group H-bonded within an oxyanion hole formed by the side chains of Asp103 and Tyr16, with the latter H-bonded to the side chain of Tyr57, which is also H-bonded to the side chain of Tyr32 (Pseudomonas putida numbering). It has been suggested that electrostatic complementarity makes an important contribution to KSI catalysis.337 To test this hypothesis and directly probe the electrostatic environments within KSI, Herschlag, Boxer, and co-workers produced SCN-labeled variants of the P. putida enzyme via cyanylation of Cys residues introduced into an otherwise cysteine-free variant generated by mutating the native cysteine residues to Ser (and also containing the Asp40Asn mutation) (Figure 26B).57 At position Met116, the introduced SCN probe is proximal to the oxyanion hole and is solvent-exposed in both the apo and liganded states. Consistent with tandem IR-NMR experiments that suggest that the probe is H-bonded at this position, a proximal ordered water molecule is apparent in the high-resolution crystal structure.338 Nonetheless, binding of the dienolate intermediate analogue equilenin induced a 2.8 cm−1 blue-shift. Using quantum mechanics/molecular mechanics (QM/MM) calculations, Layfield and Hammes-Schiffer were able to closely reproduce this shift and attribute it to a ligand-induced increase in linearity of the H-bond between the SCN probe and its bound water molecule.339 Met105 is slightly more removed from the oxyanion hole, and the SCN probe at this position is free of H-bonding due to its burial within the core of the protein, which is also consistent with the relatively narrow line width observed. The probe at Phe86 is also buried within the core of the protein, but X-ray crystallography revealed that it is H-bonded with the amide proton of Asp103.57 The relatively broad line width observed at this position was attributed to probe motion within the space created by replacement of the bulkier Phe side chain. Moreover, QM/MM calculations reported by Layfield and Hammes-Schiffer accurately reproduced the equilenin binding-induced 1.8 cm−1 blue-shift and associated it with a tightening of the H-bond between the SCN probe and Asp103.339 Leu61 is positioned within the active site, and thus the probe incorporated at this position is exposed to waters and H-bonded in the apo state but not in the liganded state. When corrected for H-bonding, the SCN probe at the different positions examined shows a 5.4 cm−1 variation in absorption frequency, which corresponds to an 8 MV·cm−1 variation in the electrostatic field projected onto each probe. This demonstrates that KSI provides a strong and remarkably heterogeneous electric field, which would be impossible to detect without the high spatial resolution afforded by the transparent window vibrational probes. Moreover, reversion of Asn40 to the wild-type Asp revealed that this heterogeneity is sensitive to the introduction of the catalytically important charge, which resulted in variable shifts, ranging from 0.8 to −1.7 cm−1 (for the probes at positions 86 and 116, respectively).
Figure 26. (A) Mechanism of the isomerization reaction catalyzed by P. putida KSI, shown for the substrate 5-androstene-3,17-dione. (B) Structure of KSI (PDB ID 1OH0) complexed with substrate mimic equilenin showing sites of (S)CN labeling (see text). (C) Spectra of SCN at position 116 (of the Asp40Asn mutant) of KSI with bound 4-F-3-Me-phenol (pKa = 9.8) showing a single absorption at 298 K and a doubled absorption at 80 K. Reproduced with permission from refs 57 and 243. Copyright 2012 and 2013 National Academy of Sciences, respectively.
The same SCN-labeled proteins were used in UV-pump IR-probe experiments to elucidate the time-resolved response to an electronic perturbation that is thought to resemble the charge displacement that occurs during catalysis.226 Specifically, the transient IR signal of each probe was monitored after electronic excitation of bound coumarin 183. After an instantaneous change in frequency due to the altered electronic structure of the coumarin, the pump−probe signals showed no further shifts and simply decayed back to their unperturbed values as the coumarin relaxed back to its ground state. In addition, polarization-dependent measurements for three of the probes revealed that there is no significant reorientation between the probes and the chromophore after photoexcitation (the probe at position 105 is oriented nearly orthogonally to the coumarin and was thus insufficiently sensitive to estimate anisotropy values). The data clearly suggest that the KSI active site, at least on the ps timescale and along the degrees of freedom sampled by coumarin 183 excitation, is relatively rigid, which suggests that catalysis is promoted by preorganization of the active site dipoles.
Interestingly, coumarin 183 itself possesses a CN group that is exposed to solvent when bound to KSI, which Boxer, Gaffney, and co-workers used to characterize active-site water dynamics.340 When coumarin 183 is free in solution, the CN absorption shifts dramatically to the red (55 cm−1) upon photoexcitation. This red-shift decreases monoexponentially by ∼22 cm−1 with a time constant of ∼12 ps, which was attributed to reorganization of the water−nitrile H-bonding. When it is bound to KSI, photoexcitation induces a similar instantaneous red-shift, but in this case a small subsequent monoexponential increase of the red-shift by ∼3 cm−1 was observed with a ∼80 ps time constant, which was suggested to result from a rigid solvation environment that resists reorganization in response to the changes in the charge distribution of the coumarin. The authors proposed that the rigidity of the water molecules within the active site of KSI facilitates its organized electrostatic environment.
The transition state of the catalyzed reaction is associated with increased electron density on the substrate carbonyl oxygen, which is stabilized by H-bonding within the oxyanion hole. This charge migration may be recapitulated by increasing the pKa of phenolate substrate analogues, and a sufficiently high pKa actually results in proton transfer from an oxyanion hole proton donor to the substrate analogue. Due to decreased charge repulsion, the same ionization appears to occur spontaneously in the Asp40Asn mutant with a pKa of 5.5−6,57,337 and the proximity of the aforementioned SCN probes to the oxyanion hole suggested that they would be useful for characterizing the residues involved. Indeed, upon increasing pH, the absorptions of the probes at positions 86, 105, and 116 at least roughly recapitulate the shifts induced upon mutation of Asn40 to Asp. Quite intuitively, the active-site proton donor was initially assumed to be Asp103;337,341 however, calculations based on this assumption were unable to even qualitatively reproduce the transparent window probe data. In contrast, assuming that one of the active-site Tyr residues was ionized resulted in good agreement. That an active site Tyr was indeed the source of the proton was later conclusively demonstrated using UV/vis and 13C NMR spectroscopy.57 Subsequent studies demonstrated that, as the pKa of the substrate analogue is increased, Tyr16 ionizes, and if the pKa is increased further, the H-bonding network rearranges such that Tyr57 bears the negative charge. This result was further supported by the low-temperature spectra of the SCN probe at position 116, which showed a single peak when the bound substrate analogue had a low pKa or a high pKa (corresponding to the negative charge centered at the substrate oxygen and Tyr57, respectively) but showed two peaks in the case of an intermediate pKa (∼10) (see Figure 26C). Finally, the spectra for the SCN probes at positions 105 and 116 predicted by assuming sequential Tyr deprotonations qualitatively agreed with the observed spectra. However, the predicted and observed spectra of the SCN probe at position 86 were not in agreement, which the authors suggested may arise from probe dynamics within the unfilled pocket created by removal of the larger Phe side chain. Because this probe also experiences a unique H-bonding environment involving the protein backbone, as opposed to water, and given the sensitivity of the predicted effects on H-bonding geometry,240 it also seems possible that slight alterations in the H-bonding geometry, too subtle to be observed in the X-ray structure, may contribute to the disagreement.
6.2. SH3 Domains
The Src-homology 3 (SH3) domain is one of the most abundant domains in human and other eukaryotic proteomes, and it mediates a range of signaling processes via intra- or intermolecular recognition of different proline-rich polypeptides (Figure 27A).342 The secondary structure of the SH3 domain contains a five-stranded β-barrel with the individual strands connected by three loops, referred to as the RT, n-Src, and distal loops, and by a single turn of a 310-helix,343,344 with the RT and n-Src loops forming the binding site (Figure 27B).345,346 Interestingly, conformational heterogeneity has been observed in several SH3 domains; for example, the N-terminal SH3 domain of the Drosophila signal adapter protein drk has been found to exist in an equilibrium mixture of folded and largely unfolded forms that exchange slowly on the NMR timescale.347 It has been suggested that this conformational heterogeneity is important for function.348−355 Crk-II is a signaling adaptor protein that contains two SH3 domains, one at its C-terminus and one at its N-terminus.356 We prepared the N-terminal domain, nSH3, via SPPS with its polypeptide backbone site-specifically labeled with (d2)Gly.24 Three of the four native Gly residues are located in the RT (Gly145 and Gly156) or distal (Gly177) loops, and one is located in the antiparallel β-sheet core (Gly180). The expected single symmetric and asymmetric absorptions were observed for each site within the RT or distal loops, but (d2)Gly180 clearly showed doubled absorptions (Figure 27C), which is evidence of conformational heterogeneity. However, an extensive analysis of C−D probes incorporated at 11 other positions throughout the protein yielded little evidence of additional heterogeneity.25 Likewise, when Phe143, Phe153, or Tyr186 were replaced with (CN)Phe, the spectra showed single absorption peaks in each case.
However, when the same Phe residues were replaced with (N3)Phe, the spectra required two pseudo-Voigt functions for fitting, and when Met181 or Leu159 was replaced with Aha, multiple discrete absorptions were observed. While this spectral complexity could be interpreted as evidence of heterogeneity, its absence when the same residues were characterized with C−D or CN probes suggests that it is introduced by the probe itself, perhaps via perturbation, Hbonding, and/or Fermi resonances. Thus, overall, the transparent window probe studies of nSH3 revealed the presence of both flexible and more rigid regions. In general, the C−D absorptions within the folded protein showed large and position-dependent red-shifts of up to 9 cm−1 relative to the corresponding free amino acids. In addition, while the spectrum of (d8)Phe153 in the unfolded state exhibited the three absorption bands expected for local Z
DOI: 10.1021/acs.chemrev.6b00625 Chem. Rev. XXXX, XXX, XXX−XXX
Chemical Reviews
Review
Figure 27. (A) SH3 domains mediate molecular recognition and signal transduction via recognition of the PXXP peptide fragments of other proteins. (B) Structure of nSH3 (PDB ID 1CKA); residues indicated are sites of probe incorporation (see text). (C) Spectra of nSH3 with (d2)Gly replacing each of the native Gly residues. Reproduced with permission from ref 24. Copyright 2009 American Chemical Society.
suggesting that interconversion is fast on the NMR timescale and highlighting the advantages of the inherently faster IR timescale. Collectively, the results suggest that different SH3 domains show different levels of flexibility and conformational (and electrostatic) heterogeneity. Interestingly, the biological function of different SH3 domains appears to require different levels of polyspecificity, and because polyspecificity is likely mediated by dynamics and conformational heterogeneity, it will be interesting to determine whether flexibility (rigidity) and conformational heterogeneity (homogeneity) are correlated with more (less) polyspecificity. Such a correlation would suggest that dynamics and conformational heterogeneity contribute to, and were perhaps evolved for, SH3 function. To explore the stability and equilibrium folding of an SH3 domain, we characterized the spectra of the C−D-labeled nSH3 as a function of temperature.63 Nine of 11 positions explored showed an identical transition between the folded and unfolded states with Tm values of 55.6 ± 0.1 °C and ΔG°= 3.4 ± 0.1 kcal mol−1 at 25 °C (Figure 13A). The two exceptions were (d8)Phe143, which showed a slightly higher Tm, likely due to insensitivity to the unfolding transition and sensitivity to a postunfolding thermally induced transition, and (d4)Tyr186, which is solvent-exposed and showed no transition due to similar environments in the folded and unfolded states. The data clearly show that nSH3 (un)folds via a global, single twostate transition. As described in section 5.4, nSH3 has also served as a system to demonstrate the existence of Ni+1−H···Ni backbone Hbonds (Figure 16),40 whose stability appears to explain the otherwise puzzling extent to which the bridge region of the Ramachandran plot is populated. Remarkably, 16 out of 58 amino acids in nSH3 adopt a bridge-region structure consistent
symmetry-related ortho and meta C−D bonds in a homogeneous environment, its spectrum in the folded protein showed unique absorptions for each of the five C−D bonds, suggesting that the degeneracy was lifted by an anisotropic local electric field. A similar effect was also observed with (d3)Ala172, for which a single absorption was observed for the two asymmetric stretches in unfolded protein, but two absorptions, separated by ∼8 cm−1, were observed in the folded protein. While the spectra of (CN)Phe at the three positions explored in the folded protein also showed position-dependent absorption frequencies,63 their temperature dependence (see section 5.1) suggested that the differences are due to differing (and artificial) H-bonding interactions.7 Finally, (N3)Phe incorporated at the same sites also showed position-dependent absorption frequencies,63 but the contributions of electrostatics and H-bonding have not yet been deconvoluted. The SH3 domain from yeast Sho1 membrane protein (Sho1 SH3), which recognizes the PXXP motif of Pbs2 as part of the osmosensory signaling pathway, has also been characterized using transparent window IR probes. Thielges and co-workers introduced (CN)Phe via amber suppression into four positions of the protein and characterized the effects of Pbs2 peptide binding.218 The largest binding-induced shift was observed for (CN)Phe25, which is part of the conserved RT loop and distal from the binding site, suggesting that the shift resulted from secondary effects propagated through conformational changes in other parts of the protein. More recently, these authors introduced (d3)Pro at either of the PXXP Pro positions of the Pbs2 peptide and showed that, while only a single conformation was observed for the free peptide, two conformations were observed upon binding to Sho1 SH3,271 suggesting that recognition may be mediated by an induced-fit mechanism. Interestingly, this heterogeneity was not observed by NMR, AA
DOI: 10.1021/acs.chemrev.6b00625 Chem. Rev. XXXX, XXX, XXX−XXX
Figure 28. (A) Catalytic cycle of DHFR. (B) Structure of DHFR (PDB ID 1RX2) indicating positions of probe incorporation (see text), as well as NADP+ and FOL. The Met20 loop is shown in green. (C) Detailed view of active site. (D) Spectra of (d4)Tyr100 DHFR in free or variously complexed forms. Reproduced with permission from ref 22. Copyright 2009 Wiley.
with the formation of such an H-bond. Interestingly, 15 of these residues are located in a β-turn, a loop, or the 310-helix, with only one being located in its β-sheet core, suggesting that they may contribute to the stability of turns and turnlike motifs. While these interactions are virtually never invoked, likely due to the assumed poor H-bond affinity of amide nitrogens, several reports from the literature support the possibility that they are common and possibly important. First, Pohl observed a significant number of these interactions in the high-resolution X-ray crystal structures of myoglobin, lysozyme, α-chymotrypsin, RNase S, and carboxypeptidase A.357 Later, Gieren and co-workers noted a linear correlation between ϕ and ψ angles of four proteins, which suggested that the protein adjusts its structure to preserve the interaction.358 In addition, Karplus presented evidence of the widespread occurrence of these interactions from an analysis of 70 diverse proteins.288 Recently, Deepak and Sankararamakrishnan analyzed 5336 high-resolution protein X-ray crystal structures with resolution ≤1.8 Å, as well as 64 protein structures determined by neutron diffraction with resolution ≤0.9 Å, and identified 8181 examples of Pro residues that appear to engage in Ni+1−H···Ni interactions.359 Consistent with the analysis of nSH3, 72−75% of the identified residues are located in β-turns and loops, while the remainder are located in helices. Thus, it appears that Ni+1−H···Ni H-bonds may be common, that they may make important contributions to protein structure, stability, and dynamics, and that C−D probes are ideal tools for their identification and characterization.
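As a rough illustration of the two-state analysis of nSH3 thermal unfolding described in this section, the following Python sketch reconstructs a folded-fraction curve from the reported Tm (55.6 °C) and ΔG° (3.4 kcal mol−1 at 25 °C). It assumes a temperature-independent unfolding enthalpy (ΔCp = 0), which is a simplification introduced here and not necessarily the model used in ref 63. All function and constant names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def celsius_to_kelvin(t_c):
    """Convert a temperature from degrees Celsius to kelvin."""
    return t_c + 273.15


def delta_h_at_tm(dg_ref, t_ref_k, tm_k):
    """Unfolding enthalpy implied by dG at a reference temperature.

    ASSUMPTION: dCp = 0, so dG(T) = dH * (1 - T / Tm).
    """
    return dg_ref / (1.0 - t_ref_k / tm_k)


def delta_g_unfold(t_k, dh_m, tm_k):
    """Free energy of unfolding (kcal/mol) at temperature t_k, with dCp = 0."""
    return dh_m * (1.0 - t_k / tm_k)


def fraction_folded(t_k, dh_m, tm_k):
    """Folded population for a two-state equilibrium, folded <-> unfolded."""
    k_unfold = math.exp(-delta_g_unfold(t_k, dh_m, tm_k) / (R_KCAL * t_k))
    return 1.0 / (1.0 + k_unfold)


# Values reported in the text for nine of the 11 C-D-labeled positions.
TM_K = celsius_to_kelvin(55.6)
T_REF_K = celsius_to_kelvin(25.0)
DH_M = delta_h_at_tm(3.4, T_REF_K, TM_K)

if __name__ == "__main__":
    print(f"implied dH (dCp = 0): {DH_M:.1f} kcal/mol")
    for t_c in (25.0, 45.0, 55.6, 65.0):
        f = fraction_folded(celsius_to_kelvin(t_c), DH_M, TM_K)
        print(f"{t_c:5.1f} C  fraction folded = {f:.3f}")
```

Under these assumptions the implied unfolding enthalpy is roughly 36.5 kcal mol−1 and the folded fraction at 25 °C is about 0.997. A nonzero ΔCp would change the shape of the curve away from Tm, so the numbers are illustrative only.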
6.3. Dihydrofolate Reductase
The enzyme DHFR, which catalyzes the reduction of 7,8-dihydrofolate (DHF) by hydride transfer from a nicotinamide adenine dinucleotide phosphate (NADPH) cofactor to produce 5,6,7,8-tetrahydrofolate (THF), has emerged as a paradigm for understanding the role of protein structure, dynamics, and electrostatics in enzyme catalysis.360−362 In addition, DHFR has attracted attention because its inhibition is the basis of antibiotic (trimethoprim), antiprotozoal (pyrimethamine and proguanil), and antineoplastic (methotrexate, MTX) chemotherapy,363 making it perhaps the most conserved chemotherapeutic target. The E. coli protein is 159 residues in length
and folds into two distinct domains, an α-helical cofactor-binding domain and a β-sheet catalytic domain.361 During catalysis and substrate turnover, the enzyme cycles through five complexes (Figure 28A): the NADPH binary complex (E:NADPH) binds substrate and forms the ternary Michaelis complex (E:NADPH:DHF), within which proton and hydride transfer then yield the E:NADP+:THF ternary product complex, from which NADP+ dissociates to yield the E:THF complex, which then reloads reduced cofactor (E:NADPH:THF) and, in the rate-limiting step, dissociates THF to regenerate the E:NADPH complex. In addition to ligand binding, the various complexes are differentiated by the conformation of the protein, and most notably by the conformation of a loop formed by residues 9−24 (referred to as the “Met20 loop”), which assumes an occluded conformation that blocks access of the nicotinamide portion of the cofactor to the binding site when THF is bound or a closed conformation where it helps bury the nicotinamide ring in the E:NADPH and E:NADPH:DHF complexes (Figure 28B). To facilitate characterization, several stable complexes have been developed as mimics of the transiently populated species, with the ternary complex with folate (FOL) and NADP+ being used as a stable mimic of the Michaelis complex and the ternary complex with methotrexate (MTX) and NADPH used as a mimic of the transition state.361
To facilitate our initial studies of E. coli DHFR, variants with one or two of the enzyme’s four native Met residues replaced with (d3)Met were expressed, with unlabeled sites mutated to Leu6 (which does not significantly impact catalysis364). The spectrum of (d3)Met1 or (d3)Met42 in the apoenzyme was similar to that of the free amino acid and was independent of cofactor or substrate binding, consistent with their solvent-exposed positions.
However, the absorption bands of (d3)Met16 and (d3)Met20, both located in the Met20 loop, were broadened and red-shifted in the apoenzyme, suggesting that they experience a more apolar and conformationally heterogeneous environment. The (d3)Met20 absorptions showed no significant changes upon binding FOL or NADPH and MTX, but blue-shifted upon binding NADPH or NADP+ and FOL, suggesting that they are sensitive to the nature of the ligands, but only in the closed conformation. In contrast, binding to FOL, NADPH, NADP+, or MTX induced a significant blue-shift and narrowing of the (d3)Met16 absorptions, suggesting that ligand binding generally induces this part of the Met20 loop to adopt a more polar and conformationally homogeneous environment.
Hammes-Schiffer, Benkovic, and co-workers studied four environments within DHFR as a function of its catalytic cycle through incorporation of SCN via modification of an introduced Cys residue in a mutant where the two native Cys residues were mutated to Ala or Ser (Figure 28B).58,229 When the SCN probe was used to replace Thr46, which is positioned between the substrate and the cofactor, the largest changes in absorption frequency were observed at steps involving the interconversion of the closed and occluded states: a 4.1 cm−1 red-shift upon proceeding from the pseudo-Michaelis complex to the initial product complex (i.e., between the closed E:NADP+:FOL complex and the occluded E:NADP+:THF complex) and a 5.5 cm−1 blue-shift upon release of THF (i.e., between the occluded E:NADPH:THF complex and the closed E:NADPH complex). Differences between the pseudo-Michaelis and the initial product complexes were modeled computationally, the results of which suggested that, while
small changes arise from interactions with Asn18 and Ser49, the majority of the shift arose from displacement of the charged nicotinamide ring from the binding site and changes in the interactions with a proximal water molecule. When the probe was incorporated at position 54 or 28, which are within the FOL binding site, the largest changes were observed upon addition of FOL to the E:NADPH complex (2.4 cm−1 blue-shift and narrowing) and between the Michaelis−Menten complex and the initial product complex (2.9 cm−1 red-shift), suggestive of altered electrostatics and rigidity. The shifts of the probe incorporated at position 54 were again modeled in the closed-to-occluded transformation. In this case, the observed change in absorption frequency was smaller (a 1.0 cm−1 red-shift), and the computational data suggested that it resulted from the repositioning of the side chain of Arg57. Finally, when the SCN probe was used to replace Met20 in the Met20 loop, significant changes in the frequency were observed between all steps of the catalytic cycle, with the largest shifts observed between the closed and occluded conformations, which were attributed to changes in H-bonding interactions with solvent.
Despite intensive study, the detailed mechanism by which hydride transfer is catalyzed within the E:NADPH:DHF complex has remained controversial. Specifically, while it is believed that the nitrogen of the reduced double bond (N5; Figure 28C) must be protonated, the source of the proton has been debated as no ionizable moiety appeared to be suitably positioned in the active site. In addition, the origin of the elevated pKa, which is elevated from