Quantum Refinement of Protein Structures: Implementation and

Oct 26, 2010 - system in DsRed is thus extended and gives rise to a red shift ... protein topology files from the CNS and the CHARMM libraries. This m...
0 downloads 0 Views 4MB Size
J. Phys. Chem. B 2010, 114, 15413–15423

15413

Quantum Refinement of Protein Structures: Implementation and Application to the Red Fluorescent Protein DsRed.M1 Ya-Wen Hsiao, Elsa Sanchez-Garcia, Markus Doerr, and Walter Thiel* Max-Planck-Institut fu¨r Kohlenforschung, Kaiser-Wilhelm-Platz 1, D-45470, Mu¨lheim an der Ruhr, Germany ReceiVed: August 26, 2010; ReVised Manuscript ReceiVed: September 30, 2010

Quantum refinement is an improvement upon the molecular mechanics (MM)-based crystallographic refinement. In the latter, X-ray data are supplemented with additional chemical information through MM force fields, whereas quantum refinement describes crucial regions of interest in the macromolecule by quantum mechanics (QM) instead of MM. In this paper, we report the implementation of quantum refinement in the ChemShell QM/MM framework and its application in an investigation of the chromophore structure of the red fluorescent protein DsRed.M1. Both mechanical and electrostatic QM/MM embedding schemes are implemented and tested. In the quantum refinement of DsRed.M1, the anionic red acylimine chromophore adopts a nearly orthogonal arrangement (rather than a cis or trans form), and the bond lengths in the acylimine moiety are more consistent with a phenolate (rather than a quinoid) structure. These findings are in contrast to the structure deduced from a standard crystallographic refinement (PDB: 2VAD), but in agreement with our earlier results from a purely theoretical QM/MM study. On the other hand, the quantum refinement of the anionic acylimine form of DsRed.M1 yields a hydrogen bonding network around the chromophore, especially with regard to the arrangement of the water molecules and the Glu148 residue, that is closer to the 2VAD structure than to the previously optimized QM/MM structure. In our earlier study the initial classical molecular dynamics (MD) simulations during QM/MM setup apparently exaggerated the mobility of the water molecules around the chromophore. On the basis of the present results, it seems likely that the Glu148 residue is protonated in the DsRed.M1 protein. The calculation of electronic excitation energies allows for further assessment of the proposed structures, especially in the chromophore region. Using a combination of density functional theory and multireference configuration interaction (DFT/MRCI), we find excellent agreement between experiment and theory only for the structures obtained from quantum refinement and from QM/MM optimization, but not for the 2VAD structure. The present case study on DsRed.M1 thus demonstrates the merits of combining reliable theoretical and experimental information in the determination of protein structures. Introduction 1

Protein coordinates deposited at the Protein Data Bank (PDB) are the results of crystallographic model-building and refinement processes. In the standard X-ray crystallographic refinement of macromolecules, owing to the limited resolution, it is difficult to fully determine the total structure in a purely experimental manner. Additional chemical information is often supplemented in the form of a molecular mechanics (MM) force field. The energy function of such an MM-based refinement is given by:

Etotal ) EMM + wxrefExref

(1)

where EMM is the MM total energy and Exref is a penalty function indicating the deviation from the raw data compared to the structure factors calculated from the model. The exact evaluation of Exref varies depending on the refinement program. The weight factor wxref determines the relative importance of the two terms in eq 1 and also serves to make EMM and Exref compatible, since EMM is normally given in energy units while Exref is not. Consequently, the refined crystal structures will to some extent depend on the parametrization of the force field employed. This is normally of little concern for standard amino acids or * To whom correspondence should be addressed. E-mail: thiel@ mpi-muelheim.mpg.de.

nucleic acids where reliable force fields are available. However, this is often not the case for heterocompounds such as substrates, inhibitors, or metal centers in proteins. The quantum refinement developed by Ryde et al.2 offers a remedy. It follows the same basic strategy as the conventional X-ray crystallographic refinement, with the proviso that the special sites of interest in protein structures are characterized by quantum mechanics (QM) instead of an MM force field. The structure as a whole is still subject to the X-ray data restraints. This approach is in the spirit of the QM/MM method,3 and the energy function of quantum refinement can thus be written in terms of a restrained QM/MM method:

Etotal ) EQM/MM + wxrefExref

(2)

Quantum refinement has also been developed for NMR4 and extended X-ray adsorption fine structure (EXAFS)5-7 spectroscopy and hence has a wider scope to interpret experimental data. Merz et al. have proposed an analogous QM/MM X-ray structure refinement8 and applied it in several studies addressing, for example, the assignment of Asp protonation states in β-secretase,9 the conformations of benzamidinium inhibitors in protein/ ligand complexes,10 and the structure of zinc metalloenzymes.11 The approach of Ryde et al. makes use of density functional theory (DFT) for the QM calculations and of the Crystallography

10.1021/jp108095n  2010 American Chemical Society Published on Web 10/26/2010

15414

J. Phys. Chem. B, Vol. 114, No. 46, 2010

Figure 1. Simplified scheme representing the formation and maturation of the chromophore in the DsRed protein. Shown are the neutral transpeptide chromophore and the anionic cis-acylimine chromophore in its quinoid and phenolate forms.

and NMR system (CNS)12 for evaluating EMM and Exref, whereas Merz et al. employ semiempirical or ab initio electronic structure methods for the QM region, the AMBER force field13 for modeling the protein, and CNS for the target function Exref. In this paper, we report the implementation of the quantum refinement functionality in the ChemShell QM/MM software,14,15 both for mechanical and electrostatic embedding.3 We use Turbomole16 and DL_POLY17 for the QM and MM calculations, respectively, and CNS for evaluating the target function Exref and the corresponding gradient. The MM force field parameters for the protein are taken from CHARMM22.18 As a first test to the implementation of quantum refinement in ChemShell, we apply it to investigate the structure of the DsRed.M1 protein. DsRed is a red fluorescent protein from the Discosoma coral. The chromophore is formed autocatalytically from the Gln66Tyr67-Gly68 peptide in the presence of molecular oxygen. It is assumed that the reaction starts with a green fluorescent protein-like chromophore with a trans-peptide bond between Phe65 and Gln66, followed by a maturation step that yields the mature DsRed chromophore (Crq66) with a cis-acylimine linkage between Phe65 and Gln66. The conjugation of the system in DsRed is thus extended and gives rise to a red shift to the absorption energy.19-22 Figure 1 summarizes the formation and maturation of the chromophore in the DsRed protein. The maturation, however, is often slow and incomplete, so that a mixture of green and red chromophores is formed.19,21,23 In addition, due to the tendency to form tetramers,19 the applicability of the wild type DsRed is limited. Therefore, genetic modifications have been introduced to search for a highperformance monomeric red fluorescent protein. DsRed.M1 is a fast maturing monomeric variant of DsRed with similar spectroscopic properties but much lower fluorescence intensity. On the basis of the electron density and the fluorescence spectral properties, dual partial occupancy conformations have been assigned23 to the chromophore in the DsRed.M1 structure (PDB code 2VAD): the cis-conformation with sp2 hybridization at the R carbon of Gln66 (CA1 according to the 2VAD nomenclature) results in the red form, while the green trans-conformer has sp3 hybridization at CA1 of Gln66. The excitation maximum is at 2.23 eV for the cis-conformer,

Hsiao et al. which is responsible for the red fluorescence, while the transconformer has an excitation maximum at 2.58 eV.24 Our previous QM/MM study of DsRed.M125 differs from the 2VAD model in the cis-acylimine structure connecting Gln66 and Phe65 and in a few hydrogen bonding interactions in the immediate neighborhood of the chromophore. In this work, we further address these structural differences by applying quantum refinement, where the X-ray raw data provide restraints that may help to avoid errors in the optimization arising from the inherent limitations of the chosen theoretical approach (e.g., limited accuracy of DFT functional and MM force fields). The transpeptide structure is also studied by quantum refinement although there are only minor differences between the 2VAD and the QM/MM structures in this case. Conventional X-ray crystallographic refinements are generally performed without accounting for electrostatic interactions which corresponds to mechanical embedding in QM/MM terminology. Experience shows that the use of explicit electrostatics in standard X-ray refinements can lead to biased models,26 especially if the MM force fields are not well parametrized. Since QM/MM calculations on proteins normally employ electrostatic embedding,3 we have performed quantum refinements on DsRed.M1 using both mechanical and electrostatic embedding, to assess the relative merits of these two approaches. Methods Implementation in ChemShell. The implementation of quantum refinement in ChemShell takes advantage of the existing QM/MM functionality of ChemShell. The covalent bonds between the QM region and the remainder of the protein are replaced by hydrogen atoms to truncate the QM system, and the standard QM/MM boundary schemes available in ChemShell are used. For a given QM/MM setup, ChemShell provides the QM/MM energy EQM/MM and its gradient with respect to the nuclear coordinates. Quantum refinement via ChemShell requires calls to CNS to calculate the target function Exref and its gradient. All MM energy terms are excluded in the CNS calculation, since they are evaluated by the MM program that is called from ChemShell in the standard QM/MM calculation. EQM/MM and Exref are combined according to eq 2. The gradients of EQM/MM and Exref are combined in an analogous manner. Technically, this is done by separating the gradient of Exref from CNS into contributions that refer to the QM and MM atoms (making use of the appropriate ChemShell information) and by adding these contributions to the QM and MM gradients from the QM/MM calculation that are internally stored in separate arrays. Atoms can be held fixed during the refinement by setting the corresponding gradient components to zero (again making use of the available ChemShell control variables). In each iteration of the refinement, the coordinates are then updated according to the combined gradients, and new input files are generated for the next QM, MM, and CNS calculations. This procedure is repeated until the desired convergence criteria are satisfied. The atomic order for some protein residues in the CNS topology files differs from the conventions in CHARMM. Therefore, a mapping table between CHARMM and CNS is constructed at the beginning of the refinement based on the protein topology files from the CNS and the CHARMM libraries. This mapping is applied in each iteration to generate a proper input coordinate file for CNS. Computational Details. The DsRed protein structure and the X-ray structure factors were obtained from the PDB (PDB code

Quantum Refinement of Protein Structures

J. Phys. Chem. B, Vol. 114, No. 46, 2010 15415

2VAD, resolution 1.59 Å). On the basis of the evidence from the electron density and from spectroscopic properties, two conformers have been assigned23 for the chromophore in 2VAD: a trans-peptide (the green form, ACRQ in 2VAD) and a cisacylimine (the red form, BCRQ in 2VAD). Protein coordinates containing either of these conformers were supplemented with hydrogen atoms using the HBUILD module in CHARMM27 in the same way as described in our previous QM/MM study on DsRed.25 The resulting structures were then investigated separately. Only the anionic state of the chromophore was considered. Hydrogen positions were relaxed with 10 000 steps of the steepest descent algorithm followed by 10 000 steps of the adopted basis Newton-Raphson algorithm in CHARMM, while all other heavy atoms were fixed. Due to the modular architecture of ChemShell, the quantum refinement can be carried out with a wide range of QM and MM treatments, the choice being based on the usual compromise between the desired accuracy and the computational effort. In the present case, Turbomole16 was used to evaluate the QM energy at the DFT level using the BP86 functional28,29 and the 6-31G* basis set.30 This combination has been successfully applied in several studies by Ryde et al.2,4-7,31 It typically reproduces experimental bond lengths within 0.02 Å for organic molecules. The QM region consisted of the chromophore and its neighbors Phe65 and Ser69. DL_POLY17 was used to evaluate MM energies for the complete protein structure with hydrogen atoms added, while CNS was used to compute Exref for the structure without hydrogen atoms. The MM force field parameters for the protein were taken from CHARMM22.18 Starting from the CNS input template minimize.inp (http:// cns.csb.yale.edu/v1.2/), molecule-specific data were included into the input file: file names of the topology and parameter data, space group and unit cell size, structure-factor file name, and resolution range. The minimization was turned off in CNS by setting the minimization step to zero, since the minimization of the target function, eq 2, was done by one of the optimizers available in ChemShell. The B factors were held fixed at the values in the PDB file, so that model errors from the coordinates would not be absorbed by the fitting of the B factors. The R factor gives an indication as to how well the model structure reproduces the experimental structure factors. It is defined as

R)

∑ ||Fo(hkl)|-|Fc(hkl)|| hkl

∑ |Fo(hkl)|

(3)

hkl

where Fo denotes the experimental structure factor amplitudes, Fc are the calculated amplitudes of the proposed atomic model, and hkl are the indices of the reciprocal lattice points of the crystal. To avoid overfitting of the diffraction data, a crossvalidation is provided by the Rfree factor,32 which is obtained from a set of randomly chosen reflections (typically 5% of the data) that are set aside from the beginning and are hence not included in the refinement. Ideally, Rfree should be equal to R, but in practice Rfree is normally larger. Following the original experimental work,23 5.1% of the reflection data were used for cross validation to obtain Rfree. For the structure containing the cis-conformer, CNS gives an R factor of 0.1959 and an Rfree of 0.2127: in both cases the values differ by less than 2% from the reported values of 0.1771 and 0.197123 obtained using REFMAC.33 For the structure

containing the trans-conformer, R and Rfree from CNS are 0.1958 and 0.2122, respectively, which again lie within 2% of the above-mentioned REFMAC values. Such differences of up to 2% between two crystallographic refinement programs are generally considered acceptable. The refinement target function was based on the maximum likelihood using amplitudes (MLF), so the CNS definition of Exref is

Exref ) EMLF )

∑ σ 1 2 (|Fo(hkl)|-〈|Fo(hkl)|〉)2 hkl

(4)

ML

where the expected value of 〈|Fo(hkl)|〉 and the corresponding variance σML are derived from the observed Fo value, the calculated Fc value, and a cross-validating value σA which quantifies the effects of model errors.26 The MLF method produces less model bias than the standard least-squares residual, because it prevents fitting noise in the data or accumulating systematic errors in the model, and in addition, it requires no prior phase information.26 CNS reads PDB coordinates as input which contain only three decimal digits by default. As suggested by Ryde et al.,2 we have chosen to work with six decimal digits in the ChemShell implementation to avoid undue numerical truncations. In quantum refinement with mechanical embedding,34 the electrostatic interactions are turned off in the MM program, and no MM point charges are included in the QM calculation. In quantum refinement with electrostatic embedding, the usual QM/ MM setup is applied.25 In brief, the protein is solvated in a sphere of TIP3P water molecules35 of a radius of 30 Å. Using the CHARMM program (version 31b1), the water molecules are equilibrated, and the system is prepared by going through cycles of rehydration and constant temperature molecular dynamics (MD) as described previously,25 except that the protein coordinates are not allowed to change in the present setup. Electrostatic interactions are included in the MM force field by using point charges from the CHARMM parametrization. In the QM region, the fixed MM charges are included into the one-electron part of the QM Hamiltonian, and the electrostatic QM/MM interactions are evaluated from the QM electrostatic potential and the MM atomic charges.34 The charge-shift scheme36,37 is applied to prevent overpolarization at the QM/ MM boundary. No cutoff for electrostatic interactions is used. The structure determination is considered converged when the maximum gradient component is less than 0.004 au (≈ 20 kJ mol-1 Å-1). When tightening this criterion by a factor of 10 to 0.0004 au the QM energy of the final optimized structure changed by less than 1 kcal/mol in QM/MM test calculations on cis-acylimine in which only the QM region was relaxed. The best structure is determined in quantum refinement by a grid scan of wxref in eq 2, normally in steps of 0.1 between 0 and 1. The structure that has the lowest Rfree factor (and the smallest |Rfree - R| if necessary) is chosen. Using a criterion based on the Rfree factor has been suggested by Brunger et al.26,32,38 To investigate the effects of applying different X-ray restraints, quantum refinement calculations were performed with two protocols: (i) optimizing only the quantum region with the surrounding protein being fixed, and (ii) optimizing all atoms in the protein; in this case, the outer TIP3P water sphere that is added for electrostatic embedding is held fixed. The program VMD39 was used to visualize the protein structures and to calculate the root-mean-square deviations

15416

J. Phys. Chem. B, Vol. 114, No. 46, 2010

Hsiao et al.

TABLE 1: Refinement of the Anionic cis-Acylimine Form: Lowest Rfree Values, Corresponding wxref Weights, Lowest πfπ* Transition Energies, and Associated Oscillator Strengths (DFT/MRCI) from All Protocols (See Text); Eexp ) 2.23 eV protocol

wxref

Rfree

Eπfπ* (eV)

osc. str.

2VAD (cis-acylimine) nonelectrostatic (i) nonelectrostatic (ii) electrostatic (i) electrostatic (ii) QM/MM (i) QM/MM (ii)

0.1 0.9 0.0 0.9 0.0 0.0

0.2127 0.2122 0.2146 0.2121 0.2147 0.2121 0.2699

1.88 2.16 2.13 2.25 2.21 2.25 2.22

0.371 1.134 1.048 1.214 1.227 1.214 1.209

(rmsd) between different structures. For protocol i and ii, the rmsd was computed without and with structural alignment, respectively. DFT/MRCI Calculations. Vertical excitation energies for given structures were calculated using the density functional multireference configuration interaction (DFT/MRCI) method of Grimme and Waletzke.40 Using standard parameters, the typical error in DFT/MRCI excitation energies is around 0.2 eV compared to the experiment.40 Taking high-level ab initio results as the reference, the standard DFT/MRCI approach typically underestimates excitation energies by 0.2 eV, as shown in a recent benchmark study.41 The QM region in the DFT/MRCI calculations consisted of the chromophore Crq66, parts of residues Phe65 (atoms C, O, CA) and Ser69 (atom N), and the required hydrogen link atoms. CHARMM atomic charges were included as external point charges in the QM calculations to account for the electrostatic QM/MM interactions. The required DFT wave functions were generated using the BH-LYP functional42 and the TZVP basis set43,44 from the Turbomole basis set library. All valence electrons (identified by their orbital energies) were correlated. For the required DFT/MRCI parameters,40 we used a set developed in the Grimme group45 which speeds up the calculations without sacrificing accuracy.46 Ten CI roots with singlet multiplicity were calculated in each case. The MRCI reference space was determined iteratively by first performing a DFT/ MRCI calculation with a configuration selection threshold δEsel ) 0.6 au above the highest reference space energy. The initial reference space included all configurations that can be generated by up to doubly exciting 10 electrons within the 10 frontier orbitals. In the next iteration δEsel was increased to 0.8 au, and all configurations with a squared coefficient of at least 0.003 in the previous CI calculation were included in the reference space. Further iterations were performed until the energies of all calculated roots were converged to 10-6 au. Prior to the spectroscopic investigation of the original 2VAD structure (PDB) by DFT/MRCI, the following preparations were done: hydrogen atoms were added and relaxed using the program CHARMM and then further optimized at the presently applied QM/MM level, that is, using BP86/6-31G* and CHARMM22, with the conventions described above. All atoms from the 2VAD structure were kept frozen. Results and Discussion Refining DsRed.M1 Containing the cis-Acylimine Chromophore. The lowest Rfree values and the corresponding wxref weights from all the refinement protocols are collected in Table 1, together with the lowest computed πfπ* transition energies and oscillator strengths. The starting structure for all refinements was the cis-acylimine conformer from the 2VAD model.

Figure 2. Overlays of cis-acylimine chromophore structures (Crq66, Phe65, and Ser69) from protocol i: top, 2VAD (green) and nonelectrostatic refinement (pink); middle, 2VAD (green) and electrostatic refinement (purple); bottom, nonelectrostatic and electrostatic refinement.

Protocol i. Protocol i allows for structure optimization only within the QM region, with the rest of the protein being fixed in space, and thus focuses on the structural changes at the site containing the chromophore. Figure 2 shows the overlays between the chromophore structures from 2VAD and from the nonelectrostatic and electrostatic refinements. In the nonelectrostatic refinement (mechanical embedding), the rmsd for the heavy atoms of the three residues Phe65-Crq66Ser69 is 0.19 Å between the final and the initial structures. The two largest atomic movements occur for O of Phe65, 0.58 Å, and N of Crq66, 0.54 Å. They are accompanied by large changes in the dihedral angles: CA1sNsCsO from 171° to 106° and C1sCA1sNsC from 110° to 169°. In the overlay of 2VAD and the nonelectrostatic refined structure (top panel of Figure 2), there are clear structural changes in the acylimine moiety and in the side chain of Gln66. Bond lengths in this region (Table 2) are more consistent with a phenolate: longer C1sCA1 and NsC bonds and a shorter CA1sN bond. In the electrostatic refinement, a standard QM/MM calculation (i.e., with wxref ) 0 and with electrostatics included) turns out to have the lowest Rfree value and thus represents the best solution in this protocol. The rmsd for the heavy atoms of the three residues Phe65-Crq66-Ser69 is 0.25 Å between the final and the initial structures. As in the nonelectrostatic refinement, the largest atomic movements are found for O of Phe65: 0.77 Å, and N of Crq66: 0.70 Å. They change the dihedral angles CA1sNsCsO to 87° and C1sCA1sNsC to 177°. The middle panel of Figure 2 illustrates the shift in position for these atoms (N and O). The bond lengths are again more consistent with a phenolate and differ by less than 0.03 Å from those of the nonelectrostatic refinement. The outcomes of the two different quantum refinements are quite similar. As shown in the lower panel of Figure 2, the two

Quantum Refinement of Protein Structures

J. Phys. Chem. B, Vol. 114, No. 46, 2010 15417

TABLE 2: Comparison of Selected Bond Lengths (Å) and Torsion Angles (deg) in the cis-Acylimine Chromophore: 2VAD Literature Values,23 Results from the Present Nonelectrostatic and Electrostatic Quantum Refinements Using Protocols i and ii, and Values from the Present QM/MM Optimizations with These Protocols (See Text) protocol 2VAD nonelectrostatic (i) nonelectrostatic (ii) electrostatic (i)a QM/MM (i)a electrostatic (ii) QM/MM (ii)

nonelectrostatic (ii) electrostatic (i)a QM/MM (i)a electrostatic (ii) QM/MM (ii) a

CA1sN

NsC

CdO

1.388 1.443 1.445 1.434 1.446 1.459 1.459 1.445 1.455

1.324 1.307 1.310 1.300 1.310 1.303 1.303 1.297 1.305

1.314 1.360 1.369 1.353 1.366 1.385 1.385 1.362 1.381

1.222 1.243 1.242 1.235 1.242 1.237 1.237 1.230 1.238

with restraint no restraint with restraint no restraint with restraint no restraint with restraint no restraint

protocol 2VAD nonelectrostatic (i)

C1sCA1

with restraint no restraint with restraint no restraint with restraint no restraint with restraint no restraint

CA1sNsCsO

C1sCA1sNsC

CD2sCG2sCA2sN2

171.2 105.5 94.5 113.2 93.6 86.7 86.7 106.2 91.5

110.2 169.2 173.0 167.4 172.2 176.9 176.9 170.4 174.1

1.8 0.6 7.2 1.4 10.7 9.8 9.8 0.4 8.0

Identical results since the optimum weight factor in the electrostatic quantum refinement using protocol i is zero (see text).

TABLE 3: Comparison of Interatomic Distances (Å) in the Hydrogen Bonding Network around the cis-Acylimine Chromophore: 2VAD Literature Values,23 Results from the Present Nonelectrostatic (I) and Electrostatic (II) Quantum Refinement and from the Present QM/MM Calculations, and Values from Our Previous QM/MM Study25 a Glu148refinement

Glu148

QM/MM

QM/MM

refinementb

QM/MM

model

2VAD

I

II

present

ref 22

I

II

present

OG Ser146-OH Crq O W42-OH Crq NE2 His163-OH Crq NH2 Arg95-O2 Crq O W41-O Crq O W79-O Crq NE2 Gln213-OE1 Crq OE1 Gln42-NE1 Crq N Glu215-O Gln42 OE2 Glu215-O W44 OE2 Glu215-O W138 OE1 Glu215-N2 Crq OE1 Glu148-O W123 OG Ser197-O W44 OG Ser197-O W123 NZ Lys70-O W44 NZ Lys70-OE1 Glu148 MADc

2.57 2.66 2.83 2.93 2.93 2.83 2.79 3.33 2.94 2.79 2.59 2.75 4.95 2.83 2.85 2.59 3.57 -

2.63 2.65 2.85 2.93 2.90 2.80 2.80 3.31 2.96 2.72 2.63 2.80 4.89 2.88 2.86 2.83 3.40 0.05

2.59 2.64 2.82 2.89 2.89 2.81 2.79 3.30 2.97 2.77 2.61 2.73 4.90 2.88 2.81 2.78 3.24 0.05

2.66 2.69 2.78 2.71 2.77 2.70 2.85 2.94 2.90 2.92 2.79 2.76 4.64 2.75 2.72 2.80 2.60 0.19

2.87 2.79 2.82 2.71 2.75 2.77 2.85 3.07 2.96 3.32 2.79 2.93 3.04 2.84 2.84 2.87 2.64 0.31

2.64 2.68 2.85 2.93 2.90 2.80 2.81 3.29 2.96 2.73 2.66 2.84 4.85 2.92 2.89 2.86 3.37 0.06

2.58 2.75 2.80 2.85 2.91 2.78 2.81 3.27 2.96 2.73 2.60 2.77 4.91 2.82 2.80 2.68 3.47 0.04

2.65 3.42 2.83 2.72 2.79 2.70 2.88 2.98 2.90 2.74 2.74 2.83 4.84 2.75 2.73 2.70 3.33 0.16

a Separate evaluations were done using an anionic and neutral Glu148 residue (see text). Boldface values differ by more than 0.1 Å from 2VAD. b wxref ) 0.5. c Mean absolute deviation from the 2VAD values.

structures are almost identical, except for the amide tail of Gln66 and the carbonyl oxygen of Phe65. The largest differences occur in Gln66 where NE1 (CB1) moves the most in the nonelectrostatic (electrostatic) refinement, by 0.35 (0.32) Å. Protocol ii. In protocol ii the positions of all atoms in the system are refined using QM + X-ray forces for the QM atoms and MM + X-ray forces for the MM atoms. In the nonelectrostatic quantum refinement, the rmsd for the heavy atoms of the three residues Phe65-Crq66-Ser69 is 0.13 Å between the final and the initial structures, even smaller than in protocol i. The atoms of the chromophore with the largest change in position are the same as before: atoms N of Crq66 and O of Phe65 move by 0.48 and 0.37 Å, respectively, which causes changes in the dihedral angles CA1sNsCsO to 113°

and C1sCA1sNsC to 167°. The characteristic bond lengths in the chromophore again indicate a phenolate structure; they differ by less than 0.01 Å from those of protocol i. Around the chromophore (within 5 Å), the largest displacements caused by the nonelectrostatic refinement involve atom NZ of Lys70 (0.2 Å) and the oxygen atom of crystal water W44 (0.1 Å). The only hydrogen bond distances (see Table 3) that differ substantially from 2VAD (by more than 0.1 Å) are those from atom NZ of Lys70 to atoms OE1 of Glu148 (0.17 Å), O of W44 (0.24 Å), and CB2 of Crq66 (0.16 Å). These differences are visualized in Figure 3, where one can see the overall similarity between the structures from quantum refinement and the 2VAD model as well as the discrepancies with regard to the QM/MM structure from our previous work.25

15418

J. Phys. Chem. B, Vol. 114, No. 46, 2010

Hsiao et al.

Figure 3. Overlays of cis-acylimine structures (chromophore Crq66, Lys70, and Glu148) from protocol ii: 2VAD (green), nonelectrostatic refinement (pink), electrostatic refinement (purple), and QM/MM optimization (orange) with Glu148- residue.

Further away from the chromophore, there are atomic displacements of more than 0.5 Å, for example, OE2 of Glu39. Glu39 is a surface residue located on a loop, and its side chain extends outward away from the protein β-sheet barrels, where the X-ray reflection data contain more uncertainty since surface residues can assume many conformations. Therefore, it does not seem meaningful to compare the exact positions of their side chains. In the electrostatic refinement, the largest displacements in the chromophore are again found for N of Crq66, 0.51 Å, and O of Phe65, 0.39 Å. The dihedral angles CA1sNsCsO and C1sCA1sNsC change to 106° and 170°, respectively, while CD2sCG2sCA2sN2 remains small and practically unchanged. The chromophore bond lengths again indicate a phenolate structure. Overall, the chromophore structures from the nonelectrostatic and electrostatic quantum refinements are very similar, which is evident from the overlay in the upper panel of Figure 4. In the immediate neighborhood of the chromophore, NZ of Lys70 moves by 0.33 Å in the electrostatic refinement. Lys70 is located close to the chromophore Crq66, and its side chain is flexible to adopt multiple conformations. The motion of NZ increases its distance to Crq66 slightly, as shown in Figure 3, and also affects the distances W44 and Glu148 (see bottom of Table 3), similar as in the nonelectrostatic refinement. The overall structures obtained via protocol ii can be compared in terms of the rmsd of the heavy atoms in the protein. We find that the results are essentially independent of the chosen weights wxref. For structure-0.1 (wxref ) 0.1) and structure-0.9 (wxref ) 0.9) from the electrostatic (nonelectrostatic) refinement, the rmsd is 0.06 (0.05) Å for backbone heavy atoms and 0.10 (0.08) Å for all heavy atoms. In the chromophore moiety, these structures are also quite alike, except for a minor shift in the position of the carbonyl oxygen of Phe65 and the associate dihedral angle CA1sNsCsO (from 99° to 106°) that is observed in the electrostatic refinement. The overall protein structures from the electrostatic and the nonelectrostatic refinement are very similar to each other and also to the 2VAD structure, as indicated by rmsd values for the backbone atoms of 0.04-0.05 Å. It thus appears that the treatment of QM/MM and MM electrostatics (mechanical vs electrostatic embedding) has only a minor influence on protein structures obtained from quantum refinement, presumably because of the presence of X-ray restraints in this procedure.

Figure 4. Overlays of cis-acylimine chromophore structures (Crq66, Phe65, and Ser69) from protocol ii: 2VAD (green), nonelectrostatic refinement (pink), and electrostatic refinement (purple).

Comparison with Previous QM/MM Results. Our previous QM/MM study of DsRed.M125 arrived at a structure of the cisacylimine conformer that differs in several aspects from the 2VAD structure.23 1. The acylimine moiety has a phenolate (QM/MM) instead of a quinoid (2VAD) structure. 2. In the chromophore, the CdO bond is nearly perpendicular to the CA1sNsC plane (QM/MM) rather than coplanar (2VAD), with a dihedral angle CA1sNsCsO close to 90° rather than around 180°. Furthermore, the C1sCA1sNsC moiety is almost planar (QM/MM). 3. In the hydrogen bond and salt bridge network next to the chromophore, there are substantial differences in the positions of some of the residues, especially with regard to atoms NZ of Lys70 and OE1 of Glu148 as well as crystal waters W44 and W123. The latter two are found to be highly mobile in the classical MD simulations performed previously.25 In the following, we discuss these differences in the light of our present results from quantum refinement. We note at the outset that our previous and present work employs QM/MM calculations of a similar kind for structure determination, but there is one important distinction. In our earlier purely theoretical study,25 the setup of the system involved unconstrained classical MD simulations prior to the QM/MM geometry optimizations which led to some rearrangement of mobile residues in the region around the chromophore. In the present work, the protein atoms from the 2VAD structure were held fixed during system setup so that the optimizations during quantum refinement started essentially from the 2VAD geometry. Concerning the differences in chromophore bond lengths (see point 1 above) there are two limiting structures for the cis-acylimine moiety that correspond to the phenolate and quinoid resonance forms. Contrary to the 2VAD assignment and consistent with our previous QM/MM results, quantum refinement favors the phenolate form. Table 2 contains the chromophore bond lengths obtained from quantum refinement

Quantum Refinement of Protein Structures protocols i and ii as well as the data from the present QM/MM calculations without X-ray restraints (using either mechanical or electrostatic embedding). The bond lengths from protocol i with X-ray restraints are nearly identical to those from calculations without restraints, while there are minor differences of typically 0.01 Å for protocol ii. In all four sets of results (Table 2), the C1sCA1 and NsC bonds are longer than in the 2VAD structure by 0.04 Å or more, whereas the CA1sN bond is shorter by 0.02 Å. Hence, regardless of the inclusion of the X-ray constraints or not, all present results differ significantly from the 2VAD assignment and indicate a phenolate structure in the chromophore. We conclude that medium resolution X-ray data by themselves are not able to discern bond differences of less than 0.05 Å in our system. Therefore, this is an example where the inclusion of reliable QM information (here at the BP86/6-31G* level of DFT) in the quantum refinement is essential for the determination of the chromophore structure. Concerning the dihedral angles (see point 2 above), our present results (Table 2) also agree with those from our previous QM/MM study25 and are incompatible with 2VAD. All quantum refinements and all QM/MM calculations confirm that the CdO bond of the acylimine is almost perpendicular to the CA1sNsC plane (CA1sNsCsO in the range 87-113°, compared with 171° in 2VAD). We note in this context that a dihedral angle of 51° has been found in the case of methyleneformamide (CH2NCHO) which turns planar upon protonation of nitrogen, indicating that the lone pair on nitrogen can compete with π conjugation for the interaction with the carbonyl group.47 Protocol i gives basically the same results for the torsion angle C1sCA1sNsC (169-173°, see Table 2) in all calculations (with or without restraints, nonelectrostatic or electrostatic refinement), while the scatter is somewhat larger for the torsion angles CA1sNsCsO and CD2sCG2sCA2sN2 where the differences between the different quantum refinements range up to 19° and 9°, respectively. For protocol ii, the dihedral angle C1sCA1sNsC is again always close to 180° and thus clearly different from the 2VAD value of 110°, whereas the dihedral angles CA1sNsCsO (CD2sCG2sCA2sN2) from the QM/ MM optimizations are generally somewhat smaller (larger) than those from the quantum refinements. As a further check, we performed a simulated annealing refinement using CNS and the raw data from the PDB (2VAD) to obtain the CNS-refined cis-acylimine structure. Starting from the CNS input file template (anneal.inp), the torsion-angle MD and slow-cooling annealing schemes were chosen for the refinement. Default values were generally used for the CNS input parameters, except that we did not vary the B factors to be consistent with the other calculations of this work. A total of 100 trials of different initial velocities was done with MLF as the refinement target. wxref was determined automatically by the CNS program during the MD run, which led to weights between 0.3 and 0.4. For the structure with the lowest Rfree value, the rmsd for all heavy atoms was 0.16 Å compared to 2VAD. Its torsion angles are CA1sNsCsO ) 104°, C1sCA1sNsC ) 168°, and CD2sCG2sCA2sN2 ) 2°. These values are consistent with those obtained by the present quantum refinements and QM/MM calculations. The CNS-refined bond lengths are, however, close to the original 2VAD assignment, which is as expected, since the X-ray raw data can only discern bond length differences larger than 0.05 Å at this resolution, and the CNS MM parameters are not optimized for this purpose. The results from the CNS simulated annealing thus support the revised dihedral angles of the cis-acylimine chromophore.

J. Phys. Chem. B, Vol. 114, No. 46, 2010 15419 Concerning the hydrogen bond and salt bridge network next to the chromophore (see point 3 above), the present results from quantum refinement are rather close to the 2VAD assignments, while there are substantial differences from the present QM/ MM results and even larger ones from the previous QM/MM data.25 Table 3 lists selected relevant interatomic distances in this region (for the set of atom pairs covered in Table 1 of our earlier QM/MM study25). Boldface numbers indicate differences of more than 0.1 Å from the 2VAD model (encountered in 2 out of 17 cases for quantum refinement and in at least 10 cases for QM/MM). Compared with the 2VAD model, the mean absolute deviations (MADs) of these distances are around 0.05 Å for the quantum refinement structures, around 0.2 Å for the present QM/MM structures and around 0.3 Å for the previous QM/MM optimizations where the starting geometry had been taken from an MD snapshot.25 In the structures obtained from quantum refinement, only distances involving Lys70 differ considerably from the 2VAD model. Lys70 interacts with the buried charges of Glu215 and Glu148 and influences the final maturation step of DsRed,19 and the determination of its position with respect to other residues may thus be important for understanding the maturation mechanism. The NZ Lys70-O W44 distance is increased from 2.59 Å in 2VAD to 2.7-2.9 Å both in the quantum refinements and the QM/MM calculations (Table 3). On the other hand, the NZ Lys70-OE1 Glu148 distance is always smaller than the 2VAD value of 3.57 Å: the decrease is moderate (0.2-0.3 Å) both in the nonelectrostatic and the electrostatic quantum refinement structures which are quite similar to each other (see Figure 3), but much more pronounced in the QM/MM geometry optimizations with a Glu148- residue that give short distances of 2.60-2.64 Å in the present and previous25 work. Intuitively, one would indeed expect that the flexible side chains of Lys70+ and Glu148- would move toward each other to form a stable salt bridge (as in the QM/MM calculations). According to the 2VAD and the quantum refinements, this is not the case, and it thus seems likely that Glu148 is actually protonated at OE1 in the crystal so that the electrostatic interactions are weakened and the distance to NZ of Lys70 remains large. To test this hypothesis, we have carried out quantum refinements as well as QM/MM calculations with a protonated Glu148 residue which give NZ Lys70-OE1 Glu148 distances of 3.33-3.47 Å, similar to the quantum refinements with Glu148- and only slightly shorter than the 2VAD value. This example demonstrates that the QM/MM optimized structures can be (and generally will be) quite sensitive to the assumed protonation states of titratable residues such as Glu, whereas quantum refinement is much more robust in such a situation because the restraints from the X-ray data are apparently sufficient to maintain the correct distance. Therefore, quantum refinement with proper X-ray restraints seems capable of producing realistic structures even if the chosen starting model contains residues with an unrealistic protonation state. By the same token, the combined use of quantum refinement and QM/MM calculations can help in the correct assignment of protonation states. The largest discrepancy between the 2VAD structure and our previous QM/MM structure25 concerns the position of W123, as indicated by the OE1 Glu148-O W123 distances of 4.95 and 3.04 Å, respectively (Table 3). All present quantum refinements confirm the large 2VAD value. As mentioned above, our previous QM/MM study involved a deprotonated Glu148residue and employed unconstrained classical MD runs in the setup phase, with the result that W123 turned out to be quite mobile and moved close to Glu148- to establish a strong ionic

15420

J. Phys. Chem. B, Vol. 114, No. 46, 2010

Hsiao et al.

TABLE 4: Refinement of the Anionic trans-Peptide Form: Lowest Rfree Values, Corresponding wxref Weights, Lowest πfπ* Transition Energies, and Associated Oscillator Strengths (DFT/MRCI) from All Protocols (See Text); Eexp ) 2.58 eV protocol

wxref

Rfree

Eπfπ* (eV)

osc. str.

2VAD (trans-peptide) nonelectrostatic (i) nonelectrostatic (ii) electrostatic (i) electrostatic (ii) QM/MM (i) QM/MM (ii)

0.6 0.8 0.1 0.6 0.0 0.0

0.2122 0.2121 0.2156 0.2123 0.2153 0.2124 0.2677

2.37 2.63 2.62 2.61 2.62 2.58 2.62

1.117 1.191 1.054 1.203 1.205 1.189 1.201

Refining DsRed.M1 Containing the trans-Peptide Chromophore. The lowest Rfree values and the corresponding wxref weights from all the refinement protocols are collected in Table 4, together with the lowest πfπ* transition energies and the corresponding oscillator strengths. The starting structure for all refinements was the trans-peptide conformer from the 2VAD model. Protocol i. The structures obtained from the nonelectrostatic and electrostatic quantum refinements are very similar to each other and also close to the 2VAD structure. This is obvious from the overlays shown in Figure 5. The relevant bond lengths and torsion angles in the chromophore are listed in Table 5. The bond lengths generally do not differ much from the 2VAD values (typically by 0.01-0.02 Å, and always by less than 0.05 Å), and there are also only minor changes in the torsion angles (typically by a few degrees, and always by less than 20°). In the nonelectrostatic (electrostatic) refinement, the rmsd of the Phe65-Crq66-Ser69 moiety relative to the 2VAD structure is 0.10 (0.16) Å, and the largest atomic displacement in the chromophore amounts to 0.25 (0.15) Å for N of Crq66. Compared with 2VAD, the most obvious structural differences occur for the side chain of Gln66 (see Figure 5): these are most pronounced in the case of the electrostatic refinement where atoms CB1 and CG1 of Gln66 move by 0.53 and 0.44 Å, and the dihedral angle C1sCA1sCB1sCG1 changes from 144° to 178°.

hydrogen bond that was maintained in the subsequent QM/MM geometry optimization. In the present QM/MM work, the MD runs during setup are constrained (keeping the protein atoms from the 2VAD structure fixed) so that W123 remains far away from the Glu148 side chain (regardless whether it is anionic or neutral) and stays in its local environment (with hydrogen bonds to Val195, Ser146, and Ser197). The present QM/MM optimizations thus give OE1 Glu148-O W123 distances of 4.64 Å with deprotonated Glu148- and 4.84 Å with protonated Glu148, reasonably close to the values of 4.85-4.95 Å from the 2VAD and quantum refinements. We find in general that there is no crystal water molecule that moves very much away from its position in 2VAD during quantum refinement.

Figure 5. Overlays of trans-peptide chromophore structures (Crq66, Phe65, and Ser69) from protocol i: left, 2VAD (green) and nonelectrostatic refinement (pink); middle, 2VAD (green) and electrostatic refinement (purple); right, nonelectrostatic and electrostatic refinement.

TABLE 5: Comparison of Selected Bond Lengths (Å) and Torsion Angles (deg) in the trans-Peptide Chromophore: 2VAD Literature Values,23 Results from the Present Nonelectrostatic and Electrostatic Quantum Refinements Using Protocols i and ii, and Values from the Present QM/MM Optimizations with These Protocols (See Text) protocol

C1sCA1

CA1sN

NsC

CdO

2VAD nonelectrostatic (i)

1.514 1.492 1.515 1.484 1.513 1.518 1.524 1.502 1.516

1.418 1.439 1.461 1.432 1.471 1.450 1.459 1.438 1.468

1.330 1.352 1.352 1.349 1.361 1.364 1.366 1.357 1.365

1.223 1.235 1.242 1.234 1.242 1.237 1.239 1.232 1.241

nonelectrostatic (ii) electrostatic (i) QM/MM (i) electrostatic (ii) QM/MM (ii)

with restraint no restraint with restraint no restraint with restraint no restraint with restraint no restraint

protocol

CA1sNsCsO

C1sCA1sNsC

CD2sCG2sCA2sN2

2VAD nonelectrostatic (i)

4.0 19.0 5.5 20.7 5.0 5.6 0.6 23.2 0.2

133.7 149.1 136.4 152.3 137.1 141.3 138.1 153.5 136.9

1.0 1.0 0.2 0.5 3.6 9.1 14.7 3.9 3.3

nonelectrostatic (ii) electrostatic (i) QM/MM (i) electrostatic (ii) QM/MM (ii)

with restraint no restraint with restraint no restraint with restraint no restraint with restraint no restraint

Quantum Refinement of Protein Structures

Figure 6. Overlays of trans-peptide chromophore structures (Crq66, Phe65, and Ser69) from protocol ii: top, nonelectrostatic (pink) and electrostatic (purple) refinement; bottom, 2VAD (green), QM/MM (orange), nonelectrostatic and electrostatic refinement.

Protocol ii. The structures determined by quantum refinement using protocol ii are again similar to the 2VAD structure and to those obtained via protocol i (see Figure 6 and Table 5). In the nonelectrostatic (electrostatic) refinement, the rmsd of the Phe65-Crq66-Ser69 moiety relative to the 2VAD structure is 0.10 (0.11) Å, and the largest atomic displacement in the chromophore is 0.31 (0.31) Å for N of Crq66. There are again some structural differences in the side chain of Gln66 (see Figure 6), mostly in the electrostatic refinement, but they are less pronounced than those seen in protocol i. The hydrogen bonding network near the chromophore is found to be similar in the trans-peptide and the cis-acylimine forms of the chromophore (see above). DFT/MRCI. DFT/MRCI calculations were performed to assess the refined structures by comparison of the computed πfπ* transition energies ∆Eπfπ* with the experimental values. The results for the cis-acylimine and trans-peptide forms of the chromophore are collected in Tables 1 and 4, respectively. Using the original 2VAD model of the chromophore, DFT/ MRCI predicts ∆Eπfπ* ) 1.88 eV for the cis-acylimine chromophore. Adding a solvating water sphere around 2VAD and including the point charges from these water molecules increases the excitation energy slightly to 1.92 eV. By contrast, using the structures from the quantum refinements or from the QM/MM calculations yields DFT/MRCI results between 2.13 and 2.25 eV, which are close to the experimental value of 2.23 eV and lie well within the usual error bar (0.2 eV) of the DFT/ MRCI method (Table 1). The main differences between the underlying chromophore structures of the cis-acylimine form concern the bond lengths (phenolate vs quinoid) and the orientation of the CdO bond (perpendicular vs coplanar to the CA1sNsC moiety). The latter distinction is expected to be relevant in the present context because a nearly coplanar

J. Phys. Chem. B, Vol. 114, No. 46, 2010 15421 carbonyl group (as in 2VAD) extends the electronic conjugation in the cis-acylimine and will thus lead to lower excitation energies. On the other hand, a nearly perpendicular carbonyl group (as in the other refined structures, CA1sNsCsO dihedral angles between 87° and 113°) will not allow for such extended conjugation and thus cause a blue shift of the excitation energy. These qualitative expectations are confirmed by the plots of the highest occupied molecular orbital (HOMO) of the cisacylimine chromophore at geometries taken from 2VAD and from quantum refinement (see Figure 7, left side). Having rationalized the structural origin of the differences in the DFT/ MRCI excitation energies, we conclude that the chromophore structures from the present quantum refinements and QM/MM calculations are more realistic than the 2VAD structure because of the much better agreement between the computed and experimental excitation energy. The differences in the bond lengths (phenolate vs quinoid) should also affect the excitation energies, but to a lesser extent. In a test calculation, we modified the cis-acylimine structure obtained from the electrostatic refinement by changing the bond lengths within the C1sCA1sNsCsO moiety to the corresponding 2VAD values while retaining the dihedral angles: this lowered the DFT/MRCI excitation energy from 2.21 to 2.14 eV, that is, in the direction of the 2VAD value of 1.88 eV but still quite far away from it. Likewise, slight deviations from the planarity of the CD2sCG2sCA2sN2 moiety do not affect the computed ∆Eπfπ* values much: this can be seen from their small spread among the various quantum refinement and QM/ MM structures (Table 1) with dihedral angles CD2sCG2s CA2sN2 between 0 and 11° (Table 2). The DFT/MRCI excitation energies thus indeed seem to depend most strongly on the dihedral angle CA1sNsCsO which governs the extent of conjugation in the cis-acylimine. Turning to the trans-peptide form (Table 4), the calculated excitation energy is 2.37 eV at the 2VAD geometry and ranges between 2.58 and 2.63 eV at the other geometries. The latter results are in excellent agreement with the experimental value of 2.58 eV. The various structures for the trans-peptide are generally qualitatively similar (see Figures 5 and 6), and therefore the DFT/MRCI excitations differ less than in the case of the cis-acylimine. The lower value at the 2VAD geometry is presumably related to the somewhat shorter CA1sN and NsC bonds (Table 5). Experimentally, when going from the cisacylimine to the trans-peptide form, the first absorption is blueshifted by 0.35 eV which is well-reproduced by the DFT/MRCI results. This shift is again rationalized by the length of the conjugated system which is more extended in the cis-acylimine form because of the presence of the CdN moiety. In summary, the lowest πfπ* transition energy from DFT/ MRCI calculations at the structures obtained from quantum refinements and QM/MM optimizations agree very well with experiment, which indicates that these chromophore structures are realistic. Conclusions We have implemented quantum refinement in the modular QM/MM software ChemShell to enable improved structure determination of proteins and other biomacromolecules, by allowing for a QM rather than an MM description of the crucial regions of the molecule (e.g., an inhibitor in an active site or a chromophore). The full QM/MM functionality of ChemShell is available for quantum refinement including mechanical and electronic QM/MM embedding. Both of these schemes have been explored in an initial application to the red fluorescent

15422

J. Phys. Chem. B, Vol. 114, No. 46, 2010

Hsiao et al.

Figure 7. Highest occupied molecular orbital of the cis-acylimine (left) and trans-peptide (right) chromophores, computed at the 2VAD geometry (upper panel) and at the structure obtained from electrostatic refinement using protocol ii (lower panel).

protein DsRed.M1 using two protocols i and ii which target the determination of (i) only the chromophore structure and (ii) the whole protein structure. In this initial application, both schemes and both protocols lead to rather similar and mutually consistent results in general so that it would seem premature to favor one particular variant. Both conformers of the anionic chromophore of the DsRed.M1 protein were investigated using quantum refinement and pure QM/MM methods. The results were compared with the original crystallographic refinement (2VAD)23 and with our previous QM/MM study.25 These earlier investigations provide compatible structures for the trans-peptide conformer which are confirmed by the present work, except for some minor details with regard to the Gln66 side chain. In the case of the cisacylimine conformer, the previous studies differ in the proposed chromophore structure and also in the hydrogen bonding network around the chromophore. The present DFT-based quantum refinements and DFT/MM calculations on the cisacylimine conformer favor the chromophore structure from our previous QM/MM study and a hydrogen bonding network similar to 2VAD. The currently favored structures are further supported by the excellent agreement of the experimentally observed excitation energies with the values computed by DFT/ MRCI at these geometries. To be more specific with regard to the chromophore, the various quantum refinements and QM/MM calculations as well as simulated annealing by CNS yield an acylimine moiety with the carbonyl group almost perpendicular to the CA1sNsC plane, with the dihedral angle CA1sNsCsO ranging between

87° and 113°, so that the acylimine structure is actually an intermediate between the cis and the trans forms. According to the quantum refinement and QM/MM results, the acylimine chromophore has a phenolate rather than a quinoid structure. The lowest DFT/MRCI excitation energy at such geometries agrees well with the experiment, mainly because the CA1sNs CsO torsion limits the length of the conjugated system and causes a significant blue shift compared with the alternative coplanar CA1sNsCsO situation (2VAD). The computed πfπ* transition energy is less sensitive to moderate variations in the dihedral angles C1sCA1sNsC and CD2sCG2s CA2sN2 and also with regard to bond length alternations toward a quinoid structure. Concerning the hydrogen bonding network around the acylimine chromophore, the quantum refinements confirm the 2VAD assignments regardless of whether Glu148 is assumed to be deprotonated (anionic Glu148-) or protonated (neutral Glu148). By contrast, depending on the setup conventions, QM/ MM calculations may lead to different structures with deprotonated and protonated Glu148: our previous QM/MM optimization25 with Glu148- gave a Lys70-Glu148 salt bridge which was not formed in the present QM/MM calculations with neutral Glu148. Given the fact that the latter situation is preferred in all variants of quantum refinement, it seems likely that Glu148 is actually protonated in the crystal. In summary, quantum refinement is a combined experimental and theoretical approach which describes the special site of a macromolecule using a QM rather than an MM method. It should thus be intrinsically more accurate than a conventional

Quantum Refinement of Protein Structures crystallographic refinement since it avoids the errors caused by inadequate MM parameters. Quantum refinement is a robust procedure because of the use of X-ray data restraints, which may help to overcome modeling problems, for example, in the assignment of the protonation states of titratable residues in QM/ MM studies. Hence, quantum refinement has significant advantages both over conventional refinements and over purely theoretical QM/MM calculations and thus offers great potential for determining realistic structures of biomacromolecules. Acknowledgment. This work was supported by the Max Planck Society and the Volkswagenstiftung. References and Notes (1) The Research Collaboratory for Structural Bioinformatics (RCSB) Protein Data Bank. http://www.pdb.org (accessed Sept 29, 2010). (2) Ryde, U.; Olsen, L.; Nilsson, K. J. Comput. Chem. 2002, 23, 1058. (3) Senn, H. M.; Thiel, W. Angew. Chem., Int. Ed. 2009, 48, 1198. (4) Hsiao, Y.-W.; Drakenberg, T.; Ryde, U. J. Biomol. NMR 2005, 31, 97. (5) Hsiao, Y.-W.; Tao, Y.; Shokes, J. E.; Scott, R. A.; Ryde, U. Phys. ReV. B: Condens. Matter Mater. Phys. 2006, 74, 214101/1. (6) Ryde, U.; Hsiao, Y.-W.; Rulisˇek, L.; Solomon, E. I. J. Am. Chem. Soc. 2007, 129, 726. (7) Hsiao, Y.-W.; Ryde, U. Inorg. Chim. Acta 2005, 359, 1081. (8) Yu, N.; Yennawar, H. P.; Merz, K. M., Jr. Acta Crystallogr., Sect. D: Biol. Crystallogr. 2005, 61, 322. (9) Yu, N.; Hayik, S. A.; Wang, B.; Liao, N.; Reynolds, C. H.; Merz, K. M., Jr. J. Chem. Theory Comput. 2006, 2, 1057. (10) Li, X.; He, X.; Wang, B.; Merz, K., Jr. J. Am. Chem. Soc. 2009, 131, 7742. (11) Li, X.; Hayik, S. A.; Merz, K. M., Jr. J. Inorg. Biochem. 2010, 104, 512. (12) Brunger, A. T.; Adams, P. D.; Clore, G. M.; Delano, W. L.; Gros, P.; Grosse-Kunstleve, R. W.; Jiang, J.-S.; Kuszewski, I.; Niges, M.; Pannu, N. S.; Read, R. J.; Rice, L. M.; Simonson, T.; Warren, G. L. Crystallography & NMR system CNS, Version 1.2; Yale University: New Haven, CT, 2000. (13) Case, D.; Darden, T.; Cheatham, T. E. I.; Simmerling, C.; Wang, J.; Duke, R.; Luo, R.; Merz, K.; Wang, B.; Pearlman, D. A.; Crowley, M.; Brozell, S.; Tsui, V.; Gohlke, H.; Mongan, J.; Hornak, V.; Cui, G.; Beroza, P.; Schafmeister, C.; Caldwell, J. W.; Ross, W. S.; Kollman, P. A. AMBER 8; University of California: San Francisco, CA, 2004. (14) Sherwood, P.; de Vries, A. H.; Guest, M. F.; Schreckenbach, G.; Catlow, C. R. A.; French, S. A.; Sokol, A. A.; Bromley, S. T.; Thiel, W.; Turner, A. J.; Billeter, S.; Terstegen, F.; Thiel, S.; Kendrick, J.; Rogers, S. C.; Casci, J.; Watson, M.; King, F.; Karlsen, E.; Sjovoll, M.; Fahmi, A.; Schäfer, A.; Lennartz, C. THEOCHEM 2003, 632, 1. (15) ChemShell, a Computational Chemistry Shell. http://www. chemshell.org (accessed Sept 29, 2010). (16) Ahlrichs, R.; Bär, M.; Häser, M.; Horn, H.; Kölmel, C. Chem. Phys. Lett. 1989, 162, 165. (17) Smith, W.; Forester, T. R. J. Mol. Graphics 1996, 14, 136. (18) MacKerell, A. D. J.; Bashford, D.; Bellott, M.; Dunbrack, R. L.; Evanseck, J. D.; Field, M. J.; Fischer, S.; Gao, J.; Guo, H.; Ha, S.; JosephMcCarthy, D.; Kuchnir, L.; Kuczera, K.; Lau, F. T. K.; Mattos, C.;

J. Phys. Chem. B, Vol. 114, No. 46, 2010 15423 Michnick, S.; Ngo, T.; Nguyen, D. T.; Prodhom, B.; Reiher, W. E. I.; Roux, B.; Schlenkrich, M.; Smith, J. C.; Stote, R.; Straub, J.; Watanabe, M.; Wiorkiewicz-Kuczera, J.; Yin, D.; Karplus, M. J. Phys. Chem. B 1998, 102, 3586. (19) Yarbrough, D.; Wachter, R. M.; Kallio, K.; Matz, M. V.; Remington, S. J. Proc. Natl. Acad. Sci. U.S.A. 2001, 98, 462. (20) Tsien, R. Y. ReV. Biochem. 1998, 67, 509. (21) Gross, L. A.; Baird, G. S.; Hoffman, R. C.; Baldridge, K. K.; Tsien, R. Y. Proc. Natl. Acad. Sci. U.S.A. 2000, 97, 11990. (22) Boye, S.; Brondsted Nielsen, S.; Krogh, H.; Bloch Nielsen, I.; Pedersen, U. V.; Bell, A. F.; He, X.; Tonge, P. J.; Andersen, L. H. Phys. Chem. Chem. Phys. 2003, 5, 3021. (23) Strongin, D. E.; Bevis, B.; Khuong, N.; Downing, M. E.; Strack, R. L.; Sundaram, K.; Glick, B. S.; Keenan, R. J. Protein Eng., Des. Sel. 2007, 20, 525. (24) Baird, G.; Zacharias, D. A.; Tsien, R. Y. Proc. Natl. Acad. Sci. U.S.A. 2000, 97, 11984. (25) Sanchez-Garcia, E.; Doerr, M.; Hsiao, Y.-W.; Thiel, W. J. Phys. Chem. B 2009, 113, 16622. (26) Brunger, A. T.; Adams, P. D. Acc. Chem. Res. 2002, 35, 404. (27) Brooks, B. R.; Bruccoleri, R. E.; Olafson, B. D.; States, D. J.; Swaminathan, S.; Karplus, M. J. Comput. Chem. 1983, 4, 187. (28) Becke, A. D. Phys. ReV. A 1988, 38, 3098. (29) Perdew, J. P. Phys. ReV. B 1986, 33, 8822. (30) Hehre, W. J.; Ditchfield, R.; Pople, J. A. J. Chem. Phys. 1972, 56, 2257. (31) Hersleth, H.-P.; Hsiao, Y.-W.; Ryde, U.; Görbitz, C. H.; Andersson, K. K. Biochem. J. 2008, 412, 257. (32) Brunger, A. T. Nature 1992, 355, 472. (33) Murshudov, G. N.; Vagin, A. A.; Dodson, E. J. Acta Crystallogr., Sect. D: Biol. Crystallogr. 1997, 53, 240. (34) Bakowies, D.; Thiel, W. J. Phys. Chem. 1996, 100, 10580. (35) Jorgensen, W. L.; Chandrasekhar, J.; Madura, J. D.; Impey, R. W.; Klein, M. L. J. Chem. Phys. 1983, 79, 926. (36) de Vries, A. H.; Sherwood, P.; Collins, S. J.; Rigby, A. M.; Rigutto, M.; Kramer, G. J. J. Phys. Chem. B 1999, 103, 6133. (37) Sherwood, P.; de Vries, A. H.; Collins, S. J.; Greatbanks, S. P.; Burton, N. A.; Vincent, M. A.; Hillier, I. H. Faraday Discuss. Chem. Soc. 1997, 79. (38) Adams, P. D.; Pannu, N. S.; Read, R. J.; Brunger, A. T. Proc. Natl. Acad. Sci. U.S.A. 1997, 94, 5018. (39) VMD-Visual Molecular Dynamics. http://www.ks.uiuc.edu/Research/ vmd/ (accessed Sept 29, 2010). (40) Grimme, S.; Waletzke, M. J. Chem. Phys. 1999, 111, 5645. (41) Silva-Junior, M. R.; Schreiber, M.; Sauer, S. P. A.; Thiel, W. J. Chem. Phys. 2008, 129, 104103. (42) Becke, A. D. J. Chem. Phys. 1993, 98, 1372. (43) Schäfer, A.; Huber, C.; Ahlrichs, R. J. Chem. Phys. 1994, 100, 5829. (44) Schäfer, A.; Horn, H.; Ahlrichs, R. J. Chem. Phys. 1992, 97, 2571. (45) Gerenkamp, M. Entwicklung und Anwendung quantenchemischer Methoden zur Berechnung komplexer chemischer Systeme. Ph.D. thesis, Universita¨t Mu¨nster, Mu¨nster, Germany, 2005. (46) Parac, M.; Doerr, M.; Marian, C. M.; Thiel, W. J. Comput. Chem. 2010, 31, 90. (47) Wiberg, K. B.; Rablen, P. R.; Marquez, M. J. Am. Chem. Soc. 1992, 114, 8654.

JP108095N