pubs.acs.org/JPCL

Influence of Experimental Uncertainties on the Properties of Ensembles Derived from NMR Residual Dipolar Couplings

R. Bryn Fenwick,† Santi Esteban-Martín,† and Xavier Salvatella*,†,‡

†Institute for Research in Biomedicine Barcelona, Baldiri Reixac 10-12, 08028 Barcelona, Spain, and ‡ICREA, Barcelona, Spain

ABSTRACT Ensemble simulations restrained by residual dipolar couplings (RDCs) measured by NMR have in recent years emerged as a powerful strategy for the experimental characterization of the structural heterogeneity of macromolecules. Given the wide range of potential applications of this approach, it is important to determine whether the resulting ensembles are affected by unavoidable experimental uncertainties in the measurement of RDCs. Using the well-characterized protein ubiquitin as a model system, we have assessed the effect of experimental error and conclude that the structural heterogeneity in solution can be accurately characterized when experimental uncertainties in the NH RDCs are lower than 0.5 Hz. Our results indicate that determining native ensembles using ensemble simulations restrained by this NMR parameter is a robust procedure.

SECTION Dynamics, Clusters, Excited States

Over the past decade our understanding of the relationship between the structure, dynamics, and function of biological macromolecules has improved significantly.1 This has occurred as a result of the availability of experiments that report on the amplitude of structural fluctuations at specific sites,2 improvements in the quality of molecular mechanics force fields,3 and greater integration of experiments and molecular simulations.4 Residual dipolar couplings (RDCs) measured using NMR spectroscopy are useful probes of molecular dynamics because they average on relatively long time scales and can be reliably predicted from conformational ensembles.5 RDCs depend on the orientation of bond vectors in a molecular frame defined by the alignment of the macromolecule with respect to the magnetic field and, under the assumption that alignment is constant, can be used to analyze the structural heterogeneity of macromolecules. This can be accomplished by using RDCs, which can be measured in large numbers and for a wide range of bond vectors, to bias molecular simulations. The impact of the constant alignment assumption6,7 as well as that of structural noise8 on descriptions of molecular dynamics has been analyzed,9,10 but that of experimental uncertainties (σ) remains unexplored. To address this, we have used a molecular dynamics simulation of ubiquitin as a reference ensemble11 to compute up to nine sets of synthetic NH RDCs. The first alignment was determined by a singular value decomposition (SVD) fit of the reference ensemble to the RDCs measured in stretched polyacrylamide gel by Ruan and Tolman,12 and the remaining sets were obtained using the Euler rotations used by Meiler et al.13 (see Supporting Information). The tensors were assumed to remain constant in the time scale sampled by

© 2010 American Chemical Society

RDCs, i.e., their fluctuations were assumed to be uncorrelated with the molecular dynamics under analysis.6,7,10 We tested nine sets, which leads to some overdetermination of the system and to an improvement of the signal-to-noise ratio, as well as a random subset of five sets, where the system is not overdetermined (results using five sets of RDCs are qualitatively similar and are presented in the Supporting Information). These RDCs were used as restraints (see ref 11) to determine ensembles with increasing statistical noise. This powerful approach is similar to that used to thoroughly assess the effect of experimental and simulation parameters on the resulting structural ensembles.11,14

We modeled experimental uncertainty (σ) as random Gaussian noise, which was added to the RDCs back-calculated from the reference ensemble. To cover the range of experimental errors reported in the literature, we tested errors between 0.1 and 1.0 Hz on typical NH RDCs with $D_{\mathrm{max}}$ ≈ 10.0 Hz, which allowed us to explore errors from 1% to 10%. We found that, for low values of σ, the restrained simulations improved the agreement with the RDCs of the reference ensemble from Q = 0.46 to Q = 0.08 (Figure 1). Values of σ larger than 0.5 Hz compromised the ability of the simulations with nine sets to yield ensembles with low violations (for five sets, Q shows a marked deterioration with σ larger than 0.3 Hz). We note that the agreement (Q = 0.09) for ensembles determined using values of σ = 0.3 Hz is comparable to that obtained using real experimental data with an error

Received Date: October 1, 2010
Accepted Date: November 10, 2010
Published on Web Date: November 19, 2010

DOI: 10.1021/jz101358b | J. Phys. Chem. Lett. 2010, 1, 3438–3441

Figure 1. Analysis of the ability of ensemble simulations restrained by NH RDCs with increasing values of synthetic experimental error (σ) to minimize violations and generate accurate average structures. The agreement with the pseudoexperimental RDCs is expressed as $Q_{\mathrm{RDC}} = \mathrm{rms}(D_{\mathrm{exp}} - D_{\mathrm{calc}})/\mathrm{rms}(D_{\mathrm{calc}})$, and the accuracy of the average structure is expressed as rmsd with respect to the average structure of the reference ensemble.
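The noise model described in the text and the Q factor defined in the Figure 1 caption can be illustrated with a minimal Python sketch. The function names, the random seed, and the uniformly distributed placeholder RDCs are assumptions made for illustration only; they are not data or code from the paper. The Q printed here compares noisy RDCs with the noise-free reference values, so it only indicates the size of the perturbation and is not the Q of the restrained ensembles reported in Figure 1.

```python
import numpy as np

def q_factor(d_exp, d_calc):
    """Q = rms(D_exp - D_calc) / rms(D_calc), following the Figure 1 caption."""
    d_exp = np.asarray(d_exp, dtype=float)
    d_calc = np.asarray(d_calc, dtype=float)
    rms_diff = np.sqrt(np.mean((d_exp - d_calc) ** 2))
    rms_calc = np.sqrt(np.mean(d_calc ** 2))
    return rms_diff / rms_calc

def add_gaussian_noise(d_ref, sigma, rng):
    """Return synthetic 'experimental' RDCs: reference values plus N(0, sigma) noise in Hz."""
    d_ref = np.asarray(d_ref, dtype=float)
    return d_ref + rng.normal(0.0, sigma, size=d_ref.shape)

rng = np.random.default_rng(0)
# Hypothetical reference RDCs (Hz) spanning about +/-10 Hz; placeholder values only.
d_ref = rng.uniform(-10.0, 10.0, size=500)

for sigma in (0.0, 0.1, 0.3, 0.5, 1.0):
    d_noisy = add_gaussian_noise(d_ref, sigma, rng)
    q = q_factor(d_noisy, d_ref)
    print(f"sigma = {sigma:.1f} Hz  Q(noisy vs reference) = {q:.3f}")
```

Because the noise is additive and independent of the coupling size, the same σ corresponds to a larger relative error for alignment media that produce smaller couplings.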

of ∼0.3 Hz.4 The ability of the restrained simulations to reproduce the average structure of the reference ensemble was also assessed (Figure 1). The effect of experimental uncertainty was correlated with the ability to minimize the violations of the RDCs, i.e., higher values of σ led to average structures that deviated from the average structure of the reference ensemble.

Since RDCs encode the structural heterogeneity of bond vectors, they are particularly valuable for the study of protein dynamics in the submillisecond time scale.5 Most often, dynamics are reported in terms of order parameters ($S^2$) and are related to biological function. Therefore, we analyzed the level of agreement of the NH order parameters computed from the ensembles derived from the RDCs with those computed from the reference ensemble. As shown in Figure 2, the restrained simulations provide, in the absence of experimental uncertainties, an accurate description of the dynamics of the bond vectors in the reference ensemble (correlation coefficient ρ = 0.98).

In ensemble simulations restrained by RDCs, the violations between calculated and experimental RDCs are minimized by optimizing the five independent elements of the alignment tensor simultaneously with the coordinates of each ensemble member throughout the simulation; this renders the result of the calculation robust to the presence of structural noise in the model used to obtain the initial estimate of the tensor. Any structural fluctuation that affects all RDCs similarly, such as bond vector librations, is absorbed by the tensor optimization routine. Accordingly, we observed that the order parameters computed from the ensembles obtained by restraining the reference ensemble-derived RDCs were systematically 2% higher than those of the reference ensemble.
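For a fixed set of bond vectors, the five independent tensor elements mentioned above can be estimated by linear least squares. The sketch below is a generic illustration of such an SVD-based fit, not the simultaneous optimization of tensor and coordinates used in the restrained simulations. It assumes the common parametrization in which the traceless symmetric tensor is described by (Syy, Szz, Sxy, Sxz, Syz), and it absorbs the dipolar prefactor into the tensor so that its elements are in Hz. All names and numerical values are hypothetical.

```python
import numpy as np

def design_matrix(vectors):
    """One row per bond vector: coefficients multiplying (Syy, Szz, Sxy, Sxz, Syz)."""
    v = np.asarray(vectors, dtype=float)
    v = v / np.linalg.norm(v, axis=1, keepdims=True)  # normalize to unit vectors
    x, y, z = v[:, 0], v[:, 1], v[:, 2]
    # Tracelessness (Sxx = -Syy - Szz) removes the sixth unknown.
    return np.column_stack([y**2 - x**2, z**2 - x**2, 2*x*y, 2*x*z, 2*y*z])

def fit_tensor(vectors, rdcs, d_max=1.0):
    """Least-squares estimate of the five independent tensor elements."""
    a = d_max * design_matrix(vectors)
    # numpy.linalg.lstsq solves the problem via an SVD of the design matrix.
    s, *_ = np.linalg.lstsq(a, np.asarray(rdcs, dtype=float), rcond=None)
    return s

def back_calculate(vectors, s, d_max=1.0):
    """RDCs predicted from bond vectors and tensor elements."""
    return d_max * design_matrix(vectors) @ np.asarray(s, dtype=float)

rng = np.random.default_rng(1)
vectors = rng.normal(size=(70, 3))                 # hypothetical NH bond vectors
s_true = np.array([3.0, -8.0, 1.5, 2.0, -1.0])     # hypothetical tensor elements (Hz)
rdcs = back_calculate(vectors, s_true) + rng.normal(0.0, 0.3, size=70)  # sigma = 0.3 Hz
print("fitted tensor elements:", np.round(fit_tensor(vectors, rdcs), 2))
```

With roughly 70 couplings and five unknowns the fit is strongly overdetermined, which is why moderate random noise perturbs the fitted elements only slightly.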
As discussed by Clore and Schwieters15 as well as by us,10 this systematic overestimation of the order parameters is due to the lack of knowledge of the degree and direction of alignment of the protein in each of the several alignment media. Emerging algorithms that compute the alignment tensor explicitly from the coordinates of the ensemble members have the potential to alleviate this problem.16,17 We fitted a scaling

Figure 2. NH $S^2$ values for ensembles ($S^2_{\mathrm{ensemble}}$) of ubiquitin compared to the reference $S^2_{\mathrm{RDC}}$ values ($S^2_{\mathrm{reference}}$). Point colors indicate secondary structure type: helix (red), beta-sheet (blue), loops and turns (green). The $S^2$ values were scaled as explained in the text.
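The comparison in Figure 2 involves three operations: computing order parameters from an ensemble, rescaling them, and correlating them with the reference values. The following Python sketch shows one way these could be done. It uses the standard expression for the order parameter of a unit vector averaged over superimposed ensemble members, and it assumes that the scaling is a single multiplicative factor fitted without intercept; the paper does not state the functional form, so that choice, like all names and synthetic data here, is an assumption.

```python
import numpy as np

def order_parameters(vectors):
    """S2 per bond from bond vectors with shape (n_members, n_bonds, 3).

    Assumes the ensemble members were superimposed on a common frame beforehand.
    """
    v = np.asarray(vectors, dtype=float)
    v = v / np.linalg.norm(v, axis=-1, keepdims=True)
    # Ensemble average of the outer product v v^T for each bond: shape (n_bonds, 3, 3).
    m = np.einsum("mbi,mbj->bij", v, v) / v.shape[0]
    # S2 = 3/2 * sum_ij <v_i v_j>^2 - 1/2
    return 1.5 * np.sum(m ** 2, axis=(1, 2)) - 0.5

def fit_scaling(s2_ensemble, s2_reference):
    """Least-squares factor k (no intercept) such that k * s2_ensemble ~ s2_reference."""
    a = np.asarray(s2_ensemble, dtype=float)
    b = np.asarray(s2_reference, dtype=float)
    return float(a @ b / (a @ a))

def pearson(a, b):
    """Pearson correlation coefficient between two sets of order parameters."""
    return float(np.corrcoef(a, b)[0, 1])

rng = np.random.default_rng(2)
# Hypothetical ensemble: 200 members, 70 NH vectors wobbling around fixed directions.
mean_dirs = rng.normal(size=(1, 70, 3))
mean_dirs /= np.linalg.norm(mean_dirs, axis=-1, keepdims=True)
wobble = rng.uniform(0.05, 0.5, size=(1, 70, 1))
ensemble = mean_dirs + wobble * rng.normal(size=(200, 70, 3))

s2_reference = order_parameters(ensemble)
s2_ensemble = 1.02 * s2_reference  # mimic a uniform 2% overestimation
k = fit_scaling(s2_ensemble, s2_reference)
print(f"scaling factor = {k:.4f}, correlation = {pearson(k * s2_ensemble, s2_reference):.3f}")
```

A uniform multiplicative offset leaves the correlation coefficient unchanged, so the scaling affects the absolute deviations between the two sets of order parameters but not their correlation.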

factor between the reference order parameters and the order parameters determined from the 0.0 Hz ensemble. Applying this scaling factor leads to an accurate determination of the order parameters of the reference ensemble, with deviations that are lower than 5% (Figure 2 with σ = 0.0). This result is important, as it indicates that RDCs, when measured accurately, contain sufficient information to describe the structural heterogeneity of native ensembles using restrained ensemble simulations.

To test the impact of experimental uncertainties on the characterization of the structural heterogeneity of proteins, we back-calculated the NH order parameters of ensembles obtained with increasing values of σ (using the correction factor determined above) and compared them to those of the reference ensemble. The results indicate that the agreement between the back-calculated order parameters and those of the reference ensemble is sensitive to experimental uncertainties; ρ decreases from a value of 0.98 for σ = 0 Hz, to 0.85 for σ = 0.5 Hz, and to 0.78 for σ = 1.0 Hz. For σ values similar to experimental uncertainty (i.e., 0.3 Hz),4 the back-calculated order parameters are in good agreement with those of the reference ensemble (ρ = 0.91). Thus, ensemble restraining has robustness equivalent to that of the 3D-GAF model and the iterative DIDC method.18,19 For values of σ recently described in the literature

(