Molecular Modeling of Nucleic Acids - American Chemical Society

Chapter 7

Downloaded by UNIV OF CALIFORNIA SAN DIEGO on February 1, 2016 | http://pubs.acs.org Publication Date: November 26, 1997 | doi: 10.1021/bk-1998-0682.ch007

Conformational Analysis of Nucleic Acids: Problems and Solutions

Andrew N. Lane

Division of Molecular Structure, National Institute for Medical Research, The Ridgeway, Mill Hill, London NW7 1AA, United Kingdom

There are numerous problems associated with the determination of nucleic acid structures in solution using NMR methods. These include spin diffusion, anisotropic rotation, conformational averaging, low density of experimental constraints and lack of long-range information. The difficulties arising from the first two problems can be treated reliably, but new experimental methods are needed for the last two. To deal effectively with conformational averaging, newer computational procedures are required. Approaches to these problems are described, and illustrated with an example of a DNA·RNA hybrid duplex.

Nucleic acids can adopt a wide variety of structures depending on composition, sequence and environmental conditions. This reflects the intrinsic plasticity of oligonucleotides. Not only are the nucleotide units themselves flexible, but there is also a large number of degrees of freedom in the phosphodiester linkage. In addition, in many nucleic acids the base-pairs interact only with their nearest neighbours. Polymers of this kind can be easily deformed by small anisotropic forces such as intercalation by small ligands, or formation of complexes with proteins as exemplified by CAP (1) and SRY (2). Further, one of the many interesting aspects of nucleic acids is the influence of base-pair mismatches on conformation. Studying mismatches is important in understanding DNA mutations, and in understanding the phylogenetically conserved 'mismatches' found in many RNA species. As a rule, mismatches are thermodynamically destabilising (3), which is often associated with increased local flexibility in both DNA and RNA (4,5). Even ordinary duplexes can show significant conformational heterogeneity on the millisecond time scale (6-9). Such behaviour poses a serious challenge to describing the conformational properties of nucleic acids in solution.
In this article, I will summarise the problems of determining solution structures from NMR data and molecular modelling, and outline some outstanding problems that may need to be addressed for a fuller understanding of the functional properties of nucleic acids.

© 1998 American Chemical Society

In Molecular Modeling of Nucleic Acids; Leontis, Neocles B., et al.; ACS Symposium Series; American Chemical Society: Washington, DC, 1997.

Information Content

If a nucleic acid is considered as a rigid body, then there is a fixed number of parameters needed to describe its structure. In terms of torsion angles (i.e. assuming that bond lengths and angles are known, as well as the chemical structure), there are 8n-4 parameters per strand, or 16n-8+6 parameters per duplex to be determined (cf. Structure I). Hence, for a decamer duplex, there would be 158 parameters. As a torsion angle cannot be described by a single distance, the minimum number of NOEs or distances needed is larger than this. Nevertheless, for a well resolved NMR spectrum, one can expect to be able to extract a sufficient number of experimental distance and dihedral constraints such that the problem is at least in principle overdetermined.

[Structure I: chemical structure drawing, not reproduced]
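The parameter count quoted above (8n-4 per strand, 16n-8+6 per duplex) can be checked with a few lines of Python. This is only a sketch of the arithmetic: the function names are illustrative, and reading the "+6" as the rigid-body placement of one strand relative to the other is an assumption, since the text gives only the formula.

```python
def strand_parameters(n: int) -> int:
    """Torsion-angle parameters for one strand of n nucleotides (8n - 4)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return 8 * n - 4


def duplex_parameters(n: int) -> int:
    """Parameters for a duplex of n base pairs (16n - 8 + 6).

    Two strands plus six extra terms; the six are taken here to be the
    relative rotation and translation of the strands (an assumption).
    """
    return 2 * strand_parameters(n) + 6


if __name__ == "__main__":
    for n in (6, 10, 12):
        print(n, strand_parameters(n), duplex_parameters(n))
```

For a decamer, `duplex_parameters(10)` reproduces the 158 parameters stated in the text.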

For example, the conformation of a rigid nucleotide can be specified by 4 parameters, namely the glycosidic torsion angle χ, the angles δ and γ, and a second endocyclic torsion, which together with δ defines the pseudorotation phase angle P and the maximum amplitude of the sugar pucker φm. The sugar pucker parameters can be determined from analysis of two or more coupling constants, which in favourable cases can be measured from various COSY variants (and see below). There are also two intraring proton-proton dipolar interactions in DNA that vary significantly over the pseudorotation phase cycle, namely H1'-H4', which has a maximum at P=90° (O4'-endo), and H2"-H4', which has a maximum near P=18° (C3'-endo) and a minimum near P≈198° (C3'-exo). The remaining intraring proton-proton distances do not vary significantly with sugar conformation, and therefore provide no restraining power in distance-based algorithms. The interaction of the base proton (H8 or H6) with several sugar protons is a strong function of the glycosidic torsion angle, especially the H2'-H8/H6 interaction. However, the H1'-H8/H6 NOE, perhaps the easiest to measure, is only weakly dependent on χ in the anti domain, and therefore for many nucleic acids it has minimal restraining power, though for syn nucleotides this NOE provides the major source of information about χ.

To determine the helical conformation, there are again more experimental data available than the maximum number of degrees of freedom, i.e. three rotations and three translations for unlinked dinucleotides, or the torsions α, β, ε, ζ for a linked dinucleotide (Structure I). Between 7 and 10 distances per dinucleotide (some of which are not completely independent) can be obtained from NOEs and, in favourable cases, additional constraints on ε and β can be obtained from heteronuclear couplings, though in general these are rather loosely defined (10). The finite size of the constituent atoms makes some combinations of parameters physically impossible, so that the parameters cannot be treated as independent, but rather as conditionally dependent.

The degree to which a structure is determined is a compromise between the number of constraints, their quality (which includes precision), and their sensitivity to variation of structural parameters (i.e. their restraining power). At a trivial level, a single exact constraint cannot determine the structure in any sensible way. Similarly, all possible distances known to very low precision (say between 0 and 10 Å) will also not specify the structure. The minimum number of constraints required is equal to the number of degrees of freedom in the system (within the context of conditional probabilities). At this constraint density, the precision of the structure will depend strongly on the quality of the constraints, and the accuracy of any force-field that is used for calculating structures.
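The pseudorotation description of the sugar ring used earlier in this section (phase angle P and maximum pucker amplitude) can be made concrete with a short sketch. It assumes the idealised Altona-Sundaralingam relation ν_j = φm cos(P + 144°(j − 2)) for the five endocyclic torsions ν0 to ν4, which the text does not state explicitly. The function names and the 38° amplitude are illustrative choices, not values taken from the chapter.

```python
import math

STEP_DEG = 144.0  # phase offset between successive endocyclic torsions


def ring_torsions(p_deg: float, phi_m: float) -> list[float]:
    """Endocyclic torsions nu0..nu4 (degrees) from phase P and amplitude phi_m."""
    return [
        phi_m * math.cos(math.radians(p_deg + STEP_DEG * (j - 2)))
        for j in range(5)
    ]


def pseudorotation(nu: list[float]) -> tuple[float, float]:
    """Recover (P, phi_m) in degrees from the five torsions nu0..nu4."""
    s = math.sin(math.radians(36.0)) + math.sin(math.radians(72.0))
    num = (nu[4] + nu[1]) - (nu[3] + nu[0])  # equals 2 * phi_m * sin(P) * s
    den = 2.0 * nu[2] * s                    # equals 2 * phi_m * cos(P) * s
    p_deg = math.degrees(math.atan2(num, den)) % 360.0
    phi_m = math.hypot(num, den) / (2.0 * s)
    return p_deg, phi_m


if __name__ == "__main__":
    # Puckers named in the text; 38 degrees is an illustrative amplitude.
    for label, p in (("C3'-endo", 18.0), ("O4'-endo", 90.0), ("C3'-exo", 198.0)):
        nu = ring_torsions(p, 38.0)
        print(label, [round(x, 1) for x in nu], pseudorotation(nu))
```

The round trip is exact only for this idealised model; torsions measured from real rings satisfy the cosine relation approximately, so P and φm recovered from them are best-fit values.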
Weak restraints that do not further reduce the conformational space should not be included in a count of restraints. The quality of a restraint may also vary during refinement; once the gradient of a restraint with respect to a parameter (e.g. a torsion angle) approaches zero, that particular restraint no longer contains much information about the value of the parameter. Furthermore, a high density of constraints will only limit the available conformational space to a finite amount. Once the constraints are satisfied, any further convergence to a particular structure will be entirely determined by factors such as the algorithm used to generate the structures, the starting conditions, and the force-field parameterisation. Ideally, a method is needed that is independent of calculation strategy, the starting point, and the force field.

Factors Which Affect the Number and Accuracy of Constraints

Coupling Constants. Coupling constants are extremely valuable for determining torsion angles. In practice there are several potential problems in extracting the information contained in the experimental splittings. These include the complexity of some multiplets, the large line-width of some resonances (particularly H2' and H2" in DNA), strong coupling, and the possibility of interference between dipolar and scalar coupling. Several groups, notably that of James (11,12), have shown that simulation of COSY cross-peaks can lead to improved estimates of individual coupling constants. However, for larger nucleic acid fragments, the influence of line-width becomes problematic, and accurate simulations require that the effective T2 is known. It has also been shown that for reliable determination of coupling constants, good digitisation of the spectra is needed (13). Furthermore, dipolar interactions can affect the apparent splittings, which is probably most severe for coupling to H2' and H2" in DNA (14-16). To some extent, this problem can be alleviated by using the sums of the coupling constants, as they are less sensitive to the interference from dipolar effects, and the information content of three such sums is the same as that of three individual coupling constants (17). The extent of this problem remains unclear at present, though it may become significant for individual coupling constants when the effective correlation time exceeds 3 ns (ca. 12 base-pairs at 30°C). For correlation times much above 5-6 ns, the resonances become so broad that accurate coupling constants cannot be reliably retrieved in any case (13).

The parameterisation of the Karplus equation also limits the accuracy of the desired torsions that can be retrieved. The uncertainty of the Karplus curve may be in the range 0.5 to 1 Hz, depending on the exact values of P and φm. Furthermore, in DNA where the sugars are usually in the 'S' domain, there is only a weak dependence of the coupling constants on P: J1'2', J1'2" and J2"3' vary only slightly for 100°