J. Phys. Chem. B 2007, 111, 9153-9164
9153
Leading RNA Tertiary Interactions: Structures, Energies, and Water Insertion of A-Minor and P-Interactions. A Quantum Chemical View Judit E. Sˇ poner,*,† Kamila Re´ blova´ ,† Ali Mokdad,‡ Vladimı´r Sychrovsky´ ,†,§ Jerzy Leszczynski,| and Jirˇ´ı Sˇ poner*,†,§ Institute of Biophysics, Academy of Sciences of the Czech Republic, Kra´ loVopolska´ 135, 612 65 Brno, Czech Republic, Department of Biochemistry and Biophysics, School of Medicine, UniVersity of California at San Francisco, San Francisco, California 94158, Institute of Organic Chemistry and Biochemistry, Academy of Sciences of the Czech Republic, FlemingoVo na´ meˇ sti´ 2, 166 10 Prague 6, Czech Republic, and Department of Chemistry, Computational Center for Molecular Structure and Interactions, Jackson State UniVersity, Jackson, Mississippi 39217 ReceiVed: January 18, 2007; In Final Form: May 4, 2007
Complex molecular shapes of ribosomal RNA molecules are stabilized by recurrent types of tertiary interactions involving highly specific and conserved non-Watson-Crick base pairs, triplets, and quartets. We analyzed the intrinsic structure and stability of the P-motif and the four basic types of A-minor interactions (types I, II, III, and 0), which represent the most prominent RNA tertiary interaction patterns refined in the course of evolution. In the studied interactions, the electron correlation component of the stabilization usually exceeds the Hartree-Fock (HF) term, leading to a strikingly different balance of forces as compared to standard base pairing stabilized primarily by the HF term. In other words, the A-minor and P-interactions are considerably more influenced by the dispersion energy as compared to canonical base pairs, which makes them particularly suitable to zip the folded RNA structures that are substantially hydrated even in their interior. Continuum solvent COSMO calculations confirm that the stability of the canonical GC base pair is affected (reduced) by the continuum solvent screening considerably more than the stability of the A-minor interaction. Among the studied systems, the strong A-minor II and weak A-minor III interactions require water molecules to stabilize the experimental geometry. Gas-phase optimization of the canonical A-minor II A/CG triplet without water results in a geometry that is clearly inconsistent with the RNA structure. The gas-phase structure of the P-interaction and the most stable A-minor I interaction nicely agrees with the geometries occurring in the ribosome. A-minor I can also adopt an alternative water-mediated substate rather often observed in X-ray and molecular dynamics studies. The A-minor I water bridge, however, does not appear to stabilize the tertiary contact, and its role is to provide structural flexibility to this binding pattern within the context of the RNA structure. Interestingly, the insertion of a polar water molecule in the A-minor I A/CG tertiary contact occurring in the A/C tertiary pair is stabilized primarily by the HF (electrostatic) interaction energy, while the dispersioncontrolled A/G contact remains firmly bound. Thus, the intrinsic balance of forces as revealed by quantum mechanics (QM) calculations nicely correlates with many behavioral aspects of the studied interactions inside RNA. The comparison of interaction energies computed using quantum chemistry and an AMBER force field reveals that common molecular mechanics calculations perform rather well, except that the strength of the P-interaction is modestly overestimated. We also briefly discuss the non-negligible methodological differences when evaluating simple base-base nucleic acids base pairs and the complex RNA tertiary base pairing patterns using QM procedures.
Introduction Ribosomes are formidable molecular machines present in all living cells to synthesize proteins with a high speed and efficiency based on mRNA templates. Each ribosome consists of two main subunits and many proteins, and the mass ratio of RNA/protein is ca. 2:1. All key functions of ribosomes are performed by their RNA components. A broad repertoire of secondary and tertiary interactions contributes to the final outlook of these highly organized macromolecular assemblies.1 Principles of RNA base pairing are strikingly different from those of DNA due to several factors, such as the absence of the * Corresponding authors. E-mail: (J.E.Sˇ .)
[email protected] and (J.Sˇ .)
[email protected]. † Institute of Biophysics, Academy of Sciences of the Czech Republic. ‡ University of California at San Francisco. § Institute of Organic Chemistry and Biochemistry, Academy of Sciences of the Czech Republic. | Jackson State University.
highly reactive 2′-OH group in DNA, the different sugar puckers, and the resulting different types of helices (B vs A for DNA and RNA, respectively) that make natural DNA structures less diverse than RNA. Natural selection then puts a distinct pressure on the evolution of DNA and RNA molecules, and the direct result is that ca. 40% of ribosomal nucleotides form non-Watson-Crick (WC) base pairs by using different base edges (Figure 1).2 Some of these base pairs combine together to form patterns with a higher complexity like base triplets or quartets. The non-WC interactions are responsible for shaping up structures of key junctions, internal loops, and hairpin loops that mediate tertiary contacts between distant parts of the ribosome. RNA base pairs have been categorized by Leontis et al. according to their ability to substitute for each other in 3-D architectures.2 This qualitative classification resulted in 12 main and some intermediate (bifurcated) base pair families, which
10.1021/jp0704261 CCC: $37.00 © 2007 American Chemical Society Published on Web 06/29/2007
9154 J. Phys. Chem. B, Vol. 111, No. 30, 2007
Sˇ poner et al.
Figure 1. Nomenclature of the RNA base pairing.2 (a) Classification of the interaction sites in purine and pyrimidine nucleobases. Most (albeit not all) RNA base pairs observed in the RNA structures can be classified by systematically considering mutual interactions between the three nucleotide edges and the cis or trans orientation of the attached sugars.2 (b) Type II A-minor tertiary interaction between G2453, A692, and C2439 in the large ribosomal subunit of H.m. Crystal data were taken from the PDB file 1S72. A692 from the single stranded stem (green) contacts the double helical stem (light blue). The left and right sides of the figure show the same interaction viewed from different spectator points.
are further divided into isosteric subgroups. Each isosteric subgroup primarily can be characterized by the distance between the C1′ atoms and the mutual orientation of the corresponding vectors between C1′ and the glycosidic N (N1 in pyrimidines and N9 in purines). According to the isostericity principle, isosteric base pairs can often substitute for each other without affecting the RNA 3-D structure. Since the function of these molecules depends mainly on their 3-D structure, isosteric substitutions are presumed to have a minimal effect on function.3 The available structural methods show basically static RNA structures and provide only limited insights into the dynamics of molecular interactions. They also do not reveal the magnitude of intermolecular forces between interacting groups and nucleotides. Thus, computational methods (molecular modeling and quantum chemistry) can complement the information available in atomic resolution structures. This way, a panoramic understanding of intermolecular forces and movements can be achieved. Recently, we characterized structures and energies of four of the most important RNA base pairing families using quantum chemistry.4 Our theoretical modeling indicated that the majority of the studied non-WC base pairs is an intrinsically stable and well-defined building unit of the RNA architecture, with characteristic structural features. This was somewhat surprising since many of these base pairs looked rather weakly paired, and it was expected that they were formed only in the framework of larger RNA segments. The few exceptions, showing sub-
stantial rearrangements, were assigned to base pairs that were originally detected as an integral part of more complex interactions, like base quartets.4c A characteristic feature of many nonWC base pairs is the increased role of dispersion stabilization, which makes them particularly suitable for zipping the complex and stable 3-D structures, even when the H-bonding seems weak. The calculations also revealed that the leading molecular mechanical force field AMBER provides a satisfactory description of the interactions involving the 2′-OH group, which contributes to the, so far, very good performance of the force field in large-scale atomic level simulations of RNAs.5 In the present study, we move forward and characterize the strength and structures of such ubiquitous RNA tertiary interactions like the different types of A-minor interactions6 and P-interactions,7a,b both of which are types of ribose zippers (i.e., tertiary interactions stabilized by H-bonding between ribose 2′OH groups).7c,d A-minor interactions are the most frequent tertiary interactions in the large ribosomal subunit6,8,9 and other large functional RNAs. A-minor motifs involve usually two to three consecutive adenines inserted into the minor groove of a double helix, preferably interacting with GC WC base pairs. There are hundreds of adenines in the ribosome involved in this powerful interaction, with most of them showing a very high sequence conservation, underlining their key biological functions. A-minor interactions stabilize key rRNA 3-D segments including kink turns.10 The vast majority of A-minor interactions form long-
Leading RNA Tertiary Interactions range contacts and connect secondary structure elements in different domains and subdomains across the ribosomal subunits. A-minor interactions also participate in intersubunit bridges.11 Conserved adenines A1492 and A1493 from helix 44 of the small ribosomal subunit discriminate between cognate and nearcognate tRNAs by monitoring the shape of the codon-anticodon double helix between mRNA and tRNA via dynamical A-minor interactions.12 There are four different types of A-minor interactions, all mediated by an adenine contacting the minor groove of a receptor double helical stem to different extents (see Figure 2). The most compact A-minor type I, which is basically equivalent to a trans-sugar edge/sugar edge (hereafter abbreviated as SE/ SE) interaction complemented by a cis-SE/SE interaction, is stabilized by H-bonding interactions involving the O2′, N3, and N1 atoms of the adenosine with both bases of the WC base pair (Figure 2a). (The cis and trans nomenclature refers to the orientation of the glycosidic bonds and determines whether the sugars are on the same side or opposite sides of a line linking the interacting nucleotide edges.) A type II contact, equivalent to a cis-SE/SE interaction,2 is slightly less tight and is characterized by H-bonding involving the O2′ and N3 of adenine with basically one of the nucleotides of the WC base pair (Figure 2b). Type III A-minor interactions are even less packed with only N1 of the adenine forming H-bonds with the receptor double helix (Figure 2c). This contributes to the relatively low frequency of this A-minor interaction in RNA architectures. Finally, the type 0 A-minor interaction (Figure 2d) is stabilized by H-bonding between the ribose of the protruding nucleotide and the receptor double helix. Types I and II A-minor interactions are highly adenine specific, due to the involvement of O2′ and N3 atoms of the interacting adenosine. The receptor double helix WC base pair is preferably a GC one (in any orientation), although the AU base pair is also often seen.6a,b,7d The wobble base pair in the acceptor double helix is not favorable for the formation of the A-minor interaction. In contrast, types 0 and III are not adenosine specific. It is believed that the type I A-minor interaction represents the most stable A-minor contacts, while the less conserved patterns (i.e., types 0 and III) are the weakest. The type II and III A-minor interactions can be obtained starting from the A-minor I geometry by shifting the adenine (relative to the WC pair) in such a way that the adenosine sugar is finally displaced from the minor groove of the receptor WC base pair (Figure 2). The type 0 interaction can be obtained by shifting the adenosine in the opposite direction, so that its sugar points to the minor groove of the WC base pair. An A-minor motif is typically assembled by two consecutive A-minor interactions, where 5′-A makes a type II interaction and 3′-A makes a type I interaction with the receptor WC stem.6 The canonical or most prevalent A-minor motif occurs when the receptor WC stem has a 5′-CC-3′ sequence in the strand making the sugar zipper; this receptor strand is antiparallel to the protruding 5′-AA-3′ strand.6,7d The A-minor I and II interactions investigated in this study correspond to such a canonical A-minor motif. The P-interaction is less frequent but apparently even more robust than the A-minor motifs.7a It typically involves a GU wobble pair from one helix interacting with a GC base pair of another helix, defining key long-range tertiary contacts in the ribosome. As we have demonstrated recently, it is associated with a unique covariation pattern involving all four participating nucleotides and is highly conserved (Figure 2e).7b Besides presenting key static interactions that stabilize the precise
J. Phys. Chem. B, Vol. 111, No. 30, 2007 9155
Figure 2. X-ray geometries used for modeling the A-minor interactions suggested in ref 6 and the P-motif. (a) Type I, (b) type II, (c) type III, and (d) type 0 A-minor interaction patterns. Cartesian coordinates were taken from the PDB file RR0033. (e) P-motif, Cartesian coordinates taken from the PDB file 1S72.
ribosomal architecture, both A-minor and P-interactions were suggested to form dynamic contacts and switches involved in large-scale movements of the ribosome during translocation.7b,9,13 In the current paper, we apply computational quantum chemistry to evaluate the intrinsic structural properties and
Sˇ poner et al.
9156 J. Phys. Chem. B, Vol. 111, No. 30, 2007 strength of these fundamental RNA tertiary interaction types, compare them with other base pairing types in RNA, and verify the Cornell et al. molecular mechanical force field commonly used in classical molecular modeling.14 Computational Methods Geometry Optimizations. Initial structures of the A-minor motifs were built up on the basis of crystal geometries, using structures listed in Figure 1 of ref 6. Three of them (types I-III) were taken from the 23S rRNA of Haloarcula marismortui (H.m.) 50S subunit (NDB code RR0033, ref 8), comprising the following nucleoside units: (i) G1364, C637 and A521 (type I, Figure 2a); (ii) G1363, C638 and A520 (type II, Figure 2b); (iii) U1362, A639 and A519 (type III, Figure 2c). The model representing the type 0 interaction (Figure 2d) was constructed using A957 and U1009 from the 23S rRNA and A104 from the 5S rRNA from the 50S subunit. Thus, the studied complexes consisted of three nucleosides. Models including water molecules will be discussed at relevant places in the Results and Discussion. The model representing the P-motif was taken from PDB file 1S72, which is a better refined file for the large subunit of H.m.8b It included two nucleosides (rU2724 and rC2040) and two guanine bases (G2758 and G1739). The structure of all nucleosides was simplified by replacing the 5′-OH group with an H atom. Geometry optimizations were carried out at the DFT (density functional theory) level of theory using the Gaussian 03 program package.15 The density functional was built up by Becke’s threeparameter exchange16 and Lee-Yang-Parr’s correlation functional (abbreviated as B3LYP).17 The 6-31G** basis set was used for geometry optimizations. We have shown recently that the B3LYP/6-31G** optimized structures compare quite well with reference RIMP2/cc-pVTZ data and are entirely sufficient for the subsequent high-quality interaction energy calculations.18 Gas-phase optimized geometries are depicted in Figures 3-5. Interaction Energies. Interaction energies were computed on the B3LYP optimized structures using the RIMP2 (resolution of the identity second-order Moeller-Plesset) approach combined with a large diffuse aug-cc-pVDZ basis set of atomic orbitals internally stored in the Turbomole code.19 The same approach was used in our former study,4a where we have shown that the aug-cc-pVDZ basis set for base pairs gives almost as good energies as the aug-cc-pVTZ basis set and is sufficiently close to the basis set limit.18 The RIMP2 method for calculating interaction energies has been validated in refs 18 and 20 and provides identical numbers as the full MP2 procedure. The level of calculations used in this study is thus substantially better than in the majority of the quantum mechanical (QM) studies published for NA base pairs so far and is entirely sufficient for our purpose, as we do not intend to perform the highest-accuracy reference calculations. Let us note that recently a very effective semiempirical quantum model was developed for hydrogenbonded nucleic acid base pairs, which provides a variety of accurate molecular properties with much lower computational expenses than ab initio or density functional methods.21 Significant improvement was also recently achieved in the development of accurate dispersion-augmented DFT methods that are suitable for a wide range of molecular interactions including base pairs.22 The total interaction energy of a nucleoside trimer [ABC] ∆EABC is defined as
∆EABC ) EABC - EA - EB - EC
(1)
Figure 3. Gas-phase optimized geometries (stereoview) of the A-minor contacts computed at the Becke3LYP level of theory. (a) Type I, (b) type II, (c) type III, and (d) type 0 A-minor patterns.
where EABC is the electronic energy of the whole system (in its actual geometry), and EA, EB, and EC are the electronic energies of the isolated subsystems A, B, and C. This means that ∆EABC is the difference between the energy of the trimer and the energies of the three monomers when separated into infinity. Similarly, the total interaction energy for a tetramer [ABCD] ∆EABCD consisting of four subsystems (e.g., two nucleosides and two nucleobases) denoted as A, B, C, and D can be obtained as follows:
∆EABCD ) EABCD - EA - EB - EC - ED
(2)
and the total interaction energy for a pentamer [ABCDE] ∆EABCDE can be computed as follows:
∆EABCDE ) EABCDE - EA - EB - EC -ED - EE
(3)
Pairwise interaction energies between subsystems X and Y (∆EXY) were computed for selected nucleoside-nucleoside, nucleoside-nucleobase, nucleoside-base pair, and base pairbase pair contacts using their geometries in the optimized full complexes. In this case, X and Y may formally include more than one monomer. ∆EXY can be obtained according to the following formula:
∆EXY ) EXY - EX - EY
(4)
Leading RNA Tertiary Interactions
J. Phys. Chem. B, Vol. 111, No. 30, 2007 9157
Figure 4. Gas-phase optimized geometries (stereoview) of the water-assisted type II (a), type III (b), and type 0 (c) A-minor interactions computed at the Becke3LYP level of theory.
Figure 5. Optimized structure of the P-motif (stereoview) computed at the Becke3LYP level of theory.
where again EXY is the electronic energy of the XY dimer while EX and EY are the energies of the isolated subsystems X and Y. The interaction energy (∆EMP2) has two components: the Hartree-Fock (HF) term (∆EHF) and the electron correlation term (∆Ecorr).
∆E
MP2
) ∆E
HF
+ ∆E
corr
recently for canonical DNA base pairs.23c Nevertheless, SAPT decomposition results in a set of large mutually canceling contributions where the sum of the absolute values of all the individual contributions exceeds the total interaction energy almost by an order of magnitude. Although such calculations are insightful, it is not yet clear how to practically utilize the information obtained for any straightforward interpretation of the interactions or verification/parametrization of the force fields. In addition, the size of our systems is much larger and precludes utilization of the computer-demanding SAPT method. All interaction energies are corrected for the basis set superposition error (BSSE) using the standard counterpoise procedure24 but do not include deformation energies. The main reason for this is the substantial structural alteration of the sugar-base segment upon base pairing in many RNA base pairs. Thus, a direct inclusion of monomer deformations into the interaction energies would rather bias the results and would not be very relevant to interactions in folded RNAs. For instance, the energy investment needed to deform the gas-phase optimized geometry of isolated adenosine to that observed in the type I A-minor interaction is about 13 kcal/mol that is associated with the C3′-endo/C2′-endo conformational switch and the concomitant break of the internal H-bond between 2′-OH and N3(A). Such large overall rearrangements are not related to the direct strength of the interactions. For a detailed discussion regarding the role of the deformation energies in base pairing calculations, see ref 18, where we also explain why a formal inclusion of the deformation energy term into the interaction energy calculation is entirely inappropriate for complex H-bonded systems. It is to be emphasized that QM studies of complex RNA interaction patterns (with multiple and often non-coplanar H-bonds, flexible torsional angles, possible water insertion, possible spurious intramolecular H-bonds in the monomers, possible closely spaced substates, etc.) require different computational procedures to be adopted as compared to studies of simple base-base H-bonded systems. Deformation energies of the monomers (Table S1) along with the corresponding optimized geometries are available in the Supporting Information. Continuum Solvent Calculations. Geometry optimizations were carried out with Gaussian 0315 using the COSMO continuum dielectric method25 to mimic the aqueous environment ( ) 80.0). The united atom topological model was used to define the atomic radii. Optimizations were started from the gas-phase optimized geometries. Pairwise interaction energies in solution (∆EsolMP2) were computed by adding a solvation correction term to the BSSEcorrected gas-phase interaction energies (∆EMP2). The correction term (δMP2) was obtained as a difference of RIMP2 interaction energies computed with COSMO and in the gas phase without the BSSE correction:
∆EsolMP2 ) ∆EMP2 + δMP2
(6)
(5)
The ∆EHF term includes mainly the electrostatic interaction energy, short-range exchange repulsion, and polarization/chargetransfer contributions. The ∆Ecorr term is dominated by the dispersion attraction and further includes the electron correlation correction to the electrostatic energy. The latter correction is mostly repulsive since the electron correlation reduces the dipole moments of the monomers. It is fair to note that there exists more detailed QM-based decomposition schemes.23 However, the common problem of decompositions is their ambiguity. One advanced approach is the symmetry adapted perturbation theory (SAPT) applied
RIMP2/COSMO interaction energies in solution were computed with the Turbomole code.19 Molecular Mechanics. Molecular mechanics calculations have been performed using the AMBER726 program package in combination with the Cornell et al. force field parameters (parm99).14,27 AMBER atomic charges have been derived using the HF approximation with the 6-31G* basis set of atomic orbitals via the RESP fitting procedure.28 For our calculations, the original AMBER atomic charges were slightly modified to keep the monomers neutral. For the nucleobase, the sugarphosphate segment was replaced by hydrogen, and its charge was set to keep the base neutral. For the nucleoside, the
9158 J. Phys. Chem. B, Vol. 111, No. 30, 2007
Sˇ poner et al.
neutralization was carried out by smearing the excessive charge over several sugar atoms, resulting in negligible charge modifications of 0.01-0.02 e. We have carried out test calculations with additional slightly modified charge redistributions, which show that the results are not sensitive to such minor changes of the charge distributions. The charges used in our computations are listed in the Supporting Information. Geometry optimizations were carried out starting from the gas-phase optimized structures. Interaction energies have been obtained from single-point calculations using the QM and AMBER optimized geometries according to eqs 1-4 and do not include correction for the deformation energies. The AMBER deformation energies are rather negligible as compared to the QM values.18 Results and Discussion Type I A-Minor Interaction. The type I contact is expected to be the most stable A-minor interaction based on its H-bonding pattern. Indeed, type I packing represents the most compact and conserved form of the known A-minor contacts. Our model structure is stabilized by an H-bond between N3 of adenine and N2 of guanine, another one between O2′ of guanine and N1 of adenine, and the sugar-sugar H-bonding. This tertiary interaction is in fact a base triplet with a transSE/SE2 contact between rA and rG and a cis-SE/SE contact between rA and rC. The remaining interaction is obviously the cis-WC/WC (i.e., canonical) interaction between G and C within the canonical helix, which is not part of the tertiary contact. Note that the main emphasis of this paper is on the tertiary interactions between the A-minor adenines and the WC base pair, as this tertiary interaction is important for RNA folding. Note also that although the studied arrangement is the most common one for the A-minor I interaction, the GC base pair can be inverted to CG and less frequently also replaced by AU or UA. These isosteric A-minor I interactions were not studied in the present paper. Upon geometry optimization, the adenosine deviates from the plane of the WC pair. Nevertheless, this structural change does not invoke any alteration of the crystallographically suggested H-bonding pattern. The optimized geometry is depicted in Figure 3a. An overlay of the optimized and crystal geometries is presented in Figure 6a. Key interatomic distances between H-bond donor and H-bond acceptor atoms are very similar in the gas-phase optimized and crystal structures. For example, the O2′(rC)‚‚‚O2′(rA) distance is 2.83 and 2.74 Å, the N2(rG)‚‚‚N3(rA) distance is 3.07 and 3.19 Å, and the O2′(rG) ‚‚‚N1(rA) distance is 2.73 and 2.72 Å in the crystal and gas-phase optimized geometries, respectively. Considering the medium to low resolution of the ribosomal X-ray structures and the fact that there always are subtle systematic differences between X-ray and QM H-bond lengths (for a variety of reasons and depending on the method),29 the experimental and computed H-bond lengths can be considered as identical. Other characteristic H-bonding contacts are listed in Table 1. Note that A-minor I interactions are substantially variable and even dynamical,6,9,13 and thus, our optimized structure is entirely representative. Interaction energies computed at the MP2 level and with the AMBER force field are listed in Table 2. The adenine is firmly bound to the double helical GC base pair, the pairwise interaction energy term computed for the A‚‚‚GC contact being -14.1 and -31.3 kcal/mol at the HF and MP2 levels, respectively. The individual tertiary cis-SE/SE rA/rC and transSE/SE rA/rG contacts contribute -17.0 and -15.2 kcal/mol, respectively. (For the sake of completeness, the total rA‚‚‚rG‚‚‚rC
Figure 6. Overlay of the crystal (black) and gas-phase (red) optimized geometries for selected A-minor patterns and P-motif. (a) Type I, (b) type II, and (c) type II water-mediated, (d) type 0 A-minor interaction, and (e) P-motif.
trimer MP2 interaction energy is -60.3 kcal/mol, but almost half of this trivially comes from the GC WC base pair.) The very large (-17.2 kcal/mol) correlation contribution to the interaction energy (∆Ecorr) of the tertiary contact implicitly suggests that the stabilization could even be dominated by the dispersion energy (see Computational Methods). It clearly contrasts the balance of forces in common base pairs. For comparison, the correlation contribution of the GC WC base pair interaction energy is only -4 kcal/mol, while the HF component is -25 kcal/mol. The overall balance of forces for the A-minor I interaction is thus somewhere between planar base pairing and base stacking, while the overall interaction energy weakly exceeds the intrinsic stability of the GC WC base pair. Notably, while the rA/rG contact is clearly correlation (dispersion)-controlled, the rA/rC contact is dominated by the HF term and thus more electrostatic (see Table 2). In other words, the A-minor I interaction is composed of two contacts where one of them is more hydrophobic (rA/rG) and the other more hydrophilic. This may be one of the evolutionary advantages of A-minor I tertiary interactions in RNA. Note that the interior of the ribosome is substantially hydrated, which is fundamentally different to the protein interior.30 Thus, many RNA tertiary contacts are directly exposed to interactions with solvent. We suggest that the capability of the A-minor interaction to utilize both electrostatics and dispersion forces contributes to its structural adaptability. Particularly, we assume that the electron correlation-controlled A‚‚‚G contact of the A-minor type I interaction is the key zipping interaction, while the electrostatic A‚‚‚C (i.e, the cis-SE/SE) contact is capable to
Leading RNA Tertiary Interactions
J. Phys. Chem. B, Vol. 111, No. 30, 2007 9159
TABLE 1: Main Geometrical Parameters of H-Bonding Contacts in Fully Optimized Models of A-Minor and P-Motifsa motif A-minor, type I
contact
donor
acceptor
distance (Å)
trans-SE/SE rA/rG
O2′(rG) N2(rG) O2′(rC) O2′(rA) N2(rG) N1(rG) N4(rC) O2′(rA) O2′(rC) N2(rG) N1(rG) N4(rC) N2(rG) N6(rA) O2′(rA) O2′(rC) N2(G) N1(G) N4(rC) O(W) O(W) N2(G) N6(A) N6(rA) N3(rU) O2′(rA) N6(rA) N3(U) O(W1) O(W2) O2′(rA) O(W2) O(W1) O2′(rA) O2′(rA*) N6(rA) N3(rU) N6(rA) N3(rU) O2′(rA) O2′(rA*) O(W) O(W) N3(rU) N1(G) N2(G*) N1(G*) N4(rC) O2′(rU) O2′(rC) N2(G)
N1(rA) N3(rA) O2′(rA) O2(rC) O2(rC) N3(rC) O6(rG) O2′(rG) N3(rA) O2(rC) N3(rC) O6(rG) N1(rA) O2′(rG) O2′(rC) N3(rA) O2(rC) N3(rC) O6(G) N3(G) N1(A) O(W) O(W) O4(rU) N1(rA) N1(rA*) O4(U) N1(rA) N1(rA*) N3(rA) O(W2) O(W1) O2′(rA) N3(rA*) O2′(rA) O4(rU) N1(rA) O4(rU) N1(rA) O2′(rA*) O(W) O3′(rA*) N3(rA) O6(G) O2(rU) O2(rC) N3(rC) O6(G*) O2(rC) O2′(rU) O2′(rC)
2.72 3.19 2.74 2.72 3.02 2.97 2.76 2.69 2.72 2.88 2.95 2.86 3.02 3.12 2.81 2.78 2.97 2.94 2.82 2.97 2.80 2.99 2.96 2.93 2.83 2.75 2.93 2.85 2.79 2.89 2.63 2.85 2.93 2.83 2.74 2.93 2.84 2.92 2.84 2.73 2.66 2.83 2.84 2.81 2.90 3.02 2.99 2.77 2.70 2.77 2.93
cis-SE/SE rA/rC cis-WC/WC rG/rC A-minor, type II
cis-SE/SE rC/rA cis-WC/WC rG/rC rG‚‚‚rAb
A-minor, type II water-mediatedc,d
cis-SE/SE rC/rA cis-WC/WC G/rC G‚‚‚W‚‚‚rAb
A-minor type IIIe
cis-WC/WC rA/rU
A-minor type III water-mediatedd-f
rA‚‚‚rA*b cis WC/WC rA/U
A-minor type 0e
W1‚‚‚rA* W2‚‚‚rA rA‚‚‚W2 W2‚‚‚W1 W1‚‚‚rA rA‚‚‚rA*b cis-WC/WC rA/rU
A-minor type 0 water-mediatedd,e
cis-WC/WC rA/rU rA‚‚‚rA*b rA*‚‚‚W
P-motifg
rA‚‚‚W cis-WC/WC G/rU cis-WC/WC G/rC rU‚‚‚rCb G‚‚‚rCb
a Optimized Becke3LYP/6-31G** geometries are shown in Figures 3-5. Note that for SE/SE interactions, X/Y pairing is not equivalent to Y/X pairing: see text. Cartesian coordinates of the optimized structures are listed in the Supporting Information. b This contact cannot be assigned according to the classification of Leontis et al.2b c As the ribose part of rG does not interact with the remaining part of the complex, in this structure, rG was represented with G. d W, W1, and W2: water molecules. e rA and rA* refer to the adenosines from the double helix and single strand, respectively. f As the ribose part of rU does not interact with the remaining part of the complex, in this structure, rU was represented with U. g G is involved in a wobble base pair with rU, and G* is involved in a cis-WC/WC pair with rC.
communicate with solvent or individual water molecules. Our recent study demonstrated a functionally important dynamical water insertion seen between A and C in the A-minor type I interaction of RNA kink turns 3-D motifs as well as water insertion seen in some X-ray structures of A-minor type I interactions elsewhere in the ribosome. We suggest that the previously described balance of forces could be the basic physical chemistry reason for the observed position of water insertion into the A-minor I interaction. (See ref 13 for more structural details about water insertion into A-minor I interactions.) The previous data were obtained in the gas phase. To provide a rough estimate of solvent screening, we have reoptimized the gas-phase A-minor I model using the COSMO continuum
solvent method. The optimization converged within ca. 30 steps, resulting in a geometry that is almost identical to the gas-phase optimized structure. We then calculated the rA‚‚‚rCrG interaction energy for this optimized structure at the RIMP2 level including the effect of the dielectric continuum. Note that such calculations still do not provide complete free energies of binding that could be directly compared with experimental free energies, mainly because the solute entropy terms and the dynamics are not included. The calculations nevertheless give a fair qualitative assessment of the magnitude of solvent screening of the electrostatic interactions. As expected, the aqueous environment mainly affects the HF term since it diminishes the electrostatic part of the interaction. Thus, the HF term is reduced from -14.1 to -1.7 kcal/mol. On the
Sˇ poner et al.
9160 J. Phys. Chem. B, Vol. 111, No. 30, 2007
TABLE 2: Selected Pairwise Interaction Energy Terms (kcal/mol) Computed at RIMP2/aug-cc-pVDZ Level of Theory and with AMBER Force Field for the Studied A-Minor and P-Motifs tertiary interaction energiesa interactionb A-minor type I A-minor type II A-minor type II, water-mediated A-minor type III, water-mediated A-minor type 0 A-minor type 0, water-mediated P-motif
tertiary interactions
∆EMP2
∆EHF
∆Ecorr
∆EAMBER
∆EVDW
rA‚‚‚rGrCc rA‚‚‚rG (trans SE/SE) rA‚‚‚rC (cis SE/SE) rA‚‚‚rGrCc rC‚‚‚rA (cis SE/SE) rG‚‚‚rA rA‚‚‚WGrCc rA*‚‚‚WrAUc,d rA*‚‚‚rArUc,d rA‚‚‚rA*d rA*‚‚‚rUd rA*‚‚‚WrArUc,d rUG‚‚‚G*rCe rU‚‚‚rC
-31.3 -15.2 -17.0 -26.4 -18.6 -9.4 -29.1 -12.0 -18.5 -18.5 -0.1 -17.0 -25.0 -16.9
-14.1 -4.8 -10.4 -10.3 -8.1 -4.0 -15.4 -6.0 -7.9 -7.8 0.0 -6.4 -11.6 -10.3
-17.2 -10.4 -6.6 -16.1 -10.5 -5.4 -13.7 -6.0 -10.6 -10.7 -0.1 -10.6 -13.4 -6.6
-31.9 -15.1 -16.8 -26.8 -17.7 -9.1 -26.4 -12.6 -17.7 -17.7 0.0 -18.8 -29.1 -17.3
-2.9 -2.9 0.1 -2.2 -0.4 -1.8 0.8 -0.4 -3.0 -3.0 0.0 -0.3 -4.4 -2.0
a ∆EMP2 and ∆EAMBER: interaction energies obtained by MP2 and molecular mechanics methods, respectively. ∆EHF and ∆Ecorr: HF and electron correlation components of ∆EMP2 interaction energy (eq 5). ∆EVDW: van der Waals component of ∆EAMBER. b Optimized geometries were obtained at the B3LYP/6-31G** level of theory and are depicted in Figures 3-5. Coordinates are listed in the Supporting Information. Full tertiary contacts are in bold. c Interaction energy between the A-minor adenosine and the rest of the intermolecular complex. Water (W) molecules are formally included into the WC base pairs. d Asterisk denotes the A-minor (tertiary, protruding) adenosine. e G belongs to the wobble base pair with rU, while G* belongs to the cis-WC/WC base pair with rC.
contrary, the rather significant correlation component does not change upon addition of the solvent continuum being -17.2 and -18.7 kcal/mol in the gas phase and continuum solvent, respectively. A similar behavior was observed also for the WC GC pair, whose HF interaction energy dropped down from -25.5 to -4.9 kcal/mol due to solvation, while the otherwise quite weak correlation stabilization slightly improved from -3.9 to -7.4 kcal/mol, respectively. (This latter effect may reflect the reduction of the repulsive intramolecular electron correlation term; see the Computational Methods for an explanation.) In other words, the inclusion of solvent screening effects fully confirms the difference in the balance of forces in the RNA tertiary interactions and standard base pairs, as deduced from the gas-phase data. Evidently, the stability of the canonical GC base pair is affected (reduced) by the continuum solvent screening considerably more than the stability of the A-minor interaction (17 vs 11 kcal/mol). The COSMO calculations thus justify our gas-phase approach used throughout the paper. Note also that most RNA tertiary interactions occur inside compact ribonucleoproteins where the exposure to solvent is much smaller as compared to our model COSMO calculations (i.e., the interactions are formed somewhere between the gas phase and fully solvated conditions). In addition, we have recently extensively characterized the structural dynamics of A-minor and P-interactions using classical explicit solvent simulations,5a,7b,13,31 so although the bulk of our present data is based on gas-phase analyses, we are quite convinced that our interpretation of the gas-phase data is entirely valid. Type I A-Minor Water-Inserted Interaction. We have noticed that the A-minor I contact may be water-mediated, as revealed by MD simulations and X-ray database analysis. To illustrate the effect of water insertion, we studied a typical watermediated type I A-minor contact present in kink turn 38 of helix 38 of the large ribosomal subunit (see Figure 7 and ref 13, a representative MD snapshot was taken for the computations). The system may formally be divided in several ways: as rA‚‚‚[rGrC + W], [rA + W]‚‚‚rGrC, and W‚‚‚[rA + rGrC] interactions. In this structure called open or water-mediated A-minor I, the water bridge spans between O6(rC) and O2′(rA), and the water molecule is simultaneously a donor to both nucleotides. Our attempt to QM optimize this structure failed due to structural deformations. Thus, we took just a random snapshot from MD simulations. The computed
Figure 7. MD geometry used for evaluating the intermolecular interactions of the water-mediated type I A-minor contact.13
rA‚‚‚[rGrC + W] MP2 interaction energy is -12.7 kcal/mol that comes almost exclusively from the correlation component, amounting to -11.1 kcal/mol. The interaction energy between the rA + W and rGrC subsystems is -1.3 and -13.2 kcal/mol at the HF and MP2 levels, respectively. Finally, the energy of the W‚‚‚[rA + rGrC] water-binding interaction is +0.4 and -5.0 kcal/mol at the HF and MP2 levels, respectively. This shows that the system is stabilized by the rA/rG trans-SE/SE pairing alone and that the gas-phase analysis does not suggest that this water molecule is directly stabilizing the open geometry over the closed one. This observation is analogous to that of Su¨hnel et al., who have shown before that water bridges in which the water molecules act either as double donors or as double acceptors do not energetically stabilize the water-assisted base pairs in the gas phase.32 The results mean that the open waterinserted A-minor I substate (seen in simulations as well as X-ray structures) needs to be considered within the context of the local RNA architecture. The A-minor I opening and water insertion are simultaneous complementing events that (within the solvated RNA structure) result in structural flexibility to the A-minor I geometry, which can adopt a range of isoenergetic geometries in RNA. This is exactly what simulations revealed for A-minor I interactions in RNA kink turns and leads to the unique and functionally important elbow-like flexibility of this recurrent RNA building block.13 Type II A-Minor Interaction. The crystal geometry exhibits a tertiary cis-SE/SE contact between rC and rA obviously supplemented by a standard GC base pair. Geometry optimization resulted in almost the same parameters for the cis-SE/SE
Leading RNA Tertiary Interactions rC/rA contact as observed in the crystal structure, the O2′(rA)O2′(rC) distance being 2.64 and 2.69 Å and the O2′(rC)-N3(rA) distance being 2.73 and 2.72 Å in the crystal and gas phase, respectively (see Figure 3b and Table 1). Note that the cis-SE/ SE rA/rC and cis-SE/SE rC/rA base pairs are not equivalent and the standard convention is that the nucleotide listed in the first place interacts via its 2′-hydroxyl group with the base of the second nucleotide (cf. Figure 2a,b). The gas-phase optimization, however, rearranged the other side of the interaction pattern and brought about three additional contacts between N6(rA) and O2′(rG), O2′(rG) and N3(rG), as well as N2(rG) and N1(rA). This H-bond formation was accompanied with a C3′-endo to C2′-endo conformational switch of guanosine (note that C3′-endo is the common sugar pucker in RNA). This rearrangement is clearly an artifact of optimizing an isolated trimer in the gas phase. We carried out a statistical analysis of the type II RNA interaction motifs in 23S rRNA from the 50S subunit of H.m. with the FR3D program.33 More than 100 candidates were found for this interaction, but after a visual inspection, only about 40 structures were considered to be true A-minor type II interactions. Among these, only three were C2′-endo in their rG nucleotides, which is about 7% of the total A-minor type II interactions occurring in the H.m. 50S crystal structure. The abundance of the C2′-endo conformation in the whole H.m. 50S subunit is about 14%, suggesting that rG in the type II A-minor binding pattern has a reduced propensity to adopt the C2′-endo conformation as compared to a common nucleotide in the ribosome. This is not surprising as this rG participates in canonical A-RNA segments. Thus, it is very likely that our C2′-endo/C3′-endo switch is just an artifact of the gas-phase calculations. Further, in the same search, we found that only five A-minor II interactions had an N6(A)O2′(G) distance below 6 Å with the closest contact being 4.67 Å. In other words, we found no example of the A‚‚‚G interaction obtained using the gas-phase optimization. It is thus obvious that while the gas-phase optimization did not affect the cis-SE/SE rA/rC interaction, it at the same time resulted into an artificial contact between rA and rG, caused mainly by the conformational change of the guanosine sugar. Similar to the type I contact, in the optimized geometry of the type II contact, the adenosine is tilted toward the other two bases by about 75° (see Figure 6b). This structural rearrangement optimizes the H-bonding contact between N2(rG) and N1(rA) as well as the H-bond donated by N6(rA) to O2′(rG). Let us again mention that the latter contact was not found in the starting crystal geometry and is the result of the relaxation of the isolated interaction. On the other hand, it certainly also reflects the overall flexibility of A-minor interactions. The pairwise interaction energy term between the adenosine and the rest of the computed gas-phase model (the tertiary contact) is -10.3 and -26.4 kcal/mol at the HF and MP2 levels, respectively. The dominance of the correlation (dispersion) stabilization is again very evident. In contrast to the A-minor I interaction where both tertiary base pairs are almost equally stable (albeit with a different balance of forces), A-minor II consists of a strong rC/rA and weaker rA/rG base pair (Table 2). This reflects the difference between the position of the adenine as compared to the WC base pair in both A-minor types. Note again, in addition, that while the rC/rA interaction is stable in the gas-phase optimization, the rA/rG contact is basically absent in the crystal structure. Thus, it is more appropriate to represent the A-minor II contact by the dominating rC/rA interaction, which has an MP2 interaction energy of -18.6 kcal/
J. Phys. Chem. B, Vol. 111, No. 30, 2007 9161 mol, composed of HF and correlation terms -8.1 and -10.5 kcal/mol, respectively. Type II A-Minor Water Inserted. The previous structural optimization resulted in a stable A-minor II interaction but with artificial contact between N6(rA) and O2′(rG). This contact is not seen in the crystal structure where actually a water molecule is present in this area.6b,7d Thus, we inserted a water molecule at its crystallographically suggested position between rA and G and carried out a geometry optimization. The troubling ribose part of rG was not included in this particular model as it did not participate in intermolecular H-bonds in the crystal structure. (This sugar is fixed in RNA by its participation in the double helix.) The resulted structure (Figure 4a) is visibly closer to the crystal geometry than the one without water (compare Figure 6b,c). The water bridge prevents the direct H-bond formation between A and G and thereby fixes the almost coplanar mutual position of the three constituting bases. In addition, as a result of the water insertion, the pairwise interaction energy between the adenosine and the rest of the model (rA‚‚‚GrC + W) is improved by 5.1 and 2.7 kcal/mol at the HF and MP2 levels, respectively (see Table 2).34 The interaction energy between rA and rC is -11.3 and -19.9 kcal/mol at the HF and MP2 levels, respectively. These values agree well with those calculated for the isolated rC/rA cis-SE/SE base pair at the same theoretical levels (-9.2 and -20.0 kcal/mol).4c Thus, the improvement of the rA‚‚‚GrC + W pairwise interaction can be attributed to the water insertion, which also is reflected by the binding energy of the water molecule to the rest of the system, amounting to -7.1 and -13.4 kcal/mol at the HF and MP2 levels. We conclude that the A-minor II A/CG interaction is water mediated and that the water insertion bolsters the weaker tertiary interaction (i.e., G/A). Although the water may visually look as if it is just hydrating the base pair from outside, it actually mediates the interaction, considerably more than in the case of the water-inserted substate of A-minor I. In addition, in our recent molecular dynamics simulations, we observed that the hydration site can be dynamically shifted into the primary cisSE/SE interaction, causing a further opening of A-minor II (cf. Figure 7b in ref 5a). This substate was not investigated in the present paper. Type III A-Minor Contact. The type III contact represents the least conserved A-minor interaction and is expected to be the weakest. On the basis of crystal geometries, only one tertiary H-bonding contact is expected between the two adenosines donated by O2′ of the adenosine from the double helix (see Figure 2c). The crystal structure shows that there are two possible H-bond acceptors, N1 or N3 of the tertiary (single strand) rA (hereafter denoted as rA*). Nevertheless, they are much too far to be able to form an H-bonding contact (the O2′N1 and O2′-N3 distances are 3.9 and 4.2 Å, respectively). In fact, optimization of the crystal geometry in the gas phase favored the H-bond between O2′ and N1 that led to a shrinkage of the O2′-N1 distance by 1.15 Å. The optimized structure is depicted in Figure 3c. However, we do not consider the gasphase optimized geometry as a true representation of the crystal situation because formation of the O2′-H‚‚‚N1 H-bond implies a fairly strong deformation of the crystal structure. This deformation manifests itself in (i) a much too large shrinkage of the O2′-N1 distance and (ii) rotation of the rA* unit by ca. 60° about the axis of its six-membered ring. Instead, we considered a model that included two additional water molecules bridging the N3 of rA and N1 of rA* (Figure 4b). The initial position of the added water molecules was taken
9162 J. Phys. Chem. B, Vol. 111, No. 30, 2007 from the crystal of H.m.8 To simplify the model, we deleted the ribose part of rU. This ribose ring did not form intermolecular contacts with the rest of the trimer in the fully optimized water-free model. Insertion of the water molecules impeded the direct H-bond formation between the adenosines, and accordingly, the N1-N3 interbase distance between the two adenosines (6.21 Å) is close to the crystal value (6.60 Å). Thus, our computations suggest that structural water molecules are indispensable at stabilizing the type III A-minor interactions. Geometrical data characterizing the H-bonding network of the water-mediated A-minor interactions are also listed in Table 1. The latter structure was considered for evaluating the strength of the intermolecular contacts. The pairwise interaction energy computed for the tertiary contact between rA* and the rest of the water-mediated type III model (formally the rA*‚‚‚[waters + AU] dimer calculation) is -6.0 and -12.0 kcal/ mol at the HF and MP2 levels, respectively (see Table 2). This is indeed the lowest stabilization among all studied A-minor model systems. The rather rare occurrence of the somewhat elusive A-minor III contacts prevented us to fully assess the eventual plausibility of the first geometry obtained without the water molecules. Type 0 A-Minor Contact. Structurally, the type 0 A-minor contact is quite different from the previously discussed interaction types. While in the type I-III contacts the adenines from the single stranded segment are roughly coplanar with the bases of the receptor double helical stem, in the type 0 contact, the adenine lies in the plane perpendicular to that of the other two bases (see Figure 2d). This contact thus enables a completely different spatial orientation of the interacting RNA segments. In our particular example, the canonical WC/WC base pair is A-U, and the tertiary interaction is due to a ribose-ribose H-bond formed between the 2′-OH groups of the adenosines and an additional H-bond between 2′-OH of rA and N3 of rA* (rA and rA* refer to the adenosines from the double helix and single stranded loop, respectively). The optimized geometry is presented in Figure 3d, while the overlay of the computed and crystal structures is shown in Figure 6d. The interaction energy between rA* and the cis-WC/WC rA/rU pair is -7.9 and -18.5 kcal/mol, at the HF and MP2 levels, respectively (see Table 2). Although the type 0 A-minor motif is intrinsically stable, we have also considered a water-assisted variant for this contact. According to the X-ray geometry, one water molecule may mediate the interaction of the adenosines. In the fully optimized geometry of the complex, we have recognized an H-bond donated by the terminal 3′-OH of one of the adenosines that is an obvious artifact of the gas-phase calculations. To eliminate this contact, we carried out an additional optimization of the 3′-methylated analogue of the same complex. In the optimized geometry (see Figure 4c), the water molecule was stabilized near to its crystallographic position bridging O2′ of rA* with N3 of rA. Inclusion of the water molecule improves neither the agreement between the gas-phase and X-ray geometries nor the binding energy between rA* and the double helical stem (MP2 pairwise interaction energies between rA* and the rest of the type 0 models are -17.0 and -18.5 kcal/mol, with and without inclusion of water, respectively; see Table 2). Therefore, the calculations do not unambiguously suggest an active participation of structural water molecules in the stabilization of the type 0 A-minor contacts. The interaction looks hydrated, but there is no apparent trend to form well-defined water bridges. P-Motif. While the A-minor interactions typically form contacts of a single stranded RNA loop with a canonical double helix, P-motifs connect two double helical stems.7a,b Our P-motif
Sˇ poner et al. model system is composed of canonical rC/G and wobble rU/G base pairs H-bonded mainly via the 2′-OHs of the pyrimidine nucleosides. This may be considered as the canonical P-interaction.7b The fully optimized geometry of the P-motif (see Figure 5) closely resembles that of the crystal. An overlay of the optimized and crystal geometries is presented in Figure 6e. It is stabilized by a complex H-bonding network whose components along with the characteristic interatomic distances are listed in Table 1. The tertiary P-motif rCG‚‚‚rUG interaction energy is -25.0 kcal/mol at the MP2 level (see Table 2). The H-bonds formed between the sugar edges of the centrally positioned pyrimidine nucleosides contribute -16.9 kcal/mol to this stabilization and represent the leading tertiary base pair in the P-interaction. (The total interaction energy, i.e., with inclusion of the two intra-helical base pairs, is -70.7 kcal/mol.) Again, the correlation component to the stabilization is larger than the HF component, which evidently is a common feature of all ribose zippers. It certainly is important to protect the tertiary contacts from being disrupted by the solvent. Interestingly, despite that the visual inspection could indicate that the P-interaction is intrinsically more stable than the A-minor motif, the energy calculations show that this is not the case, as the studied type I and type II A-minor tertiary contacts are -31.3 and -29.1 kcal/mol, respectively. Our recent database-phylogenetic and MD analysis of the P-interaction, however, also indicated that the P-interaction looks to be particularly firm.7b These findings are not mutually contradicting. All these interactions are intrinsically very stable with a large dispersion stabilization. Their structural stability in folded RNAs is further substantially affected by their overall shapes (stericity), which is obviously different for A-minor I/II and P-interaction tertiary contacts. A-minor and P-interactions join different segments and thus have different roles in folded RNAs, which means that they do not mutually compete in the course of evolution. Performance of AMBER Force Field. We tested the performance of the AMBER force field at calculating interaction energies for tertiary interactions in RNA (see the data in column 5 of Table 2). The maximum deviation was found for the P-motif, where AMBER overestimates the MP2 rCG‚‚‚rUG pairwise interaction energy by 16%. The reason for this rather large deviation might be that the four ribose OH groups form a very complicated H-bonding network, which might be difficult to fully capture with standard force field approaches. Also, the fact that the AMBER charges are derived at the HF level can contribute to the difference (see Discussion in ref 18). Note, however, that the key rC‚‚‚rU contact is reproduced correctly. The second largest deviation between MP2 and AMBER data was observed for the pairwise interaction energy between adenosine and the subsystem comprising the water molecule plus the cis-WC/WC rA/rU pair in the water-assisted type 0 contact. Here, the AMBER value (-18.8 kcal/mol) overestimates the one obtained at the MP2 level (-17.0 kcal/mol) by 11%. Similarly, AMBER underestimates the MP2 binding energy of the adenosine in the water-mediated type II contact by 9%. These differences are not large and perhaps could be caused by the presence of a structural water molecule in these binding patterns, described by the TIP3P water potential. All other interaction energies calculated with MP2 and AMBER agree within 8% of the MP2 value. Despite these differences, the force field looks robust, considering the complexity of the interactions. We also attempted to reoptimize the QM geometries with the AMBER force field. Except for one (water-mediated
Leading RNA Tertiary Interactions A-minor type 0), the optimized geometries were very similar to those obtained by QM. The same trend is reflected by the interaction energies computed for the AMBER optimized structures. For a comparison of the AMBER interaction energies obtained on AMBER and QM optimized structures (Table S2), as well as the structure of AMBER optimized minima, refer to the Supporting Information. Conclusion RNA base pairing principles are strikingly different from those in DNA molecules. We have carried out the first quantum chemical study on key tertiary interaction patterns in RNA (i.e., A-minor and P-interactions). The calculations were aimed at supplementing the X-ray structural information with data characterizing the stability of these ubiquitous building elements of functional RNAs. Four studied A-minor interactions were preferably utilized to form contacts between double stranded and single stranded RNA segments, while the P-interaction mediates contacts between two double helices. The A-minor interaction is formed by adenine interacting via its sugar edge with a WC base pair from a canonical double helix. The P-interaction brings together a G/U wobble base pair and a CG WC base pair. In all these extended RNA base pairing patterns, H-bonds involving the 2′-hydroxyl group of ribose (absent in DNA) are essential. The A-minor interactions have a clear steric preference for adenine making the tertiary contact as compared to any other base. We thus did not consider other bases in our calculations since the preference for adenine was well-explained in the original X-ray studies.6a,b The most conserved A-minor type I interaction preserved all the main characteristics of the crystal geometry in the gas-phase optimized structures. The adenine from the single stranded loop interacts with both nucleotides of the WC base pair. The A-minor I interaction can also exist as a partially water-mediated interaction pattern with water insertion in the cis-SE/SE base pair. Static crystal structures often show this water insertion, while our recent MD simulations demonstrated dynamical cisSE/SE water insertions in A-minor type I interactions regulating the dynamics of the interacting RNA segments.13,31 The present calculations, however, suggest that the water insertion is not stabilizing A-minor I; its role is to provide structural flexibility to the interaction. This structural flexibility is then of functional importance, for example, in RNA kink turns.13,31 The other highly conserved A-minor type II (A/CG) is deformed upon gas-phase optimization. However, upon inclusion of a water molecule, the X-ray structure is excellently reproduced. We thus conclude that A-minor II clearly prefers participation of a water molecule when the primary interaction is between adenine and cytosine (canonical A-minor motif). Although visually the water molecule looks to bind a posterior to a formed tertiary base triplet, the calculations show that the water is substantial in shaping up the interaction, specifically the contact between the tertiary adenine and the guanine in the trans position. Only after inclusion of this water molecule does the adenine recognize both nucleotides of the WC base pair, making then the A-minor II interaction similar to the A-minor I interaction. Without the water participation, the adenine could reach the trans base of the double helical stem only upon a significant local distortion of the stem. In our recent simulations of the hepatitis delta virus ribozyme,5a we further detected substates with even a deeper penetration of a water molecule into an A-minor type II interaction. (cf. Figure 7b in ref 5a). The adenine from the single stranded stem interacts only with one nucleotide of the WC base pair in the case of A-minor III
J. Phys. Chem. B, Vol. 111, No. 30, 2007 9163 and 0 motifs. The A-minor III interaction is particularly weak and is considered to be the least specific and biologically important. Involvement of structural waters was needed to reproduce the crystal geometry in the gas phase. Stable structures of type 0 contact were obtained in the presence as well as in the absence of a water molecule, with both structures showing modest differences as compared to X-ray geometry. The P-interaction is a very extensive quartet contact with no water participation, and thus, its optimized geometry was close to identical to that observed in the crystal structure. Overall, the calculations achieved a meaningful structural agreement with the X-ray structures and clearly illustrated that A-minor interactions are structurally highly versatile and offer a number of geometrical substates that can be used not only to fix the RNA structures but also to mediate dynamical movements. The direct (without water mediation) A-minor I and II tertiary contacts are very stable (interaction energies of -31 and -26 kcal/mol). The electron correlation interaction energy term ∆Ecor (-17 and -16 kcal/mol) exceeds the HF interaction energy term ∆EHF (-14 and -11 kcal/mol for A-minor I and II, respectively). This means that A-minor interactions have a strikingly different balance of forces as compared to standard base pairing stabilized primarily by the HF interaction energy. In other words, A-minor interactions are considerably more influenced by the dispersion energy as compared to canonical base pairs. The increased dispersion contribution means that these interactions should be more hydrophobic and thus particularly suitable to zip the RNA structures. Also, the remaining A-minor interactions (and the P-interaction) show large electron correlation contributions. We suggest that the large dispersion stabilization is one of the key physical chemistry features (besides favorable stericity) as to why evolution so often uses these interactions as tertiary contacts in RNAs. The balance of forces in these tertiary interactions is somewhere between standard H-bonding base pairing and base stacking. The water insertion into the A-minor I contact can be wellrationalized based on the balance of the interactions in the A-minor motifs. The A-minor type I interaction includes two equally stable tertiary pairs. One of them (trans-SE/SE), however, is more dispersion stabilized and the other (cis-SE/ SE) more electrostatic. The water insertion occurs in the latter one as the water insertion better competes with electrostatic interactions. Exactly this behavior is dynamically seen in explicit solvent MD simulations and coincides with A-minor I geometries in static X-ray structures.13,31 Note that the interior of the ribosome is quite substantially hydrated, in contrast to folded proteins. Thus, dynamical interplay between direct and watermediated tertiary base pairs can be quite important for the structural dynamics of the ribosome. In the case of the A-minor II interaction, there is one strong and one weak tertiary base pair. Not surprisingly, the water insertion occurs within the weak one, and in this particular case, the water is actually needed to keep the A-minor II pattern shape (in contrast to the A-minor I interaction), at least for the canonical (A/CG) orientation of A-minor II.6b,7d The P-interaction is also very strong (-25 kcal/ mol) again with a prevalent electron correlation term (-13.5 kcal/mol). Because of the presence of structural water molecules in several studied structures, direct comparison of the total interaction energies for all studied systems is not straightforward. Nevertheless, we have noticed that the computed interaction energies are roughly proportional with the number of H-bonds present in the optimized models (i.e., the ratio of the total
9164 J. Phys. Chem. B, Vol. 111, No. 30, 2007 interaction energy and the number of H-bonds is nearly constant and ranges from -6.6 to -8.6 kcal/mol). The present results illustrate that the very basic intrinsic (gas phase) physical chemistry properties of the leading tertiary RNA interactions, such as intrinsic geometries, magnitude of gasphase interactions, and balance of the interaction energy contributions can be well-correlated with the roles of these interactions in folded RNAs.6c,35 The functional RNA molecules and ribonucleoproteins reveal a fascinating richness of biologically important base pairing patterns and other molecular interactions not observed in DNAs. Considering the enormous importance of RNA molecules and the explosion of discoveries of major biological and biochemical roles of RNAs in the last few years, the lack of interest in contemporary QM computational literature in characterizing these RNA forces is quite surprising. Acknowledgment. This contribution was supported by the Grant Agency of the Czech Republic Grants 203/05/0009 and 203/05/0388, the Grant Agency of the Academy of Sciences of the Czech Republic Grant IAA400550701, the Ministry of Education of the Czech Republic Grants LC512, AVO Z5 004 0507, and AVO Z4 055 0506, and NSF-CREST Grant HRD0318519. Supporting Information Available: Cartesian coordinates of all optimized structures depicted in Figures 3-5; optimized monomer geometries; atomic charges used in molecular mechanical calculations; Cartesian coordinates of AMBER optimized minima; deformation energies; and comparison of AMBER interaction energies obtained for DFT and AMBER optimized geometries. This material is available free of charge via the Internet at http://pubs.acs.org. References and Notes (1) Moore, P. B.; Steitz, T. A. Annu. ReV. Biochem. 2003, 72, 813850. (2) (a) Leontis, N. B.; Westhof, E. Q. ReV. Biophys. 1998, 31, 399455. (b) Leontis, N. B.; Stombaugh, J.; Westhof, E. Nucleic Acids Res. 2002, 30, 3497-3531. (3) (a) Re´blova´, K.; Sˇ pacˇkova´, N.; Sˇ tefl, R.; Csaszar, K.; Kocˇa, J.; Leontis, N. B.; Sˇ poner, J. Biophys. J. 2003, 84, 3564-3582. (b) Lescoute, A.; Leontis, N. B.; Massire, C.; Westhof, E. Nucleic Acids Res. 2005, 33, 2395-2409. (c) Leontis, N. B.; Westhof, E. RNA 1998, 4, 1134-1153. (4) (a) Sˇ poner, J. E.; Sˇ pacˇkova´, N.; Kulha´nek, P.; Leszczynski, J.; Sˇ poner, J. J. Phys. Chem. A 2005, 109, 2292-2301. (b) Sˇ poner, J. E.; Sˇ pacˇkova´, N.; Leszczynski, J.; Sˇ poner, J. J. Phys. Chem. B 2005, 109, 11399-11410. (c) Sˇ poner, J. E.; Leszczynski, J.; Sychrovsky, V.; Sˇ poner, J. J. Phys. Chem. B 2005, 109, 18680-18689. (5) (a) Krasovska, M. V.; Sefcikova, J.; Re´blova´, K.; Schneider, B.; Walter, N. G.; Sˇ poner, J. Biophys. J. 2006, 91, 626-638. (b) Rhodes, M. M.; Re´blova´, K.; Sˇ poner, J.; Walter, N. G. Proc. Natl. Acad. Sci. U.S.A. 2006, 103, 13380-13385. (6) (a) Nissen, P.; Ippolito, J. A.; Ban, N.; Moore, P. B.; Steitz, T. A. Proc. Natl. Acad. Sci. U.S.A. 2001, 98, 4899-4903. (b) Lescoute, A.; Westhof, E. Biochimie 2006, 88, 993-999. (c) Battle, D. J.; Doudna, J. A. Proc. Natl. Acad. Sci. U.S.A. 2002, 99, 11676-11681. (7) (a) Gagnon, M. G.; Steinberg, S. V. RNA 2002, 8, 873-877. (b) Mokdad, A.; Krasovska, M. V.; Sˇ poner, J.; Leontis, N. B. Nucleic Acids Res. 2006, 34, 1326-1341. (c) Cate, J. H.; Gooding, A. R.; Podell, E.; Zhou, K.; Golden, B. L.; Kundrot, C. E.; Cech, T. R.; Doudna, J. A. Science 1996, 273, 1678-1685. (d) Tamura, M.; Holbrook, S. R. J. Mol. Biol. 2002, 320, 455-474. (8) (a) Ban, N.; Nissen, P.; Hansen, J.; Moore, P. B.; Steitz, T. A. Science 2000, 289, 905-920. (b) Klein, D. J.; Moore, P. B.; Steitz, T. A. J. Mol. Biol. 2004, 340, 141-177. (9) Noller, H. F. Science 2005, 309, 1508-1514. (10) Klein, D. J.; Schmeing, T. M.; Moore, P. B.; Steitz, T. A. EMBO J. 2001, 20, 4214-4221. (11) Schuwirth, B. S.; Borovinskaya, M. A.; Hau, C. W.; Zhang, W.; Vila-Sanjurjo, A.; Holton, J. M.; Cate, J. H. D. Science 2005, 310, 827-834. (11) Ogle, J. M.; Carter, A. P.; Ramakrishnan, V. Trends Biochem. Sci. 2003, 28, 259-266.
Sˇ poner et al. (12) Ra´zga, F.; Kocˇa, J.; Sˇ poner, J.; Leontis, N. B. Biophys. J. 2005, 88, 3466-3485. (13) Cornell, W. D.; Cieplak, P.; Bayly, C. I.; Gould, I. R.; Merz, K. M.; Ferguson, D. M.; Spellmeyer, D. C.; Fox, T.; Caldwell, J. W.; Kollman, P. A. J. Am. Chem. Soc. 1995, 117, 5179-5197. (14) Frisch, M. J.; Trucks, G. W.; Schlegel, H. B.; Scuseria, G. E.; Robb, M. A.; Cheeseman, J. R.; Montgomery, J. A., Jr.; Vreven, T.; Kudin, K. N.; Burant, J. C.; Millam, J. M.; Iyengar, S. S.; Tomasi, J.; Barone, V.; Mennucci, B.; Cossi, M.; Scalmani, G.; Rega, N.; Petersson, G. A.; Nakatsuji, H.; Hada, M.; Ehara, M.; Toyota, K.; Fukuda, R.; Hasegawa, J.; Ishida, M.; Nakajima, T.; Honda, Y.; Kitao, O.; Nakai, H.; Klene, M.; Li, X.; Knox, J. E.; Hratchian, H. P.; Cross, J. B.; Bakken, V.; Adamo, C.; Jaramillo, J.; Gomperts, R.; Stratmann, R. E.; Yazyev, O.; Austin, A. J.; Cammi, R.; Pomelli, C.; Ochterski, J. W.; Ayala, P. Y.; Morokuma, K.; Voth, G. A.; Salvador, P.; Dannenberg, J. J.; Zakrzewski, V. G.; Dapprich, S.; Daniels, A. D.; Strain, M. C.; Farkas, O.; Malick, D. K.; Rabuck, A. D.; Raghavachari, K.; Foresman, J. B.; Ortiz, J. V.; Cui, Q.; Baboul, A. G.; Clifford, S.; Cioslowski, J.; Stefanov, B. B.; Liu, G.; Liashenko, A.; Piskorz, P.; Komaromi, I.; Martin, R. L.; Fox, D. J.; Keith, T.; Al-Laham, M. A.; Peng, C. Y.; Nanayakkara, A.; Challacombe, M.; Gill, P. M. W.; Johnson, B.; Chen, W.; Wong, M. W.; Gonzalez, C.; Pople, J. A.; Gaussian 03, revision C.02; Gaussian, Inc.: Pittsburgh, PA, 2004. (15) Becke, A. D. J. Chem. Phys. 1993, 98, 5648-5652. (16) (a) Lee, C.; Yang, W.; Parr, R. G. Phys. ReV. B 1988, 37, 785789. (b) Miehlich, B.; Savin, A.; Stoll, H.; Preuss, H. Chem. Phys. Lett. 1989, 157, 200-206. (17) Sˇ poner, J.; Jurecˇka, P.; Hobza, P. J. Am. Chem. Soc. 2004, 126, 10142-10151. (18) (a) Eichkorn, K.; Treutler, O.; Oehm, H.; Haeser, M.; Ahlrichs, R. Chem. Phys. Lett. 1995, 242, 652-660. (b) Weigend, F.; Haeser, M. Theor. Chem. Acc. 1997, 97, 331-340. (c) Weigend, F.; Haeser, M.; Patzelt, H.; Ahlrichs, R. Chem. Phys. Lett. 1998, 294, 143-152. (19) Jurecˇka, P.; Nachtigall, P.; Hobza, P. Phys. Chem. Chem. Phys. 2001, 3, 4578-4582. (20) (a) Giese, T. J.; Sherer, E. C.; Cramer, C. J.; York, D. M. J. Chem. Theory Comput. 2005, 1, 1275-1285. (b) Sherer, E. C.; York, D. M.; Cramer, C. J. J. Comput. Chem. 2003, 24, 57-67. (21) (a) Jurecˇka, P.; C ˇ erny´, J.; Hobza, P.; Salahub, D. R. J. Comput. Chem. 2007, 28, 555-569. (b) Grimme, S. J. Comput. Chem. 2004, 25, 1463-1473. (22) (a) Guerra, C. F.; Bickelhaupt, F. M. Angew. Chem., Int. Ed. 1999, 38, 2942-2945. (b) Mo, Y. R. J. Mol. Model. 2006, 12, 665-672. (c) Hesselmann, A.; Jansen, G.; Schutz, M. J. Am. Chem. Soc. 2006, 128, 11730-11731. (23) Boys, S. F.; Bernardi, F. Mol. Phys. 1970, 19, 553-566. (24) (a) Barone, V.; Cossi, M. J. Phys. Chem. A 1998, 102, 1995. (b) Cossi, M.; Rega, N.; Scalmani, G.; Barone, V. J. Comput. Chem. 2003, 24, 669-681. (c) Klamt, A.; Schuurmann, G. J. Chem. Soc., Perkin Trans. 2 1993, 799-805. (25) Case, D. A.; Darden, T. A.; Cheatham, T. E., III; Simmerling, C. L.; Wang, J.; Duke, R. E.; Luo, R.; Merz, K. M.; Wang, B.; Pearlman, D. A.; Crowley, M.; Brozell, S.; Tsui, V.; Gohlke, H.; Mongan, J.; Hornak, V.; Cui, G.; Beroza, P.; Schafmeister, C.; Caldwell, J. W.; Ross, W. S.; Kollman, P. A. AMBER 8; University of California at San Francisco: San Francisco, 2004. (26) Cheatham, T. E., III; Cieplak, P.; Kollman, P. A. J. Biomol. Struct. Dyn. 1999, 16, 845-862. (27) Bayly, C. I.; Cieplak, P.; Cornell, W. D.; Kollman, P. A. J. Phys. Chem. 1993, 97, 10269-10280. (28) Guerra, C. F.; Bickelhaupt, F. M.; Snijders, J. G.; Baerends, E. J. J. Am. Chem. Soc. 2000, 122, 4117-4128. (29) Voss, N. R.; Gerstein, M.; Steitz, T. A.; Moore, P. B. J. Mol. Biol. 2006, 360, 893-906. (30) Ra´zga, F.; Zacharias, M.; Re´blova´, K.; Kocˇa, J.; Sˇponer, J. Structure 2006, 14, 825-835. (31) (a) Brandl, M.; Meyer, M.; Su¨hnel, J. J. Phys. Chem. A 2000, 104, 11177-11187. (b) Brandl, M.; Lindauer, K.; Meyer, M.; Su¨hnel, J. Theor. Chem. Acc. 1999, 101, 103-113. (32) Sarver, M.; Zirbel, C. L.; Stombaugh, J.; Mokdad, A.; Leontis, N. B. J. Math. Biol., in press. (33) The pairwise interaction energy between adenosine and a subsystem defined by a standard GC pair in the water-free type II A-minor model is -10.3 and -26.4 kcal/mol at the HF and MP2 levels, respectively (see Table 2). In the water-mediated model, the analogous pairwise interaction energy term was computed between adenosine and a subsystem composed of a standard GC pair plus the water molecule and amounts to -15.4 and -29.1 kcal/mol, respectively (see Table 2). (34) (a) Sˇ poner, J.; Mokdad, A.; Sˇ poner, J. E.; Sˇ pacˇkova´, N.; Leszczynski, J.; Leontis, N. B. J. Mol. Biol. 2003, 330, 967-978. (b) Oliva, R.; Cavallo, L.; Tramontano, A. Nucleic Acids Res. 2006, 34, 865-879.