Analogues of P and Z as Efficient Artificially Expanded Genetic Information System


B: Biophysics; Physical Chemistry of Biological Systems and Biomolecules

Analogues of P and Z as Efficient Artificially Expanded Genetic Information System Nihar Ranjan Jena, Priyabata Das, Bhagyashree Behera, and Phool Chand Mishra J. Phys. Chem. B, Just Accepted Manuscript • DOI: 10.1021/acs.jpcb.8b04207 • Publication Date (Web): 31 Jul 2018 Downloaded from http://pubs.acs.org on August 1, 2018



The Journal of Physical Chemistry


N.R. Jena1*, P. Das1, B. Behera1, P.C. Mishra2

1 Discipline of Natural Sciences, Indian Institute of Information Technology, Design and Manufacturing, Khamaria, Jabalpur-482005, India.

2 Department of Physics, Banaras Hindu University, Varanasi-221005, India.

*Corresponding Author's Email Address: [email protected]

Abstract

To artificially expand the genetic information system (AEGIS) and to realize artificial life, it is necessary to discover new functional DNA bases that can form stable duplex DNA and participate in error-free replication. It was recently proposed that 2-amino-imidazo[1,2-a]-1,3,5-triazin-4(8H)one (P) and 6-amino-5-nitro-2(1H)-pyridone (Z) form a base pair that is more stable than the normal G:C base pair and produces an unperturbed duplex DNA. Here, quantum chemical calculations in aqueous medium show that P and Z can be modified with electron-withdrawing and electron-donating substituents, mainly those found in B-DNA, to generate new bases that produce even more stable base pairs. Among the bases studied, P3, P4, Z3, and Z5 are found to produce base pairs that are about 2-15 kcal/mol more stable than the P:Z base pair. It is further shown that these base pairs can be stacked onto the G:C and A:T base pairs to produce stable dimers. Consecutive stacking of these base pairs is found to yield even more stable dimers. The influence of charge penetration effects and of backbone atoms in stabilizing these dimers is also discussed. It is thus proposed that P3, P4, Z3, and Z5 would form a promiscuous artificial genetic information system that can be used for different biological applications. However, evaluating the dynamical effects of these bases in DNA containing several nucleotides, and the efficacy of DNA polymerases in replicating DNA containing these bases, would provide more insight.

ACS Paragon Plus Environment


1. Introduction

The genetic information of life is encoded in different organisms with the help of four standard nucleotides: guanine (G), adenine (A), cytosine (C), and thymine (T). To artificially expand the genetic information system (AEGIS)1, or to make artificial life possible, continued efforts are required to develop new nucleotides that can sustain all nuclear processes, such as replication, transcription, and translation. Although several non-standard nucleotides have been synthesized2-14, some of them do not mimic the natural genetic alphabet11-14. For example, Romesberg and co-workers11-13 have synthesized the d5SICS and dNaM nucleotides and their base pair complex based on hydrophobic and geometric considerations. However, the d5SICS:dNaM complex does not form any hydrogen bonds and is stabilized only by stacking interactions in DNA. Owing to its non-Watson-Crick geometric alignment, this base pair induces significant backbone distortion and remains in an intercalated (non-coplanar) mode in DNA. Therefore, non-standard nucleotides of this type may not produce stable and undistorted duplex DNA.

Recently, two new nucleotides, 2-amino-imidazo[1,2-a]-1,3,5-triazin-4(8H)one (P) (Figure 1a) and 6-amino-5-nitro-2(1H)-pyridone (Z) (Figure 2a), which possess hydrogen bond donor and acceptor groups and are shape-complementary to the standard DNA nucleotides, have been synthesized by Benner and co-workers15,16. A subsequent X-ray crystallographic study17 has shown that the base pair complex involving these nucleotides is stabilized by three hydrogen bonds and retains the Watson-Crick geometric alignment. It is further reported that duplex DNA containing these nucleotides can also be stabilized by stacking interactions. Polymerases have been developed that allow the P:Z base pair to participate in polymerase chain reactions18.
In addition, a laboratory in vitro evolution (LIVE) experiment has shown that a DNA


aptamer comprising the GACTZP sequence can bind to breast and liver cancer cells19-21. These studies indicate that unnatural nucleotides can not only help in realizing artificial life but also play an important role in controlling diseases.

Although the occurrence of P:Z in duplex DNA has been reported17, the strengths of the hydrogen bonding and stacking interactions in different sequences are not known. Similarly, the dynamical and geometrical effects of the P:Z base pair in different sequences are not fully understood. Further, as the X-ray structure does not account for solution effects, the reported structure may not resemble physiological DNA. These problems are addressed here by employing various quantum chemical calculations and an implicit aqueous medium. Based on the results obtained here, several new bases are proposed, which may produce stable base pairs and stacked complexes in duplex DNA.

2. Theory and Computational Details

The XYZ coordinates of the P:Z base pair were extracted from the Protein Data Bank (PDB 4XNO)17, which corresponds to the crystallographic structure of a 16-mer DNA containing the sequence 5’-CTTATPPPZZZATAAG-3’. Subsequently, P and Z were isolated, hydrogenated, and optimized by using the ωB97X-D dispersion-corrected density functional theory (DFT)22-25 and the 6-31+G* basis set in both the gas phase and aqueous medium. For the aqueous medium, the integral equation formalism of the polarizable continuum model (IEFPCM)26,27 was used within the self-consistent reaction field (SCRF) theory. The optimized structures of P and Z were then modified to create eight analogues of P (Figure 1) and five analogues of Z (Figure 2). Subsequently, different base pairs formed between P and Z and between their analogues were optimized in the gas phase and


aqueous medium. Equation (1) was used to calculate the zero-point energy-corrected binding energy (BE) of an A:B base pair complex:

$$\mathrm{BE} = E_{A:B} - (E_A + E_B) \qquad (1)$$

where $E_{A:B}$ is the zero-point energy-corrected total energy of the A:B complex and $E_A$ and $E_B$ are the zero-point energy-corrected total energies of the isolated bases A and B, respectively.

To examine whether these unnatural base pair complexes can form stable stacked dimers in DNA, selected dimers formed by the most stable base pair complexes were optimized in the aqueous medium by considering different sequences. Initially, the XYZ coordinates of the P:Z/P:Z dimer (/ refers to the stacking interaction) were extracted from the crystal structure (PDB 4XNO), the backbone atoms were removed, and the geometry was optimized in the aqueous medium. Subsequently, the optimized structure of the P:Z/P:Z dimer was modified to create the other dimers, which were then minimized. In earlier studies, stacking of purines opposite pyrimidines and vice versa was found to be more stable than stacking of purines opposite purines and pyrimidines opposite pyrimidines28. To cross-check this, the stacking interaction energies of A:B/C:D and A:B/D:C dimers (A and C are purines, while B and D are pyrimidines) were calculated. The computed G:C/G:C and A:T/A:T dimers fall under the category of A:B/C:D, while the G:C/C:G and A:T/T:A dimers fall under the category of A:B/D:C. It is found that the G:C/G:C and A:T/A:T dimers are more stable than the G:C/C:G and A:T/T:A dimers, respectively, by about 2 kcal/mol, in agreement with earlier studies28. For this reason, only A:B/C:D dimers are considered in the present study.

To evaluate the effect of backbone atoms on base stacking, a few of the most stable purines and pyrimidines were stacked on each other in a single strand after including the sugar-phosphate backbones. To do so, the XYZ coordinates of the P/Z dimer were extracted from the experimental structure (PDB 4XNO)17 and subsequently optimized. The optimized


P/Z dimer was then modified to generate other dimers. All optimized structures in the gas phase and aqueous medium were confirmed to be genuine total energy minima by verifying that all computed vibrational frequencies were real, except for the stacked dimers that contain backbone atoms, for which this check was not made. The zero-point energy-corrected binding energies (BE) and stacking energies (SE) of the A:B/C:D dimers were calculated by using Equations (2) and (3), respectively:

$$\mathrm{BE} = E_{A:B/C:D} - (E_A + E_B + E_C + E_D) \qquad (2)$$

$$\mathrm{SE} = E_{A:B/C:D} - (E_{A:B} + E_{C:D}) \qquad (3)$$

where $E_{A:B/C:D}$ is the zero-point energy-corrected total energy of the A:B/C:D dimer and $E_{A:B}$ and $E_{C:D}$ are the zero-point energy-corrected total energies of the A:B and C:D base pairs, respectively. A similar equation was used earlier by Suzuki et al.29,30, Sponer et al.31, and Bhatacharya and co-workers32 for the computation of stacking interaction energies of various DNA base pair complexes.

To compare the stabilities of the G:C and A:T base pairs in the gas phase with the results obtained earlier by using the RI-MP2/AUG-cc-pVXZ (X=D,T,Q) and RI-MP2/CBS (with CCSD(T) correction) levels of theory33-36, single-point energy (SPE) calculations at the ωB97X-D/AUG-cc-pVDZ, MP2/6-311++G**, MP2/cc-pVDZ, MP2/AUG-cc-pVDZ, and CCD/6-31G** levels of theory were carried out on the geometries optimized at the ωB97X-D/6-31+G* level of theory. It should be mentioned that the exact MP2 calculations carried out here are more expensive and more accurate than the RI-MP2 calculations used earlier33-36. As the CCSD(T) method was too expensive for the systems studied here, it was not considered in the present study. The comparative analyses discussed in subsection 3.1 revealed that the results obtained here are fairly reliable. For this reason, SPE calculations on all the base pairs in the aqueous medium were performed at the above


levels of theory except CCD. However, owing to convergence issues, MP2 calculations could not be completed for the stacked dimers in the aqueous medium. For these dimers, SPE calculations were carried out by using only the ωB97X-D/AUG-cc-pVDZ level of theory. Zero-point energy corrections obtained at the ωB97X-D/6-31+G* level of theory were considered to be valid for the SPE calculations. The Gaussian 09 (G09) suite of programs37 was used for all the computations, and the structures of all the complexes were visualized by using the GaussView 5.0 program38.

3. Results and Discussions

3.1 Reliability of the results

Although accurate energetic data for different hydrogen-bonded base pair structures, including normal and abnormal base pairs, are available in the gas phase33-36 and aqueous medium39-42, computations of stacking interaction energies of natural and unnatural bases in different sequences in the aqueous medium are rare43-46. Fortunately, gas phase stacking interaction energies of the 10 standard dinucleotide steps, computed by varying the dinucleotide steps and helical parameters, are available31,32,35,43. These studies were mainly undertaken to rationalize the dependence of stacking energy on structural parameters such as Rise, Twist, Propeller Twist, Slide, and Roll in different sequences. Remarkably, ωB97X-D, which is computationally less expensive than CCSD(T), MP2, etc., produced results in accordance with the crystal data32. In another study47, which examined the distance dependence of hydrogen bonding interactions, the ωB97X-D method was found to provide reliable results compared to various DFT and DFT-D methods. These studies motivated us to use the ωB97X-D method for geometry optimizations of all the unnatural nucleobases, their base pairs, and stacked dimers.
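The energy definitions in Equations (1)-(3) of Section 2 can be sketched in a few lines of code. This is only a minimal illustration and not the authors' workflow: the function names are hypothetical, the inputs are assumed to be zero-point energy-corrected total energies already expressed in kcal/mol, and the numerical values are invented solely to exercise the functions.

```python
# Illustrative sketch of Equations (1)-(3).
# Assumption: every input is a zero-point-energy-corrected total energy
# that has already been converted to kcal/mol.


def pair_binding_energy(e_pair, e_a, e_b):
    """Equation (1): BE = E(A:B) - (E(A) + E(B))."""
    return e_pair - (e_a + e_b)


def dimer_binding_energy(e_dimer, e_a, e_b, e_c, e_d):
    """Equation (2): BE = E(A:B/C:D) - (E(A) + E(B) + E(C) + E(D))."""
    return e_dimer - (e_a + e_b + e_c + e_d)


def stacking_energy(e_dimer, e_pair_ab, e_pair_cd):
    """Equation (3): SE = E(A:B/C:D) - (E(A:B) + E(C:D))."""
    return e_dimer - (e_pair_ab + e_pair_cd)


# Hypothetical energies (kcal/mol); these are NOT values from the paper.
e_a, e_b, e_c, e_d = -100.0, -80.0, -90.0, -70.0
e_ab, e_cd = -195.0, -168.0
e_dimer = -380.0

be_ab = pair_binding_energy(e_ab, e_a, e_b)
be_cd = pair_binding_energy(e_cd, e_c, e_d)
be_dimer = dimer_binding_energy(e_dimer, e_a, e_b, e_c, e_d)
se = stacking_energy(e_dimer, e_ab, e_cd)

print(f"BE(A:B) = {be_ab}, BE(C:D) = {be_cd}")
print(f"BE(dimer) = {be_dimer}, SE = {se}")
```

With a single set of reference energies, the three definitions are linked: the dimer binding energy equals the stacking energy plus the two base pair binding energies.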


A comparison of the gas phase binding energies of the G:C and A:T base pairs with the corresponding results obtained by the Hobza group33-36 reveals that the ωB97X-D/AUG-cc-pVDZ method is capable of producing results similar to those obtained at the computationally expensive RI-MP2/AUG-cc-pVQZ and RI-MP2/CBS (D→T extrapolation) levels of theory (Table 1). Interestingly, the full MP2 and CCD results differ significantly from the RI-MP2 results and are similar to the GROMOS and OPLS force field results, respectively48 (Table 1). It is also found that the results obtained at the MP2/6-311++G** and MP2/AUG-cc-pVDZ levels are similar (Table 1). As no recent experimental study has quantified the stabilities of isolated G:C and A:T base pairs in the gas phase, it is not possible to establish the accuracy of the methods used here with certainty. The only mass spectrometric study available so far measured the enthalpies of the G:C and A:T pairs at high temperature49. A direct comparison of computed energies with these measured enthalpies is not suitable, because the experiment was performed at high temperature and the sample might have contained several isomers or tautomers of the G:C and A:T base pairs.

Table 1: Gas phase binding energies (kcal/mol) of G:C and A:T base pairs obtained by using different levels of theory.

Level of theory                             G:C      A:T
ωB97X-D/6-31+G*                            -27.8    -14.4
ωB97X-D/AUG-cc-pVDZ                        -27.7    -15.1
MP2/6-311++G**                             -19.2     -7.6
MP2/cc-pVDZ                                -21.9     -9.9
MP2/AUG-cc-pVDZ                            -18.8     -7.3
CCD/6-31G**                                -22.2     -9.6
RI-MP2/AUG-cc-pVDZ (a)                     -25.6    -13.8
RI-MP2/AUG-cc-pVTZ (a)                     -27.0    -14.7
RI-MP2/AUG-cc-pVQZ (a)                     -27.7    -15.1
RI-MP2/CBS (D→T) (a)                       -27.5    -15.0
RI-MP2/CBS with CCSD(T) correction (a)     -28.5    -15.4
AMBER (b)                                  -27.6    -12.9
CHARMM (b)                                 -23.5    -13.1
GROMOS (b)                                 -19.3     -8.7
OPLS (b)                                   -22.0     -9.8
Experiment (c)                             -21.0    -13.0

(a) Data taken from Ref34. (b) Data taken from Ref48. (c) Data taken from Ref49.
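One simple way to read Table 1 is to look at the G:C minus A:T energy gap at each level of theory rather than at the absolute numbers. The sketch below is a hedged illustration using a subset of the Table 1 values transcribed by hand; the dictionary layout and the function name are assumptions made for this example.

```python
# Gas-phase binding energies (kcal/mol) transcribed from Table 1.
# Each tuple is (G:C, A:T). Only a subset of rows is included.
table1 = {
    "wB97X-D/6-31+G*": (-27.8, -14.4),
    "wB97X-D/AUG-cc-pVDZ": (-27.7, -15.1),
    "MP2/6-311++G**": (-19.2, -7.6),
    "MP2/cc-pVDZ": (-21.9, -9.9),
    "MP2/AUG-cc-pVDZ": (-18.8, -7.3),
    "CCD/6-31G**": (-22.2, -9.6),
    "RI-MP2/CBS + CCSD(T)": (-28.5, -15.4),
}


def gc_at_gap(values):
    """Return BE(G:C) - BE(A:T); negative means G:C is more stable."""
    gc, at = values
    return gc - at


# Round to one decimal to match the precision of the tabulated data.
gaps = {method: round(gc_at_gap(v), 1) for method, v in table1.items()}

for method, gap in gaps.items():
    print(f"{method:24s} {gap:6.1f}")
```

For every level listed, the gap is negative, so G:C is predicted to be more stable than A:T regardless of how much the absolute binding energies differ between methods.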

Interestingly, the binding energies of the G:C (-6.03 kcal/mol) and A:T (-2.34 kcal/mol) base pairs obtained at the MP2/AUG-cc-pVDZ level of theory in aqueous medium are similar to those obtained with the AMBER and GROMOS force fields in solution (Table 2). The computed stability of the G:C base pair also agrees reasonably well with the NMR-derived enthalpy change of the G:C base pair (-5.8 kcal/mol) in dimethyl sulfoxide (DMSO)50. Although different methods produce different binding energies, the relative stabilities of the base pairs follow a similar pattern across all the methods used. As we intend to propose stable artificial nucleobases for experimental evaluation on the basis of relative stabilities, the results obtained here can be considered reasonable.

To understand the contributions of different interaction energies to the stacking energy, Sherrill and co-workers43 have recently used symmetry-adapted perturbation theory (SAPT) to decompose the stacking energy of different dinucleotide steps into various energetic components. SAPT was also used to understand the nature of π-π stacking interactions in benzene and substituted benzene molecules46. These studies have clearly demonstrated that London dispersion forces are the most important attractive component of the stacking energy in DNA, followed by attractive electrostatic interactions. As the nature of stacking interactions is largely understood, SAPT calculations were not performed here.

3.1 Structures of P, Z, and their analogues

As duplex DNA is stabilized by hydrogen bonding interactions between complementary bases and by stacking interactions between interstrand and intrastrand nucleotides, any design of new nucleobases should aim to strengthen these forces and to produce an undistorted DNA structure. Keeping this in mind, several artificial nucleotides were


synthesized by Benner and co-workers15-20,51. The first-generation AEGIS nucleotides were not replicated by high-fidelity DNA polymerases, which led to the development of second-generation AEGIS nucleotides by modifying the Hoogsteen face of the first-generation AEGIS nucleotides. Among these nucleotides, P and Z (obtained by replacing N5-CH3 by C5-NO2) were found to be promising. The unnatural base P is structurally somewhat similar to guanine (G) and contains both the purine and pyrimidine rings. The differences between these two bases are that (1) P does not contain H1 and (2) the C5 and N7 atoms of G are replaced by N5 and C7 atoms in P, respectively (Figure 1).

Figure 1: Structures of P and its various analogues (P1-P8). Here R denotes the glycosidic bond, which was replaced by H during geometry optimization. The atomic numbering scheme adopted for P and its analogues is also illustrated.

Although P does not contain any substituents, it is expected that the addition of suitable substituents to its pyrimidine ring may enhance its stacking interaction without disturbing its hydrogen bonding strength. Hunter and Sanders52 have shown that the addition of electron-withdrawing substituents (e.g. NO2) can enhance the π-stacking interaction in a stacked dimer of substituted and unsubstituted benzene by withdrawing π-electron density from the substituted benzene, thereby reducing its electrostatic repulsion with the unsubstituted benzene. Interestingly, Wheeler and Houk53 have shown that the addition of both electron-withdrawing and electron-donating groups (e.g. NH2) to one of the benzene


molecules in the benzene dimer can enhance its π-stacking interaction owing to (1) the direct electrostatic interaction between the substituents and the unsubstituted benzene and (2) the change due to the altered polarity of the phenyl-substituent σ-bond. Recently, it was shown that, in addition to altering electrostatic interactions, substituents can also attenuate dispersion interactions, which ultimately helps in enhancing the stacking interactions54,55. It has also been demonstrated that both electron-withdrawing and electron-donating groups can enhance π-stacking interactions by providing attractive electrostatic interactions that arise from the overlap of the electron densities of the two monomers, which allows the positive nuclei to interact with the diffuse electron densities (charge penetration effect)55. These substituent effects have also been observed in various experimental studies56,57.

For these reasons, various electron-withdrawing and electron-donating substituents that occur naturally in DNA, such as NH2 (found in A) and CH3 (found in T), were added to the pyrimidine ring of P, mainly by replacing the C7H7 group, to generate 8 analogues of P (P1-P8). As the presence of the unnatural NO2 group in Z helped the P:Z base pair to be recognized by DNA polymerases15-20,51, it was also added to P. In some of these analogues (P3, P4, and P8), the unusual C4-N4 single bond of P was replaced by the natural C4=C5 double bond of a purine. The detailed structures of the various analogues of P are illustrated in Figure 1. The same substituents were also added to the C5 and C6 sites of Z to generate its 5 analogues (Z1-Z5) (Figure 2). In all the analogues of Z, the C-glycosidic bond was replaced by the natural N-glycosidic bond of pyrimidines. The structures of Z and its analogues are depicted in Figure 2.


Figure 2: Structures of Z and its analogues (Z1-Z5). Here R denotes the glycosidic bond, which was replaced by H during geometry optimization. The atomic numbering scheme adopted for Z and its analogues is also illustrated.

3.2 Hydrogen bonding interactions between different base pairs

The base pair interactions between P and Z and between their analogues produced fifty-four complexes. The binding energies of these complexes obtained in the aqueous medium are presented in Table 2. The binding energies of the G:C and A:T pairs are also provided in this table for comparison. The corresponding gas phase binding energies are presented in Table S1 (Supporting Information). The optimized structures of some of the highly stable base pairs are illustrated in Figure 3. The remaining base pairs are shown in Figures S1-S9 (Supporting Information). From Table 2, it is clear that the P:Z base pair is about 0.4-0.7 kcal/mol more stable than the G:C base pair at the different levels of theory. Recently, in a DNA melting study58, the Gibbs free energy difference (∆∆G37) between the P:Z and G:C base pair complexes in the GCCAGTTAA sequence (G and C were replaced by P and Z, respectively) was found to be -0.85 kcal/mol. Hence the experimental result qualitatively matches the computed results58, particularly that obtained at the ωB97X-D/AUG-cc-pVDZ level of theory (Table 2). It was earlier presumed that the N1(P)--H3(Z) hydrogen bond of P:Z would be a low-barrier hydrogen bond and hence that the H3 proton of Z may migrate to the N1 site of P58. However, the optimized geometries of the G:C (Figure 3a) and P:Z (Figure 3b) complexes suggest that the H1(G)--N3(C) (~1.90 Å) and N1(P)--H3(Z) (~1.90 Å) hydrogen bonds would be of the same strength. It is further found that the barrier energy required for the transfer of the H3 proton from Z to P lies between 8 and 13 kcal/mol, which is too high to facilitate the proton transfer reaction. Hence, Z may not be deprotonated in DNA. Although there are no experimental data available to compare the stabilities of the P:Z and A:T complexes, the computed results presented in Table 2 show that the former complex would be about 4 - 6 kcal/mol


more stable than the latter complex.
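The relative stabilities quoted above can be reproduced directly from the first three rows of Table 2. The following sketch is an illustration under stated assumptions: the values are transcribed by hand from Table 2, and the names `be` and `relative_stability` are hypothetical helpers introduced only for this example.

```python
# Aqueous-phase binding energies (kcal/mol) transcribed from Table 2.
# Column order follows the table's five levels of theory.
levels = [
    "wB97X-D/6-31+G*",
    "wB97X-D/AUG-cc-pVDZ",
    "MP2/6-311++G**",
    "MP2/cc-pVDZ",
    "MP2/AUG-cc-pVDZ",
]
be = {
    "G:C": [-14.91, -15.80, -6.27, -10.28, -6.03],
    "A:T": [-9.30, -10.16, -2.53, -5.53, -2.34],
    "P:Z": [-15.47, -16.45, -6.63, -10.74, -6.44],
}


def relative_stability(pair, reference):
    """BE(pair) - BE(reference) at each level; negative means 'pair' is more stable."""
    return [round(p - r, 2) for p, r in zip(be[pair], be[reference])]


pz_vs_gc = relative_stability("P:Z", "G:C")
pz_vs_at = relative_stability("P:Z", "A:T")

for level, d_gc, d_at in zip(levels, pz_vs_gc, pz_vs_at):
    print(f"{level:22s} vs G:C {d_gc:6.2f}   vs A:T {d_at:6.2f}")
```

The differences relative to G:C fall in a narrow band of a few tenths of a kcal/mol, whereas those relative to A:T are several kcal/mol, which is the pattern described in the text.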

Figure 3: Optimized structures of some important base pair complexes obtained in the aqueous medium. The hydrogen bond distances (Å) are shown by dotted lines.

Among the different analogues of P and Z, the interactions of P3 and P4 with Z3 (Figure 3c,d) produced the most stable planar base pair complexes, which are about 2 - 3 kcal/mol more stable than the P:Z complex (Table 2). Similarly, among the most stable non-planar complexes formed, the interactions of P3 and P4 with Z5 produced complexes (Figure 3d,e) that are about 13-15 kcal/mol more stable than the P:Z complex (Table 2). The highest stability of P3:Z5 and P4:Z5 can be attributed to the intramolecular hydrogen bonds in P4 and Z5 and to favorable dipole-dipole interactions. Based on these results, it can be proposed that the replacements of


P by P3 and P4 and Z by Z3 and Z5 would enhance base pair interactions between P and Z appreciably.

Table 2: Binding energies (kcal/mol) of different natural and unnatural base pair complexes obtained at different levels of theory in the aqueous medium.

Base pair   ωB97X-D/6-31+G*   ωB97X-D/AUG-cc-pVDZ   MP2/6-311++G**   MP2/cc-pVDZ   MP2/AUG-cc-pVDZ
G:C             -14.91              -15.80               -6.27          -10.28          -6.03
A:T              -9.30              -10.16               -2.53           -5.53          -2.34
P:Z             -15.47              -16.45               -6.63          -10.74          -6.44
P1:Z            -15.55              -16.40               -6.70          -10.59          -6.34
P2:Z            -15.20              -16.17               -6.60          -10.61          -6.46
P3:Z            -17.44              -18.58               -8.18          -12.44          -7.94
P4:Z            -17.39              -18.48               -7.95          -12.11          -7.70
P5:Z            -15.11              -15.99               -6.47          -10.19          -6.20
P6:Z            -14.20              -15.11               -5.99           -9.48          -5.65
P7:Z            -14.69              -15.06               -6.17          -10.21          -5.65
P8:Z            -16.67              -17.72               -7.60          -11.66          -7.37
P:Z1            -14.80              -15.85               -6.13           -9.99          -5.98
P1:Z1           -14.89              -15.16               -5.98           -9.75          -5.47
P2:Z1           -14.53              -15.58               -6.10           -9.86          -6.00
P3:Z1           -16.22              -16.45               -6.96          -11.09          -6.44
P4:Z1           -16.48              -17.61               -7.44          -11.31          -7.20
P5:Z1           -14.34              -15.31               -6.01           -9.50          -5.77
P6:Z1           -13.74              -15.22               -5.78           -8.89          -5.61
P7:Z1           -14.06              -14.94               -5.73           -9.12          -5.35
P8:Z1           -15.61              -16.75               -6.92          -10.74          -6.70
P:Z2            -14.34              -15.24               -5.79           -9.79          -5.58
P1:Z2           -14.39              -14.52               -5.60           -9.48          -5.04
P2:Z2           -14.02              -14.92               -5.70           -9.61          -5.54
P3:Z2           -15.65              -15.68               -6.48          -10.76          -5.88
P4:Z2           -15.53              -16.80               -6.77          -10.59          -6.64
P5:Z2           -13.88              -14.73                5.56           -9.20          -5.28
P6:Z2           -13.24              -13.17               -4.55           -7.93          -4.14
P7:Z2           -13.65              -12.70               -4.49           -8.39          -3.86
P8:Z2           -15.31              -16.26               -6.54          -10.54          -6.31
P:Z3            -16.37              -17.28               -7.47          -11.45          -7.24
P1:Z3           -16.36              -17.14               -7.43          -11.20          -7.03
P2:Z3           -16.05              -16.96               -7.49          -11.37          -7.32
P3:Z3           -18.37              -19.44               -9.10          -13.20          -8.80
P4:Z3           -18.23              -19.24               -8.80          -12.79          -8.48
P5:Z3           -15.73              -16.55               -7.13          -10.70          -6.81
P6:Z3           -14.71              -15.99               -6.55           -9.72          -6.28
P7:Z3           -15.38              -16.07               -6.77          -10.25          -6.32
P8:Z3           -17.27              -18.26               -8.32          -12.18          -7.99
P:Z4            -16.05              -16.81               -7.28          -11.08          -6.89
P1:Z4           -16.13              -16.78               -7.31          -10.90          -6.76
P2:Z4           -15.68              -16.45               -7.25          -10.95          -6.91
P3:Z4           -17.37              -18.30               -8.20          -12.14          -7.76
P4:Z4           -17.63              -18.49               -8.40          -12.22          -7.93
P5:Z4           -15.30              -14.85               -5.76           -9.14          -5.32
P6:Z4           -14.22              -15.37               -6.28           -9.28           5.89
P7:Z4           -14.89              -24.73              -15.71          -19.05         -15.12
P8:Z4           -16.56              -17.41               -7.76          -11.47          -6.10
P:Z5            -26.62              -27.37              -20.52          -24.38         -20.60
P1:Z5           -26.69              -27.32              -20.59          -24.26         -20.51
P2:Z5           -26.39              -27.14              -20.59          -24.36         -20.72
P3:Z5           -28.17              -29.07              -21.69          -25.65         -21.70
P4:Z5           -28.18              -29.03              -21.61          -25.46         -21.59
P5:Z5           -26.00              -26.66              -20.15          -23.62         -20.14
P6:Z5           -25.17              -26.31              -19.79          -22.86         -19.83
P7:Z5           -26.00              -26.29              -19.95          -23.29         -19.76
P8:Z5           -27.46              -28.29              -21.24          -25.05         -21.28
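Table 2 covers every combination of nine P-type bases (P, P1-P8) with six Z-type bases (Z, Z1-Z5), which gives the fifty-four complexes mentioned in the text. The sketch below enumerates those pairs in table order, attaches the ωB97X-D/6-31+G* column (transcribed by hand from Table 2), and ranks them. The variable names are illustrative, and filtering out the Z5 complexes is simply a convenience for this example, not a planarity classification taken from the paper.

```python
# Enumerate the 9 x 6 = 54 unnatural base pairs in the order used by Table 2.
p_bases = ["P"] + [f"P{i}" for i in range(1, 9)]  # P, P1..P8
z_bases = ["Z"] + [f"Z{i}" for i in range(1, 6)]  # Z, Z1..Z5

# wB97X-D/6-31+G* binding energies (kcal/mol), one list per Z-type base,
# ordered P, P1, ..., P8 as in Table 2.
column = {
    "Z":  [-15.47, -15.55, -15.20, -17.44, -17.39, -15.11, -14.20, -14.69, -16.67],
    "Z1": [-14.80, -14.89, -14.53, -16.22, -16.48, -14.34, -13.74, -14.06, -15.61],
    "Z2": [-14.34, -14.39, -14.02, -15.65, -15.53, -13.88, -13.24, -13.65, -15.31],
    "Z3": [-16.37, -16.36, -16.05, -18.37, -18.23, -15.73, -14.71, -15.38, -17.27],
    "Z4": [-16.05, -16.13, -15.68, -17.37, -17.63, -15.30, -14.22, -14.89, -16.56],
    "Z5": [-26.62, -26.69, -26.39, -28.17, -28.18, -26.00, -25.17, -26.00, -27.46],
}

binding = {
    f"{p}:{z}": energy
    for z in z_bases
    for p, energy in zip(p_bases, column[z])
}

# Most negative binding energy first (most stable first).
ranked = sorted(binding, key=binding.get)
ranked_without_z5 = [pair for pair in ranked if not pair.endswith(":Z5")]

print(len(binding), "pairs")
print("Most stable overall:", ranked[:2])
print("Most stable excluding Z5:", ranked_without_z5[:2])
```

At this level of theory the two lowest entries are the P4:Z5 and P3:Z5 complexes, and once the Z5 complexes are set aside the lowest are P3:Z3 and P4:Z3, matching the pairs singled out in the discussion.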

3.3 Stacking interactions and stabilities of different stacked dimers

As the interactions of P3 and P4 with Z5 produced the most stable, non-planar base pair complexes, it is essential to establish whether these complexes can generate stable and unperturbed duplex DNA. It is also necessary to understand the behaviour of these base pairs in different sequences. To this end, the P3:Z5 and P4:Z5 complexes were stacked onto the G:C and A:T base pairs. To compare the structural and energetic details of these dimers with those of the G:C/G:C, G:C/A:T, P:Z/G:C, P:Z/A:T, P3:Z3/G:C, P3:Z3/A:T, P4:Z3/G:C, and P4:Z3/A:T planar dimers, the geometries of all these dimers were optimized in the aqueous medium. The optimized geometries of the various base pair complexes stacked onto the G:C and A:T base pair complexes are depicted in Figures 4 and 5, respectively. The binding and stacking energies of these dimers are presented in Table 3. From this table it is evident that the stabilities of these dimers follow the order G:C/G:C < P:Z/G:C < P3:Z3/G:C < P4:Z3/G:C < P3:Z5/G:C < P4:Z5/G:C. The highest stability of the P4:Z5/G:C dimer is linked to the stable hydrogen bonding interactions between P4 and Z5 (Table 2).


Figure 4: Optimized structures of various stacked base pair dimers involving G:C base pair obtained in the aqueous medium. The hydrogen bond distances are shown in Å.

As can be seen from Figure 4, the stacking patterns of P3:Z3/G:C and P4:Z3/G:C are similar to those of the G:C/G:C and P:Z/G:C dimers. However, the N7-NH2 group of P4 can make an intrastrand hydrogen bond with the C4-NH2 group of C (Figure 4d). For this reason, the stacking energy of the P4:Z3/G:C dimer is found to be more negative than those of the G:C/G:C, P:Z/G:C, and P3:Z3/G:C dimers. It is further found that the bulky N6-NH2 group of Z5 remains in the intercalation mode and pushes G away in the Y-direction (the Y-axis is defined along the C8-C6 direction)59. This movement is evident from Figure 4f, where the N7-NH2 group of P4 is hydrogen bonded with the O6 of G of the other strand (instead of N4 of C of the same strand, as found in P4:Z3/G:C, Figure 4d). However, this interstrand hydrogen bond is lost in the P3:Z5/G:C dimer owing to both the sliding and rotation of G

ACS Paragon Plus Environment

The Journal of Physical Chemistry

16 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

(Figure 4e). This dynamics however, helps Z5 to stack onto the pyrimidine ring of G. In spite of this, the loss of interstrand hydrogen bond slightly diminishes its stacking energy (Table 3). As both the P4:Z3/G:C and

P4:Z5/G:C dimers experience intermolecular hydrogen

bonds, their stacking energies are found to be analogous (Table 3). Table 3: Binding energies (kcal/mol) and stacking energies (kcal/mol) of different natural and unnatural base pair dimers obtained in the aqueous medium.

| Dimers | Binding energy, ωB97X-D/6-31+G* | Binding energy, ωB97X-D/AUG-cc-pVDZ | Stacking energy, ωB97X-D/6-31+G* | Stacking energy, ωB97X-D/AUG-cc-pVDZ |
|---|---|---|---|---|
| G:C/G:C | -47.09 | -49.16 | -17.27 (-17.3)a [-15.69]b | -17.56 {-13.75}c1 \|-15.22\|c2 |
| P:Z/G:C | -48.39 | -50.30 | -18.00 | -18.05 |
| P3:Z3/G:C | -52.75 | -54.29 | -19.47 | -19.04 |
| P4:Z3/G:C | -53.96 | -55.28 | -20.83 | -20.24 |
| P3:Z5/G:C | -63.26 | -64.40 | -20.18 | -19.53 |
| P4:Z5/G:C | -64.09 | -65.34 | -20.99 | -20.51 |
| G:C/A:T | -40.24 | -42.51 | -16.53 | -16.55 |
| A:T/A:T | -35.13 | -37.65 | -16.53 (-12.80)a [-11.92]b | -17.33 {-11.84}c1 \|-14.84\|c2 |
| P:Z/A:T | -42.38 | -44.62 | -17.61 | -18.01 |
| P3:Z3/A:T | -46.27 | -48.24 | -18.59 | -18.64 |
| P4:Z3/A:T | -48.55 | -50.51 | -21.03 | -21.11 |
| P3:Z5/A:T | -55.69 | -57.54 | -18.22 | -18.31 |
| P4:Z5/A:T | -58.21 | -60.06 | -20.72 | -20.87 |

a Data from Ref31 obtained by using MP2/AUG-cc-pVDZ/AUG-cc-pVTZ extrapolations plus ∆CCSD(T)/6-31G*(0.25) corrections. The sum of pairwise interactions was corrected by a four-body MP2/AUG-cc-pVDZ correction.

b Data from Ref43 obtained by using the SAPT0/jaDZ level of theory.

c1 Data from Ref35 obtained by using the RI-DFT-D/TPSS/TZVP level of theory.

c2 Data from Ref35 obtained by using the CBS(T) method. The CBS(T) energy was calculated as the sum of CBS(T) inter- and intrastrand stacking contributions and a many-body term evaluated at the MP2/AUG-cc-pVDZ level of theory.
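The stability order quoted for the G:C-stacked dimers can be checked directly against the binding energies of Table 3. The following is a minimal sketch, not part of the original calculations: the dictionary layout and the helper name `stability_order` are illustrative assumptions, and the G:C/G:C binding energies are taken as -47.09 and -49.16 kcal/mol, as read from the table.

```python
# Binding energies (kcal/mol) from Table 3 for dimers stacked onto G:C,
# stored as (wB97X-D/6-31+G*, wB97X-D/AUG-cc-pVDZ).
# The data layout and function name are illustrative assumptions.
binding = {
    "G:C/G:C": (-47.09, -49.16),
    "P:Z/G:C": (-48.39, -50.30),
    "P3:Z3/G:C": (-52.75, -54.29),
    "P4:Z3/G:C": (-53.96, -55.28),
    "P3:Z5/G:C": (-63.26, -64.40),
    "P4:Z5/G:C": (-64.09, -65.34),
}


def stability_order(energies, basis_index):
    """Return dimer names from least stable to most stable.

    A more negative binding energy means a more stable dimer, so the
    names are sorted from the least negative to the most negative value.
    """
    return sorted(energies, key=lambda name: energies[name][basis_index],
                  reverse=True)


order_small_basis = stability_order(binding, 0)
order_large_basis = stability_order(binding, 1)
print(" < ".join(order_small_basis))
```

Both basis sets give the same ranking, which suggests that the ordering does not hinge on the choice between 6-31+G* and AUG-cc-pVDZ.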

From Figure 5, it is clear that the base stacking patterns of the G:C, P:Z, P3:Z3, and P4:Z3 base pairs stacked onto the A:T base pair are similar. However, the binding energy of the P4:Z3/A:T dimer is more negative than those of the G:C/A:T, P:Z/A:T, and P3:Z3/A:T dimers (Table 3). This is because of the tight base pair interactions formed between P4 and Z3. Similarly, the stacking patterns of P3:Z5/A:T and P4:Z5/A:T are analogous. Interestingly, unlike in the P4:Z5/G:C dimer, the bulky N6-NH2 group of Z5 in the P4:Z5/A:T dimer does not push A away from the strand, and the N7-NH2 group of P4 makes an intrastrand hydrogen bond with O4 of T (Figure 5f), as in the P4:Z3/A:T dimer (Figure 5d). As a result, the stacking energies of these two dimers become nearly identical (Table 3).

Figure 5: Optimized structures of various stacked base pair dimers involving A:T base pair obtained in the aqueous medium. The hydrogen bond distances are shown in Å.
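The stacking energies listed in Table 3 for the G:C/G:C and A:T/A:T dimers allow the gap between the two to be recomputed for each method. The sketch below is only a consistency check on the tabulated numbers; the dictionary keys are shorthand labels chosen here for illustration, not names used in the cited works.

```python
# Stacking energies (kcal/mol) from Table 3 as (G:C/G:C, A:T/A:T).
# The method labels are illustrative shorthand for the table footnotes.
stacking = {
    "wB97X-D/6-31+G* (aqueous)": (-17.27, -16.53),
    "wB97X-D/AUG-cc-pVDZ (aqueous)": (-17.56, -17.33),
    "CCSD(T)-corrected, footnote a": (-17.3, -12.80),
    "SAPT0/jaDZ, footnote b": (-15.69, -11.92),
    "RI-DFT-D/TPSS/TZVP, footnote c1": (-13.75, -11.84),
    "CBS(T), footnote c2": (-15.22, -14.84),
}

# Gap = E(A:T/A:T) - E(G:C/G:C); a positive value means that G:C/G:C
# is the more strongly stacked dimer.
gaps = {method: round(at - gc, 2) for method, (gc, at) in stacking.items()}

for method, gap in gaps.items():
    print(f"{method}: {gap:.2f} kcal/mol")
```

With these inputs the two aqueous-phase ωB97X-D gaps come out at 0.74 and 0.23 kcal/mol, whereas the tabulated SAPT0 and CCSD(T)-corrected values give gaps of several kcal/mol.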

Notably, the difference in stacking energies between the G:C/G:C and A:T/A:T dimers in the aqueous medium is found to lie between 0.23 and 0.74 kcal/mol (Table 3). This difference, calculated at the RI-DFT-D/TPSS/TZVP level of theory in the gas phase by considering different geometries obtained from a 20 ns molecular dynamics simulation, was found to lie between 1.7 and 2.6 kcal/mol35. However, the use of SAPT43 and CCSD(T)31 methods in the gas phase increased this energy gap significantly. Interestingly, when the stacking energy was calculated by considering the energy components involved in the intra- and interstrand stacking with the CBS(T) method and by including a many-body term evaluated at the MP2/AUG-cc-pVDZ level of theory, the stacking energy difference between these two dimers was found to be 0.4 kcal/mol. The A:T/A:T dimer was even found to be slightly more stable than the G:C/G:C dimer at the RI-DFT-D/TPSS/TZVP and RI-DFT-D/TPSS/6-311++G(3df,3pd) levels of theory35. This implies that the effects of dispersion and the aqueous medium may reduce the stacking energy difference between the G:C/G:C and A:T/A:T dimers.

3.3 Effects of consecutive base pairs involving P, Z, and their analogues on DNA

It was earlier suggested that the occurrence of consecutive P:Z base pairs provides greater stability to DNA than a single P:Z base pair17,58. To verify this and to examine the effects of consecutive P3:Z3, P4:Z3, P3:Z5, and P4:Z5 base pairs in DNA, the P:Z/P:Z, P3:Z3/P3:Z3, P4:Z3/P4:Z3, P3:Z5/P3:Z5, and P4:Z5/P4:Z5 dimers were optimized in the aqueous medium. The optimized structures of these dimers are illustrated in Figure 6. The binding and stacking energies of these dimers are presented in Table 4. As evident from this table, the consecutive stacking of P:Z base pairs enhanced the binding energy by about 2 kcal/mol with respect to the P:Z/G:C dimer and by about 8 kcal/mol with respect to the P:Z/A:T dimer, in agreement with the experimental finding17. Similarly, the stacking energy of the P:Z/P:Z dimer is found to be more negative than those of the P:Z/G:C and P:Z/A:T dimers (Tables 3 and 4). Interestingly, the P4:Z3/P4:Z3, P3:Z5/P3:Z5, and P4:Z5/P4:Z5 dimers are found to be more stable than the G:C/G:C and P:Z/P:Z dimers, and their stabilities follow the order G:C/G:C < P:Z/P:Z < P3:Z3/P3:Z3 < P4:Z3/P4:Z3 < P3:Z5/P3:Z5 < P4:Z5/P4:Z5 (Table 4). The highest stability of the P4:Z5/P4:Z5 dimer can be ascribed to the strong base pair interactions between P4 and Z5 (Table 2). However, the stacking interaction energies of these dimers follow the order P3:Z5/P3:Z5 < P4:Z5/P4:Z5 < G:C/G:C < P3:Z3/P3:Z3 < P4:Z3/P4:Z3