ARTICLE pubs.acs.org/JPCC

Nucleic Acid G-quartets: Insights into Diverse Patterns and Optical Properties
A.K. Jissy, U.P.M. Ashik, and Ayan Datta*
School of Chemistry, Indian Institute of Science Education and Research Thiruvananthapuram, CET Campus, Thiruvananthapuram-695016, Kerala (India)

ABSTRACT: Structures of various conformers of G-quartets (G4) with different types of hydrogen-bonding patterns have been investigated at various levels of density functional theory (DFT). Their structures and stabilities have been compared with those of the diad (G2), triad (G3), pentad (G5), and hexad (G6) of guanine. The calculations show that G4 has the highest stabilization through hydrogen-bond interactions, which explains the tendency of guanine-rich strands in the telomeric region to favor formation of the quadruplex structure. We have also performed calculations for the Li⁺, Na⁺, K⁺, Be²⁺, Mg²⁺, and Ca²⁺ complexes of G4. The calculations show that, for an isolated quartet, the metal ion with the smallest ionic radius in each group (IA and IIA) forms the most stable complex. Other properties such as the HOMO-LUMO gap and the polarizability have also been analyzed. The variation in the polarizability has been studied with respect to the movement of cations along the central cavity of the quartet. Such movement leads to a large anisotropy of polarization and hence of the refractive index (η), thereby creating optical birefringence, which has potential applications in biomolecular imaging.
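The anisotropy referred to above is conventionally quantified from the components of the polarizability tensor. As a point of reference, and assuming the standard textbook definitions (the symbols $\bar{\alpha}$, $\Delta\alpha$, $\eta_{\parallel}$, and $\eta_{\perp}$ are introduced here for illustration and may differ from the notation used in the body of the paper), the mean polarizability, its anisotropy, and the birefringence can be written as:

```latex
\bar{\alpha} = \tfrac{1}{3}\left(\alpha_{xx} + \alpha_{yy} + \alpha_{zz}\right)

\Delta\alpha = \left\{ \tfrac{1}{2}\left[ (\alpha_{xx}-\alpha_{yy})^{2}
             + (\alpha_{yy}-\alpha_{zz})^{2}
             + (\alpha_{zz}-\alpha_{xx})^{2}
             + 6\left(\alpha_{xy}^{2} + \alpha_{yz}^{2} + \alpha_{zx}^{2}\right) \right] \right\}^{1/2}

\Delta\eta = \eta_{\parallel} - \eta_{\perp}
```

Here $\parallel$ and $\perp$ denote directions parallel and perpendicular to the optical axis; taking that axis along the central cavity of the quartet is an assumption made for this sketch. For an isotropic response $\Delta\alpha$ vanishes, so a nonzero $\Delta\alpha$ is what allows the refractive index to differ between the two directions.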

Received: March 14, 2011. Revised: April 19, 2011. Published: May 02, 2011.
© 2011 American Chemical Society
dx.doi.org/10.1021/jp202401b | J. Phys. Chem. C 2011, 115, 12530–12546

INTRODUCTION
Any eukaryotic chromosome must contain replication origins, centromeres, and telomeres in order to replicate and segregate correctly. Telomeres, the ends of eukaryotic chromosomes, are composed of tandemly repeated DNA sequences with a G-rich strand oriented 5′ → 3′ toward the end of the chromosome.¹ Specific proteins bound to the telomeric region protect the chromosomal ends from exonuclease attack, fusion, and degradation.² The telomeric region also helps in proper replication of the DNA. Each cell division shortens the telomeres by an average of 50–100 bases.³ After multiple cell divisions, the cell's telomeres therefore become too short to protect the chromosomal ends, leading to senescence or even cell destruction by apoptosis. However, telomere shortening is prevented by a protein-RNA enzyme called telomere terminal transferase, or telomerase.⁴ The human gene expressing the telomerase protein and the telomerase-associated RNA are active in germ-line cells, embryonic cells, and stem cells, but are not activated in most human somatic cells. Telomerase is expressed in over 80–85% of rapidly multiplying cancer cells, where it maintains the telomeric length and so permits the multiple cell divisions that lead to the tumor;⁵⁻⁷ inhibiting it has therefore garnered considerable attention as an anticancer approach.

Human telomeric DNA consists of repeats of the TTAGGG sequence.⁸ The single G-rich strand, which protrudes beyond the double-stranded DNA helix, can form a four-stranded structure known as a quadruplex,⁹ consisting of parallel arrays of guanine held together by Hoogsteen base pairs.¹⁰ The folding and the stability of the quadruplex structures depend upon the presence of specific metal cations.¹¹⁻¹⁹ Stabilizing the telomeric ends as G-quadruplex structures with specific small molecules, which disrupts the maintenance of telomere length in cancer cells, is used in therapeutic strategies.²⁰⁻²² Thus, the G-quadruplex is currently considered a potential target for cancer and antiaging therapy.²³,²⁴ In addition, several proteins specifically bind to G-quadruplexes or perform selective enzymatic activity on G-quadruplex substrates.²⁵⁻²⁸ The G-quartet (Figure 1) can stabilize various intermolecular or intramolecular quadruplex structures.²⁹ There are more than 300,000 estimated sites that can potentially form G4-DNA in the human genome.³⁰ They occur in repetitive genomic sequences in the telomeres as well as in the rDNA, in immunoglobulin heavy-chain switch regions,³¹ and in the control regions of proto-oncogenes.³²

Figure 1. Standard G-quartet with reference numbering.

The four-stranded G-quadruplex also seems to be ideal for ordering material over nanometer-to-micrometer distances.¹¹ Self-assembly of guanine is used as a powerful way to arrange molecules into specific patterns and to construct nanoscale devices.³³ G-quartets are useful building blocks for molecular G-wires,³⁴⁻³⁷ frayed wires,³⁸,³⁹ and guanine-mediated synapsis,⁴⁰ which could be useful in nanoelectronics,⁴¹ nanotechnology, and biosensor development. G-quadruplex aptamers, apart from having potential as therapeutics and diagnostics, have applications in bioanalytical chemistry⁴² and have also served as a prototype for biosensors.⁴³

G-quadruplexes have been investigated experimentally by X-ray crystallography,⁴⁴,⁴⁵ NMR,⁴⁵⁻⁴⁸ gel electrophoresis,⁸ and scanning tunneling microscopy (STM) on surfaces.⁴⁹,⁵⁰ Techniques such as mass spectrometry⁵¹ and circular dichroism (CD) spectroscopy⁵² are also widely used to characterize the structures of DNA G-quadruplexes.⁵³,⁵⁴ Theoretical studies of the G-quadruplex structure have previously been carried out using quantum mechanics (QM)¹¹,⁵⁵⁻⁶¹ and molecular mechanics (MM).⁶² B3LYP and BLYP calculations have led to similar results for the G-quartets, and both methods have been shown to be suitable for bioorganic and bioinorganic systems with H-bonds.⁶³ Molecular modeling⁶⁴ and molecular dynamics simulations⁶⁵⁻⁶⁸ have been extensively used to study the 3D structures,⁶⁹⁻⁷² folding pathways,⁷³ and different properties⁷⁴,⁷⁵ of quadruplexes, as well as the binding of various ligands to these oligomers.⁷⁶,⁷⁷

We have carried out high-level DFT calculations (with and without dispersion corrections) to study the stability of free-standing G-quartets with and without metal ions, and the role of metal ions in inducing structural diversity and changes in the H-bonding pattern of the G-quartet. These structures are further investigated for potential applications that rely on advanced optoelectronic properties such as optical birefringence.⁷⁸
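The stability of a hydrogen-bonded guanine cluster, with or without a metal ion, is commonly expressed as the energy of the cluster relative to its separated fragments. The following is a minimal sketch of that bookkeeping, not the authors' procedure: the function names are hypothetical, the total energies in the example are illustrative placeholders rather than values from this work, and no basis-set superposition error (counterpoise) correction is applied.

```python
# Sketch only: function names and all numerical energies are hypothetical
# placeholders, not values from this work.

HARTREE_TO_KCAL_PER_MOL = 627.509  # standard unit conversion


def stabilization_energy(e_cluster, e_guanine, n_guanines, e_ion=0.0):
    """E(cluster) - n*E(guanine) - E(ion), converted from hartree to kcal/mol.

    A negative value means the cluster is more stable than its separated
    fragments. No counterpoise (BSSE) correction is applied here.
    """
    delta_hartree = e_cluster - n_guanines * e_guanine - e_ion
    return delta_hartree * HARTREE_TO_KCAL_PER_MOL


def stabilization_per_guanine(e_cluster, e_guanine, n_guanines, e_ion=0.0):
    """Normalize by cluster size so that G2 ... G6 can be compared directly."""
    return stabilization_energy(e_cluster, e_guanine, n_guanines, e_ion) / n_guanines


if __name__ == "__main__":
    # Placeholder total energies in hartree (illustrative, not computed).
    e_g = -542.0
    e_g4 = -2168.1
    print(f"G4 total:       {stabilization_energy(e_g4, e_g, 4):8.2f} kcal/mol")
    print(f"G4 per guanine: {stabilization_per_guanine(e_g4, e_g, 4):8.2f} kcal/mol")
```

Dividing by the number of guanines is one possible normalization; dividing by the number of hydrogen bonds would be an equally reasonable choice when comparing clusters with different bonding patterns.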

COMPUTATIONAL METHODS
The initial structures of the diad and the quartet were generated from coordinates obtained from the Protein Data Bank (PDB):⁷⁹ 1D80⁸⁰ (G-diad), 2KBP⁸¹ (G-quartet with no bifurcated bonds), and 1LVS⁸² (G-quartet with two bifurcated bond pairs). The PDB entries contain the crystal structures of the oligonucleotides from which we extracted the base pair and the quartet. The sugar-phosphate backbones and the other base pairs were removed, and the valencies left unsatisfied by their removal were terminated with hydrogen atoms. Coordinates of the triad, pentad, and hexad were generated based on the purine triad (2AFP),⁸³ pentad (1JJP),⁸⁴ and hexad (1EEG)⁸⁵ found in the PDB. The structure of the cyclic pentad, which is non-native, was generated by suitably placing five guanines together. Several conformers were generated, and the minimum-energy conformer is reported. All structures were optimized by density functional theory (DFT).⁸⁶ The DFT calculations were carried out at the nonlocal hybrid three-parameter Lee-Yang-Parr (B3LYP)⁸⁷,⁸⁸ level of theory with the 6-31G(d)⁸⁹⁻⁹² basis set as implemented in the Gaussian 09⁹³ program. Meng et al.⁸⁹ studied the structure and stability of G4 complexes at the B3LYP level using the 6-31G(d) and 6-31G(d,p) basis sets. They obtained similar interaction energies for the G4 complex, differing by only 0.3 kcal/mol, which confirmed that adding polarization functions on the hydrogen atoms changes the interaction energy of these systems only negligibly and that the B3LYP/6-31G(d) method is suitable for studying G4 complexes. Calculations were also done using the M05-2X⁹⁴ and M06-2X⁹⁵ functionals with the 6-31+G(d,p)⁹⁶,⁹⁷ basis set. The M05-2X and M06-2X functionals effectively incorporate long-range dispersion forces.⁹⁷ These functionals have been shown to give reliable results for DNA base pairs, comparable to those obtained from MP2 calculations.⁹⁷ RI-DFT-D⁹⁸ calculations at the BLYP/TZVPP⁹⁹ level were also performed with TURBOMOLE 6.0¹⁰⁰ to account for dispersion interactions. Meyer et al. have shown that the B3LYP and BLYP methods lead to similar results for the guanine and uracil quartets.⁶³ The geometrical parameters, total energies, and low frequencies are reported in the Supporting Information (SI). The standard convergence criterion was applied for the optimizations: maximum residue force on atoms