2198
J. Phys. Chem. B 2008, 112, 2198-2206
Cylindrical Projection of Electrostatic Potential and Image Analysis Tools for Damaged DNA: The Substitution of Thymine with Thymine Glycol Maciej Haranczyk,*,†,‡,§ Giovanni Lupica,| Iwona Da¸ bkowska,†,‡,# and Maciej Gutowski*,†,‡,+ Department of Chemistry, UniVersity of Gdan´ sk, Sobieskiego 18, 80-952 Gdan´ sk, Poland, Chemical Sciences DiVision, Pacific Northwest National Laboratory, Richland, Washington 99352, Department of Information Studies, UniVersity of Sheffield, Sheffield S1 4DP, United Kingdom, Department of Electronic and Electrical Engineering, UniVeristy of Sheffield, S1 3JD, United Kingdom, Institut fu¨r Chemie und BiochemiesPhysikalische und Theoretische Chemie, Freie UniVersita¨t Berlin, Takustrasse 3, D-14195, Berlin, Germany, and Chemistry-School of Engineering and Physical Sciences, Heriot-Watt UniVersity, Edinburgh EH14 4AS, United Kingdom ReceiVed: October 5, 2007; In Final Form: NoVember 21, 2007
Changes of electrostatic potential around the DNA molecule resulting from chemical modifications of nucleotides may play a role in enzymatic recognition of damaged sites. Effects of chemical modifications of nucleotides on the structure of DNA have been characterized through electronic structure computations. Quantum mechanical structural optimizations of fragments of five pairs of nucleotides with thymine or thymine glycol were performed at the density functional level of theory with a B3LYP exchange-correlation functional and 6-31G(d,p) basis sets. The electrostatic potential (EP) around DNA fragments was projected on a cylindrical surface around the double helix. The 2D maps of EP of intact and damaged DNA fragments were compared using image analysis methods to identify and measure modifications of the EP that result from the occurrence of thymine glycol. It was found that distortions of phosphate groups and displacements of the accompanying countercations by up to ∼0.5 Å along the axis of DNA are clearly reflected in the EP maps. Modifications of the EP in the major groove of DNA near the damaged site are also reported.
1. Introduction DNA in living cells is continuously exposed to a number of environmental agents, which include UV light and ionizing radiation as well as various chemical species. In many cases, the mechanism of action of these agents involves generation of free radicals that attack DNA and produce a variety of lesions, including sugar and base modifications, strand breaks, and DNA-protein cross-links.1 Some reactive oxygen species (ROS), e.g., O2-, ‚OH, are also generated endogenously during cellular aerobic metabolism, and the damage they cause may be an important factor in aging and age-dependent diseases, including cancer.2-5 In other cases, oxygen radicals may be introduced exogenously to the living cells, e.g., with tobacco smoke. They have been recognized as a major etiological factor for cancers of the upper aerodigestive tract.6 The reaction of a thymine base (Figure 1) with ROS leads to the formation of thymine glycol (5,6-dihydro-5,6-dihydroxythymine or Tg; Figure 2), which is a major type of oxidative DNA damage.7 Although this damage causes mutations at a low frequency of appearance,8 it effectively blocks DNA replication9 and must be repaired by the excision repair pathway.10 Ames * To whom correspondence should be addressed. Email: m.gutowski@ hw.ac.uk;
[email protected]. † Department of Chemistry, University of Gdan ´ sk. ‡ Chemical Sciences Division, Pacific Northwest National Laboratory. § Department of Information Studies, University of Sheffield. | Department of Electronic and Electrical Engineering, Univeristy of Sheffield. # Institut fu ¨ r Chemie und BiochemiesPhysikalische und Theoretische Chemie, Freie Universita¨t Berlin. + Chemistry-School of Engineering and Physical Sciences, Heriot-Watt University.
Figure 1. Labeling of atoms in thymine.
and co-workers have shown that an average human cell repairs about 320 thymine glycol sites/day.11 Since 10-20% of the damage to DNA by ionizing radiation is thymine base oxidation and fragmentation, and these products are also produced by oxidative stress, the effects of damaged thymines on the structure, stability, dynamics, and interactions of DNA are of great interest. There are four stereoismers of thymine glycol, i.e., (5R,6S), (5R,6R), (5S,6R), and (5S,6S). Additionally, one can distinguish between the 5R cis-trans pair [(5R,6S) and (5R,6R)] and the 5S cis-trans pair [(5S,6R) and (5S,6S)], due to epimerization at the C6 position.12 It has been reported that the 5R and 5S isomers are formed in equal amounts in γ-irradiated DNA,13 but the oxidation of thymidine or thymidine-containing oligonucleotides preferentially yields the (5R,6S)-thymine glycol.14 The same preference [the ratio of the (5R,6S) and (5S,6R) isomers was 6:1] was observed in a study on the chemical synthesis of thymine glycol-containing oligonucleotides using a phosphoramidite building block15 and a large-scale preparation
10.1021/jp709751w CCC: $40.75 © 2008 American Chemical Society Published on Web 01/29/2008
Analysis Tools for Damaged DNA
J. Phys. Chem. B, Vol. 112, No. 7, 2008 2199
Figure 2. Geometries of three rotamers of thymine glycol.
Figure 3. Three rotamers of thymine glycol connected to a sugar unit.
Figure 4. Purine nucleic acid bases with their 8-oxo-counterparts.
was required to obtain the (5S,6R)-thymidine glycol in a yield sufficient for its incorporation into oligonucleotides.16 Theoretical investigations have also been undertaken to predict stability of different isomers of thymine glycol.17 It was found that the diasteroisomer (5R,6S) is the most stable. This diastereoisomer can exist in at least 3 rotameric forms, Tg1, Tg2, and Tg3, which differ by orientations of OH groups and intramolecular hydrogen bond patterns (Figure 2). Our B3LYP/6-31++G(d,p) calculations suggest that Tg2 and Tg3 are 5.14 and 5.12 kcal/mol less stable than Tg1.18 However, when thymine glycol is incorporated into DNA, new hydrogen-bonding patterns may emerge making the less stable rotamers becoming favorable. For example, our B3LYP/ 6-31++G(d,p) studies on the relative energies of Tg1,Tg2, and Tg3 connected to a sugar unit (denoted sTgx, where x ) 1-3, Figure 3), suggest that sTg3 is the most stable, and sTg2 and sTg1 are less stable by 0.3 and 1.54 kcal/mol, respectively.18 Biochemical studies on translesion replication19 and base excision repair20 revealed that the stereochemistry of thymine glycol is recognized by proteins responsible for these biological processes.
Despite the fact that damage detection and repair is a complicated process, the idea of damage detection is fairly simple: the enzyme has to locate the damaged sites based on features that make those sites different from intact fragments of DNA. The DNA-enzyme interaction is determined by hydrogen bonds, electrostatic interactions, and dispersion forces. The two first have similar origins and result from the interaction of effective atomic charges of molecule A with the polarizable charge distribution of molecule B. The dispersion interaction, however, can be associated with the interaction of instantaneous multipoles of A and B. The electrostatic potential of DNA is screened by solvating water molecules. However, the detection and repair enzymes bind to the DNA polymer and then move along the strand. Therefore, the effect of solvent on the detection of electrostatic potential is limited. In our recent works, we analyzed electrostatic potential around an intact DNA fragment and around the same fragment but with guanine being replaced with 8-oxoguanine (8oG, see Figure 4),21,22 or adenine being replaced with 8-oxoadenine (8oA, see Figure 4).23 Our goal was to identify differences in the electrostatic potential that might be relevant for enzymatic recognition of 8-oxo-purines. The electrostatic potential (EP) around DNA is nonuniform because the molecule possesses many polar groups specifically arranged in space. The analysis of electrostatic potential in three dimensions might be a very complex task. However, most of the important interactions take place near the surface of DNA. Hence, we focused our analysis on the values of the electrostatic potential at the “molecular surface”.22,23 For the purpose of our studies, we developed and implemented a cylindrical projection of EP, in which the values of the electrostatic potential at a complicated molecular surface of DNA are projected onto the walls of a cylinder, which is built around an approximate axis of a short DNA fragment. Of course, we do not assume that the DNA-enzyme interaction takes place on a cylindrical surface or that the DNA strand is cylindrical. The cylinder projection is only used as a tool to obtain 2D maps of EP around a short fragment of DNA. It was selected as a compromisesit is relatively simple to implement and the scale distortions are acceptable (e.g., when compared to a typical gnomonic projection). It should be emphasized that in our approach, we calculate
2200 J. Phys. Chem. B, Vol. 112, No. 7, 2008 differences in electrostatic potential, not the movement of atoms. Moreover, by using the cylinder projection, we can show the modifications of EP in one figure. The resulting two-dimensional (2D) electrostatic maps were then analyzed for the intact fragments of DNA (5′-TGT-3′ or 5′-AAC-3′) and the corresponding fragments containing, respectively, 8oG in the place of G or 8oA in the place of middle A. The main new features in the EP map resulting from the replacement of a purine with the corresponding 8-oxo-purine were reorganizations of the countercations and phosphate groups.22,23 In the current study, we analyze electrostatic potential around the 5′-ACTAG-3′ DNA pentamer and around an analogous pentamer with the central T replaced with Tg. We have demonstrated that the replacement of T with Tg leads to a significant reorganization of the DNA fragment, including shifts of the C and A bases, which are π-stacked with Tg, as well as a reorganization of the backbone with the Tg lesion. The reorganization is manifested by a movement of countercations and phosphate groups. In contrast to previous studies,22,23 we improved our approach to the analysis of EP maps by employing advanced image analysis methods.24 They can detect important features on the EP maps and measure differences between the EP maps based on T or Tg. Our results demonstrate that image analysis techniques might be very useful in chemical applications for biomolecules and extended systems. 2. Methods 2.1. Molecular Structure and Introduction of DNA Lesions. To obtain the structure of intact and damaged DNA fragments, we used a similar approach as that described before.21-23,25 However, the electronic structure calculations were performed for DNA pentamers rather than for trimers. Tg is a much more complex lesion (Figures 2 and 3) than the 8-oxopurines (Figure 4) that we studied in the past.22,23 In particular, Tg distorts the DNA structure more strongly than 8-oxo-purines, primarily due to the steric frustration of π-π stacking interactions. In order to accommodate these more profound structural relaxation effects, our computational model was extended to five pairs of nucleic acid bases connected by sugar-phosphate backbones. From a starting molecule, the 5′-GGGAACAACTAG-3′ DNA dodecamer described by Arnott et al.,26 we cut a 5′-ACTAG-3′ fragment (a pentamer) containing five nucleic base pairs. To limit the size of the system, the phosphate groups were removed from the 5′ and 3′ ends, and the strands were saturated with OH groups. The sodium countercations were initially placed at each of four phosphate groups at the position optimal for the gas-phase NaH2PO4. The resulting structure was a starting point for density functional theory (DFT) optimizations using a B3LYP exchangecorrelation functional.27-29 In order to suppress computational artifacts resulting from (i) using a pentamer as a model of the DNA double strand and (ii) a failure of B3LYP to reproduce intermolecular dispersion interactions,30 we introduced additional constraints on the pentamer. The atoms of the top and bottom nucleic acid base pairs were fixed in space during the optimization.21,22,23,25 This approach guarantees that the B3LYP failure to reproduce the π-π stacking interaction does not lead to unphysical geometries of the pentamer. The 6-31G(d,p) basis set was used for all geometrically optimized atoms and the 6-31G basis set31 for all geometrically fixed atoms resulting in 3511 basis functions for the 324 atom system. In the next step, we introduced a lesion. The thymine
Haranczyk et al. TABLE 1: Values of van der Waals Radii for H, N, C, O, P and Ionic Radius of Na (in Å) element radius a
H
C
N
O
P
Na
1.62a
2.04a
1.93a
1.82a
2.22a
1.02b
Ref 41. b Ref 42.
of the middle layer of the intact 5′-ACTAG-3′ pentamer was replaced with Tg by adding two OH groups to C5 and C6 of thymine. The initial interatomic distances and angles for Tg in the DNA fragment were the same as those resulting from the B3LYP/6-31G(d,p) optimization for an isolated thymine glycol connected to a sugar (sTg2). For the purpose of this study, we considered three rotamers of the thymine glycol, Tg1, Tg2, and Tg3 (Figure 2). First, a pentamer 5′-ACTg2AG-3′ was optimized at the B3LYP level with the same mixed 6-31G(d,p)/6-31G basis set, and again the top and bottom pairs of nucleic acid bases were fixed in space. Next, we converted Tg2 into Tg1 and Tg3 in the optimized 5′-ACTg2AG-3′ structure and the resulting initial structures for 5′-ACTg1AG-3′ and 5′-ACTg3AG-3′ were optimized at the same fashion. All DFT geometry optimizations were preformed using the NWChem 4.5-4.7 program package31 on a cluster of dual Intel Itanium2 nodes with Quadrics ELAN4 interconnect. On average, one step of geometry optimization for the DNA pentamer performed on 128 processors took 1.5 h. Structural optimizations required approximately 200 geometrical steps with the convergence criteria being set to default values (the maximum and root-mean-square gradient in the coordinates being set to 0.00045 and 0.00030 au, respectively). The NWChem calculations used 128 Gb of memory and 1Tb of disk space. On the basis of results obtained with 64 and 256 processors, we conclude that the calculation time scales nearly linearly with the number of processors involved. The detailed benchmarking of different tasks performed with the NWChem code on various machines is presented on the NWChem website.32 The Extensible Computational Chemistry Environment (ECCE),33,34 which provides a sophisticated graphical user interface and scientific visualization tools, was particularly useful to set up calculations for systems with “frozen” atoms and mixed-basis sets. 2.2. Regular Cylindrical Projection of Electrostatic Potential. 2.2.1. The Idea of Regular Cylindrical Projection. Analysis of electrostatic potential for a molecule is a complicated process. One needs to look at the potential created by charges in the 3D space around the molecule. A simplification is provided by a technique that is called a regular cylindrical projection, which is similar to gnomonic projection methods.35-39 Here, a fragment of the DNA molecule is positioned at the center of a cylinder and the axis of DNA is aligned with the axis of the cylinder. The cylindrical shape matches well the structure of a short DNA fragment. The values of EP at the surfaces of DNA fragments are projected onto the side wall of the cylinder. It is not necessary to project EP onto the top and bottom of the cylinder because, in the case of DNA, these areas are occupied by other layers of base pairs. 2.2.2. Definition of Molecular Surface. In this study, we assumed that the molecular surface is a solVent-accessible surface spanned by spheres centered on atomic nuclei with excluded cavities, as they are not accessible to solvent molecules. The exclusion of cavities not accessible to solvent was done by raining down a probe upon the atoms and stopping the probe just before a collision (van der Waals overlap) would occur. This approach is similar to that presented by Greer and Bush.40 In our implementation, however, the direction of an
Analysis Tools for Damaged DNA
J. Phys. Chem. B, Vol. 112, No. 7, 2008 2201
Figure 6. Points on the projection cylinder (A) and the molecular surface (B).
Figure 5. Details of cylindrical regular projection.
approaching probe is perpendicular to the axis of the DNA fragment. We have used a probe with a 1.3 Å radius to simulate a solvent molecule. The radii for the C, N, O, H, and P atoms are based on van der Waals radii used in the MM3 force field.41 The radius for Na is based on its ionic radius.42 All radii used by us are presented in Table 1. 2.2.3. The Axis of DNA. In our approach, a cylinder is built around the DNA axis. The axis is defined by a point and a vector. For consistency of the EP analysis, they both are kept unchanged when analyzing both the intact and damaged DNA fragments. The point is defined by the geometrical center of the intact DNA fragment. Since our first application of the cylindrical projection to the analysis of EP around DNA,21,22 we have tried and discussed a number of ways to calculate the direction vector of the DNA axis.23 We concluded that the most reliable approach is to derive the axis by diagonalization of the tensor of inertia. The eigenvector corresponding to the largest eigenvalue defines the axis of the DNA molecule. This approach works, however, only for molecules with a clearly elongated shape. Therefore, we used the starting DNA dodecamer described in section 2.1 to derive the axis. 2.2.4. Points on the Molecular Surface. First, we uniformly distribute points on the side wall of the cylinder. It is done in two loops: (i) over the height of the cylinder along the z-axis (Figure 5), and (ii) over the angle R in the xy plane (Figure 5). For each step in height (along the z-axis), the R vector (radius of the cylinder is set to 20 Å) is rotated around the z-axis by an angle R and the resulting point on the cylinder is recorded (Figure 6 A). Each selected point on the cylinder defines a point on the DNA surface (Figure 6 B). To identify the point on the
DNA surface, one walks along R toward the cylinder axis until one encounters the solvent-accessible DNA surface. This procedure defines a point defined by R′ (Figure 5). The final electrostatic potential map consists of 22 320 points. The angular coordinate was scanned from 0 to 359.5° with a step of 0.5°. The z coordinate was scanned with 31 steps of 0.5 Å each. The starting point of the z coordinate scan was set to 7.5 Å below the geometrical center of the intact pentamer. 2.2.5. Electrostatic Potential on the DNA Surface and its 2D Map. At each point on the molecular surface, we calculated the EP value directly from the B3LYP/6-31G(d,p) electron density. These calculations were done with the Gaussian03 program43 on an SGI Altix machine. The calculation of the largest system, the DNA pentamer with a thymine glycol (328 atoms, 3794 basis functions), took ∼8 h on 12 Intel Itanium processors using 50 Gb of memory. The final 2D electrostatic potential maps are presented in Figure 7. The angle R and the z coordinate of each point are displayed on the horizontal and vertical axis, respectively. The color is assigned according to the value of the electrostatic potential. Red and blue points have positive and negative values of the EP, respectively. EP maps are stored in standard bitmap files (bmp format). the color of each pixel of a bitmap is a superposition of three colors and it is stored in 24-bits of data, 8-bits (256 possible values) to code each of the red, green, and blue channels (RGB). However, in the presented results, only two channels are used to store information (one channel for positive values of EP, the other for negative values of EP), therefore the values of EP changing from -0.49 to 0.49 au are represented by 511 colors. The maps generated with (720 × 31) points are scaled to a higher resolution (720 × 1201) using a bi-directional cubic spline interpolation, as implemented in the FreeImage library.44 One should have in mind that such scaling distorts the original ratio between height and length of the map, and it does not reflect the true ratio between the height of the DNA fragment and its circumference. The origin of important features of the resulting EP maps was interpreted based on geometrical structures of the molecules and their accompanying molecular surfaces (Figures 6 B), as well as on topographical maps
2202 J. Phys. Chem. B, Vol. 112, No. 7, 2008
Haranczyk et al.
Figure 7. Electrostatic potential maps of 5′-ACTAG-3′ (A), 5′-ACTg1AG-3′ (C), 5′-ACTg2AG-3′ (D), and 5′-ACTg3AG-3′ (E) pentamers followed by difference maps (F-H). The angular coordinate R is placed on the horizontal axis and the height z is on the vertical axis with zero set to the utmost AT pair. The dominant structural features of DNA fragments are marked in the B map, with the sodium countercations labeled 1-8. Important features of thymine and thymine glycol are encircled on the maps A and C-H and labeled a and b. One should keep in mind that the 0° to 359.5° range on the horizontal axis corresponds to the circumference of the DNA double helix, which is approximately 60 Å. Thus, the maps are “compressed” in the horizontal dimension.
described in Section 2.2.7. A quantitative analysis of the EP maps was done using the image analysis techniques described in the following section. 2.2.6. Image Analysis Methods Employed in Analysis of EP Maps. We have created specific computer vision tools to quantitatively measure differences in the EP maps presented in Figure 7, parts A and C-E. These tools, supported by various digital image processing algorithms, provide the following general facilities: 1. Segmentation (detection) of red features corresponding to countercations in the EP map images (Figure 7 B); 2. Measurement of the displacement of red features (countercations) among different EP maps; 3. Segmentation (detection) of fuzzy features, a and b in Figure 7, parts A and C-E, in the EP map images;
4. Description of the features previously segmented in terms of size and intensity. The segmentation of the red features (countercations) has been achieved by applying a threshold based segmentation technique to the Blue channel of the EP map images. Due to the colorscale used to create the EP maps, the Blue channel of the input image is a high contrast gray scale level image containing only the countercations. A threshold for the segmentation has been chosen using the P-tile method,45 because in this specific application, we know approximately what percentage of pixels in the image comes from the objects of interest. A high contrast between the red features (countercations) and the background makes the used threshold segmentation method robust and fast. For this reason, there was no need to consider alternative segmentation methods.
Analysis Tools for Damaged DNA
J. Phys. Chem. B, Vol. 112, No. 7, 2008 2203
Figure 8. Topographical maps of the intact (A) and damaged (B-D) DNA surfaces.
TABLE 2: Positions of Na+ Countercations in the Reference EP Map of the Intact DNA (5′-ACTAG-3′) and Corresponding Displacement Vectors for the Damaged DNA Containing Tg1-Tg3 Obtained Using Image Analysis and Motion Detection Algorithms displacement vectors
displacement in 3D
counter cation
initial position (5-ACTA G-3')
5-ACTg1 AG-3
5-ACTg2 AG-3
5-ACTg3 AG-3
5-ACTg1 AG-3
5-ACTg2 AG-3
5-ACTg3A G-3
1 2 3 4 5 6 7 8
(9.69 , 27.5) (6.20 , 67.5) (2.87 , 100.5) (12.16 , 89.5) (8.31 , 223.0) (4.68 , 257.5) (1.49 , 300.0) (10.54 , 344.0)
(-0.137 , -0.5) (-0.125 , -0.5) (-0.062 , -0.5) (-0.337 , -1.0) (-0.137 , 2.5) (0.437 , -0.5) (0.025 , 0.0) (0.175 , 0.5)
(-0.100 , -0.5) (-0.112 , -0.5) (-0.050 , 0.0) (-0.412 , -1.0) (-0.175 , 2.5) (0.350 , -0.5) (0.038 , 0.0) (0.137 , 0.0)
(-0.125 , -0.5) (-0.125 , -0.5) (-0.062 , 0.0) (-0.350 , -1.0) (-0.137 , 2.5) (0.412 , 0.0) (0.037 , 0.0) (0.175 , 0.5)
0.240 0.189 0.074 0.405 0.534 0.417 0.076 0.178
0.182 0.182 0.058 0.473 0.586 0.335 0.063 0.134
0.220 0.183 0.069 0.433 0.574 0.385 0.071 0.169
a Positions and vectors in “(x, y)” format where the x coordinate is in Å and the y coordinate is in degrees. Displacements of the Na+ countercations in 3D (in Å) were obtained through differences in the calculated Cartesian coordinates of the corresponding atoms and are shown for comparison. No further approximations or projections were involved in obtaining the 3D displacements.
TABLE 3: Mean Electrostatic Potential and the Size of the Important a and b Features Identified in the Major Groove of Intact and Damaged DNA Pentamers feature
mean value of EP (in a.u)
shape width (in °)
feature a
mean EP value
standard deviation
At 10.5 Å
At 9.0 Å
At 7.5 Å
length (in Å)
5-ACTAG-3 5-ACTg1AG-3 5-ACTg2AG-3 5-ACTg3AG-3
0.396 0.404 0.405 0.395
0.008 0.007 0.006 0.007
10.0 15.5 13.5 13.5
11.0 7.0 11.0 20.5
6.5 0.0 0.0 6.5
4.55 3.31 3.90 4.21
feature b (does not exist in 5-ACTAG-3) 5-ACTg1AG-3 5-ACTg2AG-3 5-ACTg3AG-3
mean EP value
standard deviation
area (in deg * Å)
max width (in deg)
max length (in Å)
0.417 0.405 0.406
0.007 0.008 0.007
28.39 32.43 48.22
16.0 18.0 25.5
2.25 2.32 2.45
Once segmented, the countercations are automatically labeled using a morphological operator46 (1-8 in Figure S-1 of the Supporting Information), and their positions are computed as the centers of gravity (centroids) of the detected object. Displacements of countercations having the same labels but belonging to different EP maps are finally computed as the distances between their centroids computed along the axes z and R. The results are presented in Table 2. The segmentation technique used to detect two fuzzy features labeled as a and b in Figure 7, parts A and C-E, exploits the extended h-minima transform,46,47 a morphological filter that adopts a contrast criterion in the selection of minima from an image. The filter has been applied to the encircled areas (Figure 7, parts A and C-E) lying at 6 - 12 Å on the z-axis, and 150 - 210° on the R axis, and the threshold h was automatically estimated analyzing the vertical and horizontal intensity profiles of the
objects we wanted to detect. In order to see more details in each of the EP maps and estimate a more effective and consistent threshold h, the contrast foreground/background of the portion of image of interest has been initially enhanced by expanding its intensity histogram. The contrast enhancement technique exploited is well-known as the histogram adjustment.24 Once the fuzzy features a and b are segmented, see Figure S-2 of the Supporting Information, the EP values are calculated by measuring the mean intensity value, the standard deviation and the shape of each object detected. The results are presented in Table 3. The image analysis was preformed using the MATLAB (version R2006b)48 and functions available in the image processing toolbox. 2.2.7. Topographical Maps of the DNA Surface. Similar to the EP maps, we have generated topographical maps of the DNA
2204 J. Phys. Chem. B, Vol. 112, No. 7, 2008
Haranczyk et al.
Figure 9. Two views of the superimposed intact (gray) and damaged (red) DNA pentamers depicturing differences in their geometries. The largest changes are marked with arrows.
surface (Figure 8). The angle R and the z coordinate of each point are displayed on the horizontal and vertical axis, respectively. For each point on the surface, we found the closest atom. The color is assigned according to the value of the corresponding atomic number. The maps generated with (720 × 31) points are scaled to a higher resolution (720 × 1201). 3. Results First, the geometrical differences between the intact and damaged DNA fragments were analyzed. Since both fragments had the same top and bottom nucleic acid base pairs fixed in space during the geometry optimization, it is possible to superimpose these pairs and compare the differences in the remaining molecular parts (Figure 9). The structural differences between the DNA fragments containing T and Tg are mostly limited to a strand with the lesion. The Tg is more bulky than T because of the OH groups attached to the sp3 C5 and C6 carbons (otherwise sp2 in T). Since the OH groups of Tg require more space than the methyl group on the opposite side of the Tg ring, the Tg is displaced along the DNA axis in the direction of the methyl group in comparison with the position of T (Figure 9). The presence of a bulky Tg also affects positions of the neighboring, π-stacked C and A bases, which become separated from each other by a larger distance than in the intact DNA. Significant reorganization of the phosphate groups with sodium countercations might also be observed (Figure 9). The countercations are dragged by the phosphate groups as the whole backbone elongates along the z-axis because of the larger separation between the C and A bases. The differences in geometry between the pentamers containing Tg1-Tg3 are minimal and practically limited to a fragment containing the nucleoside of Tg, and neighboring π-stacked bases (Figure S-3 of the Supporting Information). Our calculations suggest that the pentamers with Tg2 and Tg3 are isoenergetic, whereas the 5′-ACTg1AG-3′ pentamer is less stable by 4.9 kcal/mol. The cylindrical projections of EP around the DNA fragments containing T, Tg1, Tg2 and Tg3 show multiple features (Figure 7, parts A and C-E). The differences between the EP maps are analyzed in the context of chemical differences between the intact and damaged DNA fragments, which are hidden under the DNA surfaces presented in Figure 6 B. The most important
structural features are marked in Figure 7 B. The most distinctive features are the sodium countercations that are represented by eight big red spots of a positive electrostatic potential and labeled 1-8 in Figure 7 B. The positions of countercations in the reference map for 5′-ACTAG-3′ and the displacement vectors for DNA fragments with Tg, as read form the map using the image analysis, are summarized in on the left side of Table 2. For comparison we also included corresponding displacement in 3D as taken from Cartesian coordinates of atoms and the results are presented on the right side of Table 2. The sugarphosphate backbones are represented by two elongated shapes of a neutral potential (Figure 7 B). The minor and major grooves might be seen between the two strands (Figure 7 B). The grooves are characterized by a negative potential. A contribution from the intact T in 5′-ACTAG-3′, labeled a in Figure 7 A, can be found in an encircled area. This contribution is represented as a spot of negative potential, which is ∼4.55 Å long and up to 11° wide (Table 3). It is associated with an oxygen atom, as shown in Figure 8, and the identity of this atom is O8 of thymine (Figure 1). At the first glance, the EP maps of the DNA fragments with Tg1-Tg3 (Figure 7C-E) look very much like the EP map of the intact fragment (Figure 7A). The sodium countercations, strands, and grooves might be identified in both cases. The intensity of colors, reflecting the values of EP, is very similar except in the middle part of the plot. This suggests that the lesion did not affect much the DNA structure. However, new features characteristic to Tg1-Tg3 can be identified in the major groove. A spot of negative potential, which is encircled in the middle of the major groove, has a different structure for intact thymine (Figure 7A) and for Tgn’s (n ) 1-3) in Figures 7CE. There is a single feature a in Figure 7A, which is related to the O8 atom of thymine. In the EP maps with Tg1 and Tg2 the feature a is only 3.3 Å and 3.9 Å long, respectively. In turn, in the case of Tg3, the feature a is 4.2 Å long and wider than for Tg1 and Tg2 (Table 3). In Figure 7C-E there is an additional feature, labeled b, which results from the hydroxyl groups of Tg. The size of the spot depends on the rotamer of Tg. In the case of Tg1 and Tg2, only one of the hydroxyl groups points with an oxygen atom toward the surface, whereas the other
Analysis Tools for Damaged DNA hydroxyl group points with a hydrogen atom (Figure 8). For Tg3, however, both oxygens from the hydroxyl groups are pointing toward the DNA surface. Therefore, the negative potential spot is almost twice as large in area for Tg3 than for Tg1 and Tg2, with the maximum length and width of 2.45 Å and 25°, respectively (Table 3). The incorporation of a bulky Tg induces a geometrical relaxation of the phosphate backbone, which is followed by a displacement of sodium countercations. These differences are clearly seen in Figure 7F-H, which illustrate differences between the EP maps of the damaged and intact DNA fragments. The relocations along the DNA axis are the most profound for three countercations labeled 4, 5, and 6, which are in the close proximity of Tg. They displace away from the lesion by as much as 0.41, 0.18, and 0.44 Å, respectively, with 5 moving also along the angular coordinate by 2.5° (Table 2). The geometrical relaxations of other countercations are also noticeable as they move up to 0.18 Å along the DNA axis. The displacements along the z-axis observed on the EP maps are comparable to the corresponding displacements in the 3D space as shown in Table 2. It is noteworthy that the geometrical relaxation of countercations has a larger impact on the difference maps (Figures 7 F-H) than these modifications that are brought by the hydroxyl groups of Tg. The latter modifications are localized in the major groove (see encircled areas). For Tg1 and Tg2, the a feature is short and narrow in comparison with T and Tg3 due to a partial shielding by a hydrogen atom (Figure 8) of the Tg’s hydroxyl group. In Tg3 this shielding does not take place and a and b, which originate form oxygen atoms (Figure 8), become more distinct. It is important to note that although the negative EP features in the major groove in the intact and damaged DNA fragments are very different in terms of shape and size, the values of EP differ by less than 6% (Table 3). 4. Summary We have studied a fragment of intact DNA, 5′-ACTAG-3′, and its counterpart with T replaced with one of three rotamers of Tg (5′-ACTg1AG-3′, 5′-ACTg2AG-3′, and 5′-ACTg3AG3′) using the B3LYP/6-31G(d,p) method. The geometries of 5′ACTAG-3′, 5′-ACTg1AG-3′, 5′-ACTg2AG-3′, and 5′-ACTg3AG3′ were optimized with a constraint of the fixed top and bottom base pairs. The electrostatic potential around these DNA fragments was studied using the regular cylindrical projection technique. The values of electrostatic potential at the molecular surfaces were projected onto the side walls of cylinders surrounding the DNA fragments. The resulting maps of electrostatic potential and optimized geometries of intact and damaged DNA fragments were compared using image analysis techniques. The results can be summarized as follows: 1. The replacement of T with Tg distorts primarily the damaged strand of DNA. The complementary strand is much less distorted. 2. Tg is more bulky than T because of the OH groups attached to the C5 and C6 carbons. The hybridization of these carbons changes from planar sp2 for T to nonplanar sp3 for Tg, with both OH groups of Tg being on the opposite side of the ring than the methyl group of T. Since the OH groups require more space than the methyl group, Tg is displaced along the DNA axis in the direction of the methyl group. The bulky and displaced Tg affects relative positions of the neighboring, π-stacked C and A bases, which become more separated in the DNA fragments with Tg.
J. Phys. Chem. B, Vol. 112, No. 7, 2008 2205 3. The differences in geometry between the DNA pentamers containing Tg1, Tg2, and Tg3 are limited to the Tg fragment only. 4. A reorganization of three phosphate groups closest to Tg and the accompanying countercations are important features of the Tg lesion. The relaxation of charged particles along the axis of the DNA fragment reaches up to 0.5 Å and is clearly reflected in the electrostatic potential maps. 5. The EP maps of pentamers are characterized by large spots of a negative electrostatic potential in the major groove. This spot is dominated by the O8 atom of T in the intact DNA fragment and acquires an additional structure in the DNA fragments with Tg. The new structure is related to the oxygen atoms from the two hydroxyl groups of Tg. The size of the additional feature depends on the rotamer of Tgsthe DNA fragment with Tg3 is characterized by a twice as large spot of the negative potential than the DNA fragments with Tg1 or Tg2. 6. The presence of a thymine glycol lesion induces much larger distortions of the DNA structure than the presence of 8-oxo-purine lesions studied before.22,23 In the latter cases the changes were limited to the damaged site only. We suggest that changes of electrostatic potential around the DNA molecule resulting from chemical modifications of nucleotides may play a role in enzymatic recognition of damaged sites. Our future study will focus on the analysis of the molecular shape of the damaged and undamaged DNA fragments. Molecular dynamics (MD) simulations will be performed to find a correlation between the distribution of countercations and the findings presented here. The tools presented here can be easily extended to the analysis of EP maps generated from snapshots of future MD simulations. The motion detection algorithm can be used to track and report the changes of EP as a function of simulation time. Acknowledgment. Helpful discussions with John H. Miller and Michel Dupuis are gratefully acknowledged. M.H. acknowledges support from British Council and Polish Ministry of Science and Education that covered the cost of visit at the University of Sheffield through Young Scientists Programme (WAR/342/86). This work was supported by the: (i) U.S. D.O.E. Office of Biological and Environmental Research, Low Dose Radiation Research Program, (ii) Polish State Committee for Scientific Research (KBN) Grant No. DS/8221-4-0140-7 (M.G.), and (iii) Polish State Committee for Scientific Research (KBN) Grant No. N204 127 31/2963 (M.H.). M.H. holds the Foundation for Polish Science (FNP) award for young scientists. I.D. acknowledges the Marie Curie Fellowship. The calculations were performed at the Academic Computer Center in Gdan´sk (TASK) and at the Molecular Science Computing Facility (MSCF) in the William R. Wiley Environmental Molecular Sciences Laboratory, a national scientific user facility sponsored by the U.S. Department of Energy’s Office of Biological and Environmental Research and located at the Pacific Northwest National Laboratory, which is operated by Battelle for the US Department of Energy. The MSCF resources were available through a grand challenge project. Supporting Information Available: Color versions of Figures 1-4, 6, and 9. Also images of countercations (Figure S-1) and a and b spots from Figure 7C-E (Figure S-2) after segmentation. This information is available free of charge via the Internet at http://pubs.acs.org. References and Notes (1) Dizdaroglu, M.; Jaruga, P.; Birincioglu, M.; Rodriguez. Free Radical Biol. Med. 2002, 32, 1102-1115.
2206 J. Phys. Chem. B, Vol. 112, No. 7, 2008 (2) Lutz, W. K. Mutation Res. 1990, 238, 287-295. (3) Ames, B. N. Mutation Res. 1989, 214, 41-46. (4) Ames, B. N. EnViron. Mol. Mutagen. 1989, 14, 66-77. (5) Ames, B. N.; Gold, L. S. Mutation Res. 1991, 250, 3-16. (6) Jaloszynski, P.; Jaruga, P.; Olinski, R.; Biczysko, W.; Szyfter, W.; Nagy, E.; Moller, L.; Szyfter, K. Free Radical Res. 2003, 37, 231-240. (7) Adelman, R.; Saul, R. L.; Ames, B. N. Proc. Natl. Acad. Sci. U.S.A. 1988, 85, 2706-2708. (8) Basu, A. K.; Loechler, E. L.; Leadon, S. A.; Essigmann, J. M. Proc. Natl. Acad. Sci. U.S.A. 1989, 86, 7677-7681. (9) McNulty, J. M.; Jerkovic, B.; Bolton, P. H.; Basu, A. K. Chem. Res. Toxicol. 1998, 11, 666-673. (10) Slupphaug, G.; Kavli, B.; Krokan, H. E. Mutation Res. 2003, 531, 231-251. (11) Cathcart, R.; Schniers, E.; Saul, R. L.; Ames, B. N. Proc. Natl. Acad. Sci. U.S.A. 1984, 81, 5633-5637. (12) Lustig, M. J.; Cadet, J.; Boorstein, R. J.; Teebor, G. W. Nucleic Acids Res. 1992, 20, 4839-4845. (13) Teebor, G.; Cummings, A.; Frenkel, K.; Shaw, A.; Voituriez, L.; Cadet, J. Free Rad. Res. Commun. 1987, 2, 303-309. (14) (a) Vaishnav, Y.; Holwitt, E.; Swenberg, C.; Lee, H.-C.; Kan, L.S. J. Biomol. Struct. Dyn. 1991, 8, 935-951. (b) Kao, J. Y.; Goljer, I.; Phan, T. A.; Bolton, P. H. J. Biol. Chem. 1993, 268, 17787-17793. (15) Iwai, S. Angew. Chem. Int. Ed. 2000, 39, 3874-3876. (16) Iwai, S. Chem. Eur. J. 2001, 7, 4343-4351. (17) Miaskiewicz, K.; Miller, J.; Osman, R. Int. J. Rad. Biol. 1993, 63, 677-686. (18) Haranczyk, M.; Dabkowska, I.; Gutowski, M. 11th Electronic Computational Chemistry Conference, http://eccc.monmouth.edu, April 2-31, 2007, Poster 47. (19) (a) Kusumoto, R.; Masutani, C.; Iwai, S.; Hanaoka, F. Biochemistry 2002, 41, 6090-6099. (b) Fischhaber, P. L.; Gerlach, V. L.; Feaver, W. J.; Hatahet, Z.; Wallace, S. S.; Friedberg, E. C. J. Biol. Chem. 2002, 277, 37604-37611. (20) (a) Miller, H.; Fernandes, A. S.; Zaika, E.; McTigue, M. M.; Torres, M. C.; Wente, M.; Iden, C. R.; Grollman, A. P. Nucleic Acids Res. 2004, 32, 338-345. (b) McTigue, M. M.; Rieger, R. A.; Rosenquist, T. A.; Iden, C. R.; de los Santos, C. R. DNA Repair 2004, 3, 313-322. (c) Katafuchi, A.; Nakano, T.; Masaoka, A.; Terato, H.; Iwai, S.; Hanaoka, F.; Ide, H. J. Biol. Chem. 2004, 279, 14464-14471. (21) Haranczyk, M.; Bachorz, R. A.; Dabkowska, I.; Dupuis, M.; Miller, J. H.; Gutowski, M. S. Computational Characterization of Lesions in DNA, 228th ACS National Meeting, U175-U175, 025-BIOL Part 1, Philadelphia, PA, August 22-26, 2004. (22) Haranczyk, M.; Gutowski, M. Theo. Chem. Acc. 2007, 117, 291296. (23) Haranczyk, M.; Miller, J. H.; Gutowski, M. J. Mol. Graph. Model. 2007, 26, 282-289. (24) Gonzales, R. C.; Woods, E. W. Digital Image Processing; AddisonWesley Publishing Company: Reading, MA, 1992. (25) Miller, J. H.; Aceves-Gaona, A.; Ernst, M. B.; Haran´czyk, M.; Gutowski, M.; Vorpagel, E. R.; Dupuis, M. Rad. Res. 2005, 164, 582585. (26) Arnott, S.; Hukins, D. W. L. Biochem. Biophys. Res. Commun. 1972, 47, 1504-1509. (27) Becke, A. D. Phys. ReV. A 1988, 38, 3098-3100. (28) Becke, A. D. J. Chem. Phys. 1993, 98, 5648-5652. (29) Lee, C.; Yang, W.; Paar, R. G. Phys. ReV. B 1988, 37, 785-789. (30) Kohn, W.; Meir, Y.; Makarov, D. E. Phys. ReV. Lett. 1998, 80, 4153-4156. (31) (a) Straatsma, T. P.; Apra, E.; Windus, T. L.; Bylaska, E. J.; de Jong, W.; Hirata, S.; Valiev, M.; Hackler, M.; Pollack, L.; Harrison, R.; Dupuis, M.; Smith, D. M. A.; Nieplocha, J.; Tipparaju, V.; Krishnan, M.; Auer, A. A.; Brown, E.; Cisneros, G.; Fann, G.; Fru¨chtl, H.; Garza, J.; Hirao, K.; Kendall, R.; Nichols, J.; Tsemekhman, K.; Wolinski, K.; Anchell,
Haranczyk et al. J.; Bernholdt, D.; Borowski, P.; Clark, T.; Clerc, D.; Dachsel, H.; Deegan, M.; Dyall, K.; Elwood, D.; Glendening, E.; Gutowski, M.; Hess, A.; Jaffe, J.; Johnson, B.; Ju, J.; Kobayashi, R.; Kutteh, R.; Lin, Z.; Littlefield, R.; Long, X.; Meng, B.; Nakajima, T.; Niu, S.; Rosing, M.; Sandrone, G.; Stave, M.; Taylor, H.; Thomas, G.; van Lenthe, J.; Wong, A.; Zhang, Z. NWChem, A Computational Chemistry Package for Parallel Computers, Version 4; Pacific Northwest National Laboratory: Richland, Washington 2004. (b) Kendall, R. A.; Apra`, E.; Bernholdt, D. E.; Bylaska, E. J.; Dupuis, M.; Fann, G. I.; Harrison, R. J.; Ju, J.; Nichols, J. A.; Nieplocha, J.; Straatsma, T. P.; Windus, T. L.; Wong, A. T. Computer Phys. Comm. 2000, 128, 260283. (32) http://www.emsl.pnl.gov/docs/nwchem/benchmarks/index.html as of Jun 10th 2007. (33) Black, G.; Didier, B.; Bisethagen, T.; Feller, D.; Gracio, D.; Hackler, M.; Havre, S.; Jones, D.; Jurrus, E.; Keller, T.; Lansing, C.; Matsumoto, S.; Palmer, B.; Peterson, M.; Schuchardt, K.; Stephan, E.; Sun, L.; Taylor, H.; Thomas, G.; Vorpagel, E.; Windus, T.; Winters, C. Ecce, A Problem SolVing EnVironment for Computational Chemistry, Software, Version 3.2.5; Pacific Northwest National Laboratory: Richland, Washington 2006. (34) Black, G. D.; Schuchardt, K. L.; Gracio, D. K.; Palmer, B. The Extensible Computational Chemistry Environment: A Problem Solving Environment for High Performance Theoretical Chemistry, Computational, Science; ICCS International Conference Saint Petersburg, Russian Federation, Melbourne, Australia, Proceedings 2660, vol. 81, ed. Sloot, P. M. A.; Abramson, D.; Bogdanov, A. V.; Dongarra, J. Springer-Verlag: Berlin, Germany, 2003, pp. 122-131. (35) Van Geerestein, V. J.; Perry, N. C.; Grootenhuis, P. G.; Haasnoot, C. A. G. Tetrahedron Comput. Methodol. 1990, 3, 595-613. (36) Perry, N. C.; van Geerestein, V. J. J. Chem. Inf. Comp. Sci. 1992, 32, 607-616. (37) Chau, P.-L.; Dean, P. M. J. Mol. Graph. 1987, 5, 97-100. (38) Blaney, F. E.; Edge, C.; Phippen, R. W. J. Mol. Graph. 1995, 13, 165-174. (39) Leach, A. R.; Villet, V. J. An Introduction to Chemoinformatics; Kluwer Academic Publishers: Norwell, MA, 2003; p 118. (40) Greer, J.; Bush, B. Proc. Natl. Acad. Sci. U.S.A. 1978, 75, 303307. (41) Allinger, N. L.; Zhou, X.; Bergsma, J. J. Mol. Struct. (Theochem) 1994, 312, 69-83. (42) Shriver, D. F.; Atkins, P. W.; Langford, C. H. Inorganic Chemistry, 2nd edition; Oxford University Press: Oxford, 1995, p 35. (43) Frisch, M. J.; Trucks, G. W.; Schlegel, H. B.; Scuseria, G. E.; Robb, M. A.; Cheeseman, J. R.; Montgomery, J. A., Jr.; Vreven, T.; Kudin, K. N.; Burant, J. C.; Millam, J. M.; Iyengar, S. S.; Tomasi, J.; Barone, V.; Mennucci, B.; Cossi, M.; Scalmani, G.; Rega, N.; Petersson, G. A.; Nakatsuji, H.; Hada, M.; Ehara, M.; Toyota, K.; Fukuda, R.; Hasegawa, J.; Ishida, M.; Nakajima, T.; Honda, Y.; Kitao, O.; Nakai, H.; Klene, M.; Li, X.; Knox, J. E.; Hratchian, H. P.; Cross, J. B.; Bakken, V.; Adamo, C.; Jaramillo, J.; Gomperts, R.; Stratmann, R. E.; Yazyev, O.; Austin, A. J.; Cammi, R.; Pomelli, C.; Ochterski, J. W.; Ayala, P. Y.; Morokuma, K.; Voth, G. A.; Salvador, P.; Dannenberg, J. J.; Zakrzewski, V. G.; Dapprich, S.; Daniels, A. D.; Strain, M. C.; Farkas, O.; Malick, D. K.; Rabuck, A. D.; Raghavachari, K.; Foresman, J. B.; Ortiz, J. V.; Cui, Q.; Baboul, A. G.; Clifford, S.; Cioslowski, J.; Stefanov, B. B.; Liu, G.; Liashenko, A.; Piskorz, P.; Komaromi, I.; Martin, R. L.; Fox, D. J.; Keith, T.; Al-Laham, M. A.; Peng, C. Y.; Nanayakkara, A.; Challacombe, M.; Gill, P. M. W.; Johnson, B.; Chen, W.; Wong, M. W.; Gonzalez, C.; Pople, J. A. Gaussian 03, revision C.02; Gaussian, Inc.: Wallingford, CT, 2004. (44) http://freeimage.sourceforge.net. (45) Yasuno, M.; Yasuda, N.; Aoki Computer Vision and Pattern Recognition Workshop, 27-02 June 2004, Conference proceedings p. 125. (46) Soille, P. Morphological Image Analysis: Principles and Applications; Springer-Verlag: New York, 1999. (47) Wa¨hlby, C.; Sintorn, I. M. J. Microscopy 2004, 215, 67-76. (48) http://www.mathworks.com/products/matlab/