Stereochemistry of Nucleic Acids and Their Constituents. XVII.1a Crystal and Molecular Structure of Deoxycytidine 5'-Phosphate Monohydrate.1b A Possible Puckering for the Furanoside Ring in B-Deoxyribonucleic Acid1c

M. A. Viswamitra,*1d B. Swaminatha Reddy,1d George Hung-Yin Lin,1e and M. Sundaralingam*1e

Contribution from the Department of Physics, Indian Institute of Science, Bangalore 12, India, and the Department of Biochemistry, University of Wisconsin, Madison, Wisconsin 53706. Received November 30, 1970

Abstract: The crystal structure of deoxycytidine 5'-phosphate monohydrate (5'-dCMP), C₉H₁₄N₃O₇P·H₂O, has been determined by X-ray diffraction techniques. Two entirely independent investigations were carried out. In one case, 1149 intensity data were collected by the multiple-film equiinclination Weissenberg technique using Cu Kα radiation. In the other case, 1248 three-dimensional intensity data were collected on a four-circle diffractometer, also using Cu Kα radiation. In both cases the structure was refined by full-matrix least-squares techniques. The final R values for the two investigations were 0.070 (from film data) and 0.035 (from diffractometer data); in the latter case, when secondary extinction corrections were applied the R value was reduced to 0.023. The unit cell is orthorhombic, space group P2₁2₁2₁, with average a = 6.786, b = 11.345, and c = 16.769 Å. The nucleotide exists as a zwitterion with N(3) protonated by one of the phosphate protons. A comparison of the geometries of the N(3)-protonated and neutral cytosine derivatives shows that there are marked differences in the bond angles and bond distances involving N(3), C(2), and C(4). A novel glycosyl torsion angle, -6.0°, and sugar conformation C(3')exo-C(2')exo, where both C(3') and C(2') are on the opposite side of C(5'), are observed in 5'-dCMP. The conformation about the C(4')-C(5') bond is gauche-gauche, the only conformation observed so far for the known 5'-nucleotides.
A correlation between the furanoside ring conformations and the glycosyl torsion angles is presented. All available hydrogens participate in the hydrogen bonds. The molecules related by the screw axis parallel to the a axis are linked together by pairs of hydrogen bonds, resulting in an infinitely extended spiral around the screw axis. The crystal structure can be considered essentially a close packing of these infinite spirals linked by hydrogen bonds to each other and the water of crystallization.
The stereochemistry of none of the four deoxyribonucleotides, the basic monomer units of DNA (deoxyribonucleic acid), has been reported in the literature, although the structure of the calcium salt of thymidylic acid was reported by Trueblood, Horn, and Luzzati.2 This paper describes the results of two entirely independent investigations of the structure of deoxycytidine 5'-phosphate (5'-dCMP) monohydrate (Figure 1). Since different experimental techniques had been used, it was considered appropriate to report both sets of results in the same paper and to make the necessary comparison between them. For convenience, we shall divide the experimental portions and the results of this report into two parts, describing the work of Viswamitra and Reddy (VR) and that of Lin and Sundaralingam (LS).
Experimental Section

The crystals for both investigations were prepared by VR. Crystals of deoxycytidine 5'-phosphate monohydrate are colorless prisms; some of them are as long as 10 mm. They were grown by slow diffusion of acetone (both liquid and vapor) through water solutions containing a wide range of concentrations of deoxycytidine 5'-phosphate. Crystal data for the two investigations are compared in Table I.

(1) (a) Part XVI of this series of papers is by J. A. Carrabine and M. Sundaralingam, Biochemistry, 10, 292 (1971). (b) Abbreviations used: cytidine 3'-phosphate, 3'-CMP; cytidine 3'-phosphate, orthorhombic form, 3'-CMP(O); cytidine 3'-phosphate, monoclinic form, 3'-CMP(M); deoxyribonucleic acid, DNA. (c) A preliminary communication on this structure has been published: M. A. Viswamitra and B. S. Reddy, Z. Kristallogr., 131, 237 (1970). (d) Indian Institute of Science. (e) University of Wisconsin.
(2) K. N. Trueblood, P. Horn, and V. Luzzati, Acta Crystallogr., 14, 965 (1961).

Table I. Crystal Data for Deoxycytidine 5'-Phosphate Monohydrate

               VR                   LS
Space group    P2₁2₁2₁              P2₁2₁2₁
a              6.776 ± 0.002 Å      6.796 ± 0.002 Å
b              11.340 ± 0.003 Å     11.349 ± 0.003 Å
c              16.772 ± 0.006 Å     16.76? ± 0.004 Å
V              1289.0 Å³            1292.9 Å³
Z              4                    4
d(obsd)        1.687 g cm⁻³         1.669 g cm⁻³
d(calcd)       1.675 g cm⁻³         1.671 g cm⁻³
μ(Cu Kα)       23.63 cm⁻¹           23.54 cm⁻¹
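The calculated densities in Table I can be checked from the cell contents using d = Z·M/(N_A·V). The following is a minimal sketch under stated assumptions: it uses the VR cell edges, Z = 4, and a formula weight built from standard atomic weights for C₉H₁₄N₃O₇P·H₂O. The function names, the atomic-weight table, and the Avogadro constant value are illustrative choices and are not taken from the paper.

```python
# Sketch: orthorhombic cell volume and calculated density (VR cell from Table I).
# All names below are illustrative assumptions, not part of the original work.

AVOGADRO = 6.02214076e23  # mol^-1

# Standard atomic weights (assumed values, g/mol)
ATOMIC_WEIGHTS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "P": 30.974}

# C9H14N3O7P . H2O written as a single composition
DCMP_MONOHYDRATE = {"C": 9, "H": 16, "N": 3, "O": 8, "P": 1}


def formula_weight(composition):
    """Sum of atomic weights for an element -> count mapping."""
    return sum(ATOMIC_WEIGHTS[element] * count for element, count in composition.items())


def orthorhombic_volume(a, b, c):
    """Cell volume in cubic angstroms; all cell angles are 90 degrees."""
    return a * b * c


def calculated_density(z, mol_weight, volume_a3):
    """Density in g/cm^3 for Z formula units in a cell of volume_a3 cubic angstroms."""
    volume_cm3 = volume_a3 * 1e-24  # 1 cubic angstrom = 1e-24 cm^3
    return z * mol_weight / (AVOGADRO * volume_cm3)


volume_vr = orthorhombic_volume(6.776, 11.340, 16.772)
density_vr = calculated_density(4, formula_weight(DCMP_MONOHYDRATE), volume_vr)
print(f"V = {volume_vr:.1f} A^3, d(calcd) = {density_vr:.3f} g/cm^3")
```

The product of the rounded cell edges comes out slightly below the tabulated 1289.0 Å³, and the resulting density agrees with the tabulated 1.675 g cm⁻³ to within about 0.002 g cm⁻³.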
VR. The intensity data were collected on a nearly cylindrical crystal, 0.32 mm in diameter and about 1 mm long, by the multiple-film equiinclination Weissenberg technique, for layers 0kl-5kl and h0l using Cu Kα radiation. A total of 1149 reflections was collected, comprising about 80% of the copper sphere. These intensities were estimated visually by comparison with a calibrated film strip. The data were then corrected for Lorentz and polarization factors and placed on a common arbitrary scale using the h0l data. Absolute scaling of reflections was done by a Wilson plot. No correction for absorption was made.

LS. Three-dimensional intensity data in the range 0° ≤ 2θ ≤ 127° were collected on a Picker four-circle automated diffractometer with Ni-filtered Cu Kα radiation. The θ-2θ scan mode with a scan rate of 2°/min was employed for the data collection. Background counts of 20 sec were measured at each end of the scan. Three standard reflections were checked at an interval of every 100 reflections, and they showed fluctuations in intensity of only ±2%. The data were corrected for Lorentz and polarization effects, but no absorption corrections were made (μ = 23.5 cm⁻¹ for Cu Kα radiation). A reflection was considered unobserved if the intensity was smaller than 1.5 times its standard deviation. On this basis, 1197 reflections were considered to be observed out of the total of 1248 reflections.

Sundaralingam, et al. / Deoxycytidine 5'-Phosphate Monohydrate

Table II. Positional Parameters and Anisotropic Thermal Parameters of the Nonhydrogen Atomsᵃ,ᵇ
[Atom labels and the β22, β33, and remaining cross-term columns are not recoverable from this copy; the surviving cross-term column is headed βij. Rows follow the original order.]

x                       y                       z                       β11                     βij
VR          LS          VR          LS          VR          LS          VR          LS          VR          LS
3003 (11)   3009 (3)    -198 (6)    -209 (2)    -73 (4)     -71 (1)     124 (18)    111 (6)     2 (14)      -4 (3)
4312 (12)   4308 (4)    505 (7)     494 (2)     -476 (4)    -479 (1)    100 (20)    120 (7)     -10 (18)    -13 (4)
3540 (4)    3537 (3)    1527 (5)    1532 (2)    -795 (4)    -794 (1)    152 (19)    113 (6)     -13 (14)    -4 (3)
1582 (12)   1585 (4)    1822 (8)    1807 (2)    -787 (4)    -788 (1)    133 (20)    128 (7)     -3 (7)      -2 (4)
296 (13)    293 (4)     1052 (7)    1039 (2)    -383 (4)    -380 (1)    175 (22)    116 (7)     -8 (17)     -4 (3)
1056 (13)   1049 (4)    54 (7)      49 (2)      -34 (4)     -45 (1)     157 (24)    106 (7)     -9 (20)     -12 (4)
6039 (9)    6043 (3)    242 (5)     245 (2)     -552 (3)    -559 (1)    147 (16)    105 (5)     0 (14)      2 (3)
918 (12)    951 (4)     2760 (6)    2759 (2)    -1159 (4)   -1160 (1)   165 (20)    140 (6)     14 (17)     5 (3)
3802 (13)   3813 (4)    -1360 (3)   -1359 (2)   241 (4)     244 (1)     118 (22)    136 (7)     -4 (16)     -2 (3)
5474 (13)   5455 (4)    -1182 (7)   -1203 (2)   846 (4)     858 (1)     183 (23)    145 (7)     -11 (18)    -8 (4)
4839 (13)   4847 (4)    -2003 (7)   -2011 (2)   1551 (5)    1539 (1)    138 (22)    154 (7)     -1 (17)     11 (4)
2641 (14)   2633 (4)    -2070 (7)   -2062 (2)   1466 (5)    1464 (1)    207 (25)    169 (8)     17 (18)     0 (4)
2262 (9)    2256 (2)    -1963 (5)   -1965 (1)   615 (3)     613 (1)     186 (16)    133 (5)     -23 (13)    -20 (3)
5680 (10)   5691 (3)    -3165 (5)   -3160 (1)   1412 (3)    1402 (1)    197 (17)    157 (5)     16 (14)     15 (3)
1495 (15)   1499 (5)    -1161 (7)   -1152 (2)   1935 (5)    1928 (2)    298 (31)    237 (9)     27 (21)     20 (4)
2151 (10)   2147 (3)    5 (5)       19 (1)      1700 (3)    1706 (1)    215 (18)    160 (6)     10 (13)     11 (3)
-1296 (10)  -1306 (3)   832 (5)     833 (2)     1819 (4)    1817 (1)    161 (18)    124 (5)     2 (14)      3 (3)
1034 (10)   1049 (3)    1216 (5)    1219 (2)    2902 (3)    2892 (1)    199 (19)    152 (5)     1 (16)      1 (3)
1676 (10)   1664 (3)    2145 (5)    2155 (2)    1525 (4)    1530 (1)    231 (18)    182 (6)     -3 (14)     -1 (3)
832 (4)     830 (1)     1122 (2)    1121 (1)    1963 (1)    1961 (0)    154 (8)     115 (2)     10 (4)      9 (1)
3571 (10)   3580 (3)    -4977 (5)   -4980 (2)   1963 (4)    1973 (1)    221 (18)    185 (6)     -4 (15)     -4 (3)

ᵃ All parameters and their standard deviations given in parentheses have been multiplied by 10⁴. ᵇ For each atom and parameter the results from VR and LS are given in adjacent columns. The temperature factor is of the form exp[-(β11h² + ... + 2β12hk + ...)].

Figure 1. Numbering in deoxycytidine 5'-phosphate.

Structure Determination. VR. The position of the phosphorus atom (x = 0.083, y = 0.112, z = 0.196) was determined unambiguously from the Harker sections of a sharpened three-dimensional Patterson synthesis. A three-dimensional minimum function computed on the basis of the position of the phosphorus atom
showed clearly the four oxygen atoms of the PO4 group. However, the peaks for the remaining 16 heavy atoms could not be identified unambiguously since the minimum function had far too many peaks. A three-dimensional Fourier synthesis based on the phases of the phosphate group showed about the same number of peaks as found in the minimum function. It was noticed that among the 0kl structure factors calculated using the PO4 group, the reflection (024) was calculated to be almost zero while it was observed to be the strongest. A Bragg-Lipson chart of this reflection, when superposed on the (100) Fourier projection computed with the phases of the PO4 group, indicated the sign of (024) to be positive. From this map we also picked out the ten strongest peaks likely to represent atom sites. A (100) Fourier projection computed incorporating this new information, however, did not show the molecule unambiguously. At this stage a three-dimensional model of the molecule was made with the atoms located at the peaks common to both the three-dimensional minimum function and the three-dimensional Fourier synthesis and also satisfying the (100) Fourier projection. This procedure gave in one single step the positions of all the remaining nonhydrogen atoms including that of the water oxygen.

LS. A sharpened three-dimensional Patterson map was computed and the position of the phosphorus atom was derived from the Harker peaks. Based on the contribution of the phosphorus atom to the structure, the phase angles were calculated. Then the phases were refined by applying Karle and Hauptman's tangent formula.3,4 A total of 267 reflections with normalized structure factors E > 1.2 were used for the refinement. The R value was 0.24, where R = Σ||Eo| - |Ec||/Σ|Eo|. A three-dimensional E map was then computed and it revealed the complete molecular structure.

(3) J. Karle and H. Hauptman, Acta Crystallogr., 9, 635 (1956).
(4) J. Karle and I. L. Karle, ibid., 21, 869 (1966).

Journal of the American Chemical Society / 93:18 / September 8, 1971

Table III. Positional Parameters and Isotropic Thermal Parameters of the Hydrogen Atomsᵃ,ᵇ

Refinement of the Structure. VR. Atomic coordinates, individual layer scale factors, and isotropic temperature factors were refined with a block-diagonal-matrix least-squares program on an Elliott 803 computer to an R value of 0.11 (using a program written by G. A. Mair, Royal Institution, London). The refinement of the heavy-atom parameters was continued on a CDC 3600 computer, using a full-matrix least-squares program (LAM, Trueblood, et al.). The total number of parameters in an anisotropic refinement of the heavy atoms would be 189, plus 9 layer scale factors. The refinement was carried out in two blocks of 10 and 11 atoms each, since a maximum of only 160 parameters could be refined simultaneously with the available computer program. The refinement dropped the R value to 0.082. A three-dimensional difference Fourier synthesis computed at this stage gave the positions of 13 hydrogen atoms. Keeping the parameters of heavy atoms fixed, the parameters of the hydrogen atoms with isotropic temperature factors of 5 Å² were refined to an R value of 0.074. The positions of H(O-3'), H(O-7), and H(N-3) were located subsequently from a difference Fourier map. Further refinement of all the hydrogen atoms gave a final R value of 0.070. The function minimized in the refinement was Σw(Fo - Fc)², the weighting function w employed being 1/(a + b|Fo| + c|Fo|² + d|Fo|³),5 where a = 2Fmin, b = 1, c = 2/Fmax, and d = 5/Fmax². For hydrogen atoms the scattering factors of Stewart, Davidson, and Simpson6 were used. For the other atoms scattering factors were computed using a function developed by Cromer and Waber.7

LS. The refinement was carried through on a UNIVAC 1108 computer using the full-matrix least-squares program of Busing, Martin, and Levy.8 The Evans9 weighting scheme was used. Two cycles of isotropic least-squares refinement followed by two anisotropic cycles for the nonhydrogen atoms reduced the reliability index, R, from 0.26 to 0.064, where R = Σ||Fo| - |Fc||/Σ|Fo|; Fo is the observed and Fc is the calculated structure factor. At this stage the positional parameters of the hydrogen atoms were obtained from a difference Fourier map. The structure was subjected to three further least-squares cycles with anisotropic temperature factors for nonhydrogen atoms and isotropic temperature factors for hydrogen atoms. The R value for the 1197 observed reflections was 0.035 and the final shift/σ was less than 0.11 for all nonhydrogen parameters and less than 0.40 for all hydrogen parameters. The data were finally corrected for secondary extinction according to Zachariasen10 and the structure was refined by one least-squares cycle with anisotropic temperature factors for nonhydrogen atoms and isotropic temperature factors for hydrogen atoms. The R value was reduced to 0.023. The changes in bond distances and bond angles due to secondary extinction corrections were less than 0.67σ. The scattering factors for P, O, N, and C atoms are from Cromer and Waber7 and for H atoms from Stewart, Davidson, and Simpson.6
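The two quantities that steer the refinements described above can be written down compactly. The sketch below is illustrative only: it evaluates the reliability index R = Σ||Fo| - |Fc||/Σ|Fo| and the VR weighting function, read here as w = 1/(a + b|Fo| + c|Fo|² + d|Fo|³) with a = 2Fmin, b = 1, c = 2/Fmax, d = 5/Fmax². The function names and the four structure amplitudes are hypothetical and are not data from either investigation.

```python
# Sketch: reliability index and the VR weighting function as read from the text.
# Function names and the sample amplitudes are hypothetical.


def r_index(f_obs, f_calc):
    """R = sum(| |Fo| - |Fc| |) / sum(|Fo|) over paired reflections."""
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


def cruickshank_weight(f_obs, f_min, f_max):
    """w = 1 / (a + b|Fo| + c|Fo|^2 + d|Fo|^3) with the constants quoted for VR."""
    a = 2.0 * f_min
    b = 1.0
    c = 2.0 / f_max
    d = 5.0 / f_max ** 2
    magnitude = abs(f_obs)
    return 1.0 / (a + b * magnitude + c * magnitude ** 2 + d * magnitude ** 3)


# Hypothetical amplitudes, chosen only to exercise the functions
observed = [10.0, 20.0, 30.0, 40.0]
calculated = [9.0, 21.0, 30.0, 38.0]

print(f"R = {r_index(observed, calculated):.3f}")
for fo in observed:
    weight = cruickshank_weight(fo, min(observed), max(observed))
    print(f"|Fo| = {fo:5.1f}  w = {weight:.5f}")
```

With this form of w, both the weakest and the strongest reflections are down-weighted relative to those of intermediate magnitude, which suits visually estimated film intensities.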
(5) D. W. J. Cruickshank, D. E. Pilling, A. Bujosa, F. M. Lovell, and M. R. Truter, "Computing Methods and the Phase Problem in X-Ray Crystal Analysis," Pergamon Press, New York, N. Y., 1961, p 32.
(6) R. F. Stewart, E. R. Davidson, and W. T. Simpson, J. Chem. Phys., 42, 3175 (1965).
(7) D. T. Cromer and J. T. Waber, Acta Crystallogr., 18, 104 (1965).
(8) W. R. Busing, K. O. Martin, and H. A. Levy, "ORFLS, A Fortran Crystallographic Least-Squares Program," ORNL-TM-305, Oak Ridge National Laboratory, Oak Ridge, Tenn., 1962.
(9) H. T. Evans, Jr., Acta Crystallogr., 14, 689 (1961).
(10) W. H. Zachariasen, ibid., 16, 1139 (1963).
(11) The observed and calculated structure amplitudes for the two determinations have been deposited as Document No. NAPS-01370 with the ASIS National Auxiliary Publication Service, c/o CCM Information Corp., 909 3rd Ave., New York, N. Y. 10022. A copy may be secured by citing the document number and by remitting $2.00 for microfiche or $5.00 for photocopies. Advance payment is required. Make checks or money orders payable to: CCMIC-NAPS.

Results and Discussion

The final positional and thermal parameters for nonhydrogen atoms and hydrogen atoms are given in Tables II and III, respectively.11 The bond distances and bond angles not involving hydrogen atoms are given in Table IV. All distances and angles involving hydrogen atoms are normal and they are not tabulated.
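Bond distances and torsion angles follow directly from the fractional coordinates once they are scaled by the cell edges, since the cell is orthorhombic. The sketch below makes several assumptions that should be read as such: the 1st, 6th, 9th, and 13th coordinate entries of Table II (VR values) are taken to be N(1), C(6), C(1'), and O(1'), and the glycosyl torsion angle is taken as the O(1')-C(1')-N(1)-C(6) torsion. The helper names are illustrative.

```python
# Sketch: glycosyl bond length and torsion angle from VR fractional coordinates.
# The atom assignments of the Table II rows are an assumption (see lead-in).
import math

CELL_VR = (6.776, 11.340, 16.772)  # a, b, c in angstroms (Table I, VR)

# Fractional coordinates x 10^4 (VR values)
FRACTIONAL_VR = {
    "N1": (3003, -198, -73),
    "C6": (1056, 54, -34),
    "C1'": (3802, -1360, 241),
    "O1'": (2262, -1963, 615),
}


def to_cartesian(frac_e4, cell):
    """Orthorhombic cell: each Cartesian axis is a fractional coordinate times its edge."""
    return tuple(value * 1e-4 * edge for value, edge in zip(frac_e4, cell))


def subtract(p, q):
    return tuple(pi - qi for pi, qi in zip(p, q))


def dot(p, q):
    return sum(pi * qi for pi, qi in zip(p, q))


def cross(p, q):
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])


def distance(p, q):
    return math.sqrt(dot(subtract(p, q), subtract(p, q)))


def torsion(p0, p1, p2, p3):
    """Signed torsion angle p0-p1-p2-p3 in degrees, in the range (-180, 180]."""
    b1, b2, b3 = subtract(p1, p0), subtract(p2, p1), subtract(p3, p2)
    n1, n2 = cross(b1, b2), cross(b2, b3)
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))


atoms = {name: to_cartesian(frac, CELL_VR) for name, frac in FRACTIONAL_VR.items()}
bond_length = distance(atoms["C1'"], atoms["N1"])
chi = torsion(atoms["O1'"], atoms["C1'"], atoms["N1"], atoms["C6"])
print(f"C(1')-N(1) = {bond_length:.3f} A, chi = {chi:.1f} deg")
```

Under these assumptions the sketch gives a C(1')-N(1) distance of about 1.519 Å and a torsion angle of about -6.1°, in line with the VR values quoted in the discussion of the glycosyl bond. Because `atan2` returns angles between -180° and 180°, the result is already in the sign convention used in this paper rather than the 0-360° range.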
[Atom labels are not recoverable from this copy; rows follow the original order.]

x                       y                       z                       B, Å²
VR          LS          VR          LS          VR          LS          LS
450 (15)    451 (4)     201 (8)     204 (2)     -106 (5)    -108 (1)    3.0 (0.5)
179 (14)    186 (4)     325 (8)     320 (2)     -143 (6)    -142 (2)    3.9 (0.6)
-58 (16)    -57 (5)     289 (8)     284 (3)     -120 (6)    -125 (2)    4.5 (0.7)
-129 (14)   -116 (4)    132 (8)     117 (2)     -28 (5)     -39 (1)     3.1 (0.5)
24 (14)     20 (4)      -62 (8)     -44 (2)     23 (5)      19 (1)      2.7 (0.6)
414 (14)    403 (4)     -192 (7)    -183 (2)    -24 (5)     -22 (1)     3.0 (0.5)
703 (14)    674 (4)     -136 (7)    -147 (2)    76 (5)      60 (1)      3.1 (0.5)
559 (14)    543 (4)     -18 (8)     -44 (2)     105 (5)     103 (1)     3.1 (0.5)
570 (14)    541 (4)     -177 (8)    -173 (2)    210 (5)     209 (1)     4.1 (0.6)
219 (15)    207 (4)     -286 (8)    -288 (2)    160 (4)     164 (1)     4.1 (0.6)
468 (14)    471 (5)     -387 (8)    -379 (3)    145 (5)     170 (2)     6.6 (0.8)
1 (14)      -9 (5)      -119 (8)    -117 (3)    184 (5)     186 (2)     5.0 (0.7)
191 (13)    186 (5)     -126 (7)    -127 (2)    250 (5)     253 (2)     5.6 (0.8)
229 (15)    228 (6)     152 (7)     152 (3)     301 (5)     309 (2)     7.3 (0.8)
244 (17)    296 (4)     -484 (8)    -526 (2)    169 (6)     172 (2)     4.2 (0.6)
298 (14)    300 (5)     -503 (7)    -463 (3)    233 (6)     241 (2)     6.2 (0.8)

ᵃ Positional parameters and their standard deviations in parentheses have been multiplied by 10³; B = 5 Å² for all hydrogen atoms in VR. ᵇ For each atom and parameter the results from VR and LS are given in adjacent columns.
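The thermal parameters in Tables II and III enter the structure factor through two related expressions: the isotropic factor exp(-B sin²θ/λ²) and the anisotropic factor exp[-(β11h² + β22k² + β33l² + 2β12hk + 2β13hl + 2β23kl)]. For an (h00) reflection of an orthorhombic cell, sin θ/λ = h/(2a), so β11 corresponds to an equivalent B along a of 4a²β11. The sketch below illustrates this with β11 = 0.0115, assuming that the 20th entry of the β11 column of Table II (LS value) belongs to phosphorus, and the LS cell edge a = 6.796 Å. The function names are illustrative, and the remaining β terms are set to zero because they do not contribute to an (h00) reflection.

```python
# Sketch: isotropic and anisotropic temperature factors and their relation along a.
# The beta11 value is assumed to be the LS phosphorus entry of Table II.
import math


def isotropic_factor(b_iso, sin_theta_over_lambda):
    """exp(-B sin^2(theta) / lambda^2)."""
    return math.exp(-b_iso * sin_theta_over_lambda ** 2)


def anisotropic_factor(hkl, betas):
    """exp[-(b11 h^2 + b22 k^2 + b33 l^2 + 2 b12 hk + 2 b13 hl + 2 b23 kl)]."""
    h, k, l = hkl
    b11, b22, b33, b12, b13, b23 = betas
    exponent = (b11 * h * h + b22 * k * k + b33 * l * l
                + 2 * b12 * h * k + 2 * b13 * h * l + 2 * b23 * k * l)
    return math.exp(-exponent)


def equivalent_b_along_a(beta11, a):
    """B (in square angstroms) equivalent to beta11 for reflections of type (h00)."""
    return 4.0 * a * a * beta11


A_LS = 6.796      # angstroms (Table I, LS)
BETA11_P = 0.0115  # 115 x 10^-4, assumed to be phosphorus (LS)

h = 2
b_11 = equivalent_b_along_a(BETA11_P, A_LS)
t_aniso = anisotropic_factor((h, 0, 0), (BETA11_P, 0.0, 0.0, 0.0, 0.0, 0.0))
t_iso = isotropic_factor(b_11, h / (2.0 * A_LS))
print(f"B11 = {b_11:.2f} A^2, T(aniso) = {t_aniso:.4f}, T(iso) = {t_iso:.4f}")
```

The two factors coincide for (h00) reflections by construction, which is a convenient way to compare the β values of Table II with the B values of Table III on a common scale.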
In general the bond distances and bond angles show good agreement in the two investigations. However, the standard deviations in bond distances and bond angles from LS (σ(l), 0.003 Å; σ(θ), 0.2°) are about three times smaller than those from VR (σ(l), 0.01 Å; σ(θ), 0.6°). In the following discussion the results of VR are given in parentheses alongside the results of LS.

Molecular Conformation and Geometry. Glycosyl Bond. An important stereochemical parameter in nucleosides and nucleotides is the glycosyl torsion angle χCN, which describes the relative orientation of the base with respect to the sugar.12,13 The angle (as defined by Sundaralingam13) is -5.9° (-6.1°) in 5'-dCMP. Therefore, the conformation about the glycosyl bond is anti12a,13 and is the first case with a small negative torsion angle. The distribution of the glycosyl torsion angles in the known β-pyrimidine glycosides is given as a function of the furanoside ring pucker in Figure 2. It is seen that, with the exception of the present investigation, the χCN values in the anti conformation lie between 0 and 70°. Moreover, for the C(3')endo puckering 0° ≤ χ ≤ 42°, whereas for the C(2')endo puckering 36° ≤ χ ≤ 65°. As a consequence of the small χCN value, the steric interaction between the base and sugar increases; therefore, the glycosyl C(1')-N(1) bond distance, 1.510 Å (1.519 Å), is the longest observed so far. The correlation between the χ angle and the C(1')-N bond distance has already been noted in previous work.14,15

Table IV. Bond Distances and Bond Angles Not Involving Hydrogen Atomsᵃ

Figure 2. Distribution of the glycosyl torsion angles in the known β-pyrimidine glycosides. The numbers indicated in the sectors of 10° represent numbers of known compounds. For the C(3')endo conformation of a sugar χ is between 0 and 42°, while for the C(2')endo conformation χ is between 36 and 65° (see also A. E. V. Haschemeyer and A. Rich, J. Mol. Biol., 27, 369 (1967)). The subtle differences in the furanoside ring conformation of 5'-dCMP and thymidine (see text) may explain the differences in their χ angles. 4-Thiouridine (W. Saenger and K. H. Scheit, ibid., 50, 153 (1970)) is the only nucleoside with a syn χ angle and the sugar has the C(3')endo puckering. The definition of the glycosyl torsion angle, χ, as used throughout this paper is that given in ref 13. Note that the angles 180-360° in ref 13 are referred to as -180 to 0° here; angles 0-180° have the same meaning.

(12) (a) J. Donohue and K. N. Trueblood, J. Mol. Biol., 2, 363 (1960); (b) M. Sundaralingam and L. H. Jensen, ibid., 13, 914 (1965).
(13) M. Sundaralingam, Biopolymers, 7, 821 (1969).

The Deoxyribose. Sugar rings in nucleosides and nucleotides are usually puckered with either C(2') or C(3') out of the plane formed by the other four atoms.16,17 In 5'-dCMP the best four-atom plane in the deoxyribose ring is defined by C(1'), C(2'), C(4'), and O(1') (Table V, plane II). C(3') is displaced 0.477 Å, lying on the opposite side of the sugar plane to C(5'). The conformation of the deoxyribose ring is therefore C(3')exo. The only other known pyrimidine nucleoside which is C(3')exo is thymidine. The only purine nucleoside in the C(3')exo conformation is deoxyadenosine,19 which was the first case with this conformation. It is noteworthy that all three nucleosides with the C(3')exo conformation possess a deoxyribose. This observation suggests that the 3'-exo conformation may be energetically more favored for the deoxyribonucleic acids than the ribonucleic acids. It also has for the first time provided information that the absence of the 2'-hydroxy group can give the 2'-deoxyribose a conformation, C(3')exo, which is probably more favorable than for the ribose. When referred to the three-atom plane C(1'), O(1'), and C(4') (Table V, plane III), the conformation of 5'-dCMP is C(3')exo-C(2')exo, which is different from those of thymidine (C(3')exo-C(2')endo) and deoxy-
(14) M. Sundaralingam, Acta Crystallogr., 21, 495 (1966), and unpublished results.
(15) G. H.-Y. Lin, M. Sundaralingam, and S. I