Electron Localization in DNA - American Chemical Society

NANO LETTERS

Electron Localization in DNA

2003 Vol. 3, No. 10 1417-1420

Mikael Unge* and Sven Stafström†
Department of Physics and Measurement Technology, IFM, Linköping University, S-581 83 Linköping, Sweden
Received August 8, 2003

ABSTRACT Electron localization lengths in different DNA sequences have been calculated numerically using the transfer matrix method. It is shown that electronic states associated with guanine can reach fairly long localization lengths in disordered guanine-cytosine double strands if both intra- and interstrand π−π interactions are considered. For DNA sequences containing all four bases, the electronic states become localized to very few base pairs. In particular, the human chromosome 22 and λ-DNA, both with approximately equal concentrations of the four bases, show insulating behavior.

* Corresponding author. E-mail: [email protected]. † E-mail: [email protected].
10.1021/nl034631d CCC: $25.00 Published on Web 09/12/2003
© 2003 American Chemical Society

The conducting properties of DNA have been investigated and debated intensively during the last 4 to 5 years. Electron conduction is of course of interest for physicists, in particular in the context of nanotechnology. But there are also biological interests related to the electron-conducting properties of DNA, in particular for understanding the repair process of oxidatively damaged DNA.1 In the context of molecular electronics, DNA has shown properties that might make it suitable to use as a conducting wire or as a building block in the construction of biomolecular logical devices.2,3 However, there is no unified description of the conducting properties of DNA. Some reports on the subject classify DNA as a conductor,4-7 but others state that DNA is an insulator.8,9 To a large extent, the different outcomes of the studies are due to the fact that the experiments have been performed on different types of DNA. Clearly, there is a large difference between synthetic poly(G)-poly(C) DNA, which is a relatively highly ordered system in which one strand contains guanine only and the other strand contains cytosine only,5,10 and the more or less random sequences of all four base molecules as in λ-DNA.8,9 Furthermore, the environment plays an important role. Experiments on charge transport have been performed on dry DNA5,6 as well as on DNA in solution4,7 and in vivo.11 Recently, a theoretical study was presented for DNA trapped inside a carbon nanotube,12 yet another environment for the DNA molecule. As the results from these investigations show, the environment in which the DNA molecule exists has a major effect on the result of the conductivity measurements and calculations.13

The double-helix structure of DNA results in an essentially 1D system. The electronic properties of this system as a whole result from interactions of the π orbitals along this system. Similar to most organic electroactive materials, the majority of charge carriers in DNA are holes.14,15 Therefore, by considering only the highest occupied molecular orbitals (HOMO) of each base in the DNA molecule, we get a physically relevant but less complex system. Note, however, that this system treats only the static electron conduction mechanism. Effects from electron-phonon coupling13 or from ionic motions are not included. Nevertheless, it is important to understand the effects of different types of interactions separately before discussing the conducting properties of DNA in any of the environments mentioned above.

On the basis of the simple model for DNA outlined above, we calculate the localization length of electronic states around the Fermi energy. For a periodic and perfectly ordered 1D system, the eigenstates are extended.16 The introduction of uncorrelated disorder leads to the localization of all states.17 Very recently, 1D systems with correlated disorder were investigated theoretically on the basis of the Anderson type of model.18 It was concluded that under specific conditions such systems can in fact contain extended states. In this case, the disorder is not completely random but has some underlying structure that also affects the eigenstates of the system. The correlations that exist in the sequence of base pairs in DNA19 might apply to such a model and thus lead to electron delocalization in DNA.18 In contrast to these findings, we argue here that the disorder in DNA is so strong that even though such correlations exist they are of no importance to the electronic properties of the system (i.e., DNA lacks extended electronic states). The localization lengths depend on the strength of the disorder in relation to the interaction strength.
In DNA, there is a significant overlap between the molecular orbitals along the stacking direction of each individual strand.20 The π interaction between the two base molecules within a pair is, however, very weak and can be neglected.10 In addition to these two directions of intermolecular interactions, there are also nonnegligible interstrand interactions that occur as a result of the overlap between base molecules on adjacent steps in the helix.21 This implies that each strand should not be considered to be an independent molecular wire; together, the two strands form the DNA conductor. The tight-binding Hamiltonian for such a system can be written as

$$H = \sum_i \sum_{j,j'=1}^{2} \left( \epsilon_i^{j}\, |j,i\rangle\langle j',i|\, \delta_{j,j'} + t_{i,i+1}^{j,j'}\, |j,i\rangle\langle j',i+1| + t_{i+1,i}^{j',j}\, |j',i+1\rangle\langle j,i| \right) \qquad (1)$$

where $\epsilon_i^{j}$ is the on-site energy (in this case, the eigenenergy of the HOMO) of base molecule i on each of the two strands (j = 1, 2) and $t_{i,i+1}^{j,j}$ is the intermolecular hopping interaction between neighboring base molecules within the strands. The term $t_{i,i+1}^{j,j'}$ (j ≠ j′) describes the interstrand interactions between the π orbital of base i in strand j and the π orbital of base i + 1 in strand j′.

The numerical values taken for the on-site energies were obtained from the HOMO energies of guanine (G), adenine (A), cytosine (C), and thymine (T) as calculated from ab initio density functional theory (DFT) on each of the individual base molecules. The calculations were performed with a double-ζ basis set with polarization functions for the valence orbitals. The values are $\epsilon_G = 0.0$ eV (reference), $\epsilon_A = -0.33$ eV, $\epsilon_C = -0.54$ eV, and $\epsilon_T = -0.91$ eV. The values for intermolecular hopping are taken from refs 20 and 21. Intrastrand hopping ranges from 0.029 to 0.158 eV, and interstrand hopping ranges from 0.0007 to 0.062 eV. As a result of these different energy values, the Hamiltonian describing the electronic structure of DNA contains disorder in both the on-site and the off-diagonal terms. Note that the differences in the on-site energies are large compared to the hopping strength, which indicates that delocalization will be strongly hindered in a random sequence of these base pairs.

The degree of hybridization of the π orbitals of the base-pair stack with the sugar and phosphate groups forming the backbone of the double strand has also been discussed.22 We have calculated the molecular orbitals from ab initio DFT of the combined system of the base molecule and the neighboring sugar and phosphate groups. For guanine and adenine, the hybridization is practically nonexistent, and the HOMO energies are very close to those obtained for the base molecule itself.
This can easily be understood from the shape of the HOMO, which for these two molecules has a node close to the nitrogen atom that connects the base molecule to the sugar moiety. Because of the absence of coupling between the guanine HOMO and the backbone, the guanine band is electronically unaffected by the backbone. Because most charge carriers are believed to originate from this band, we conclude that our model is sufficient to describe the static scattering mechanism of the electronic charge carriers in DNA. In the case of cytosine and thymine, the hybridization is stronger, and there is a shift in the HOMO energies toward higher energies. Again, the explanation for this behavior is obvious from the shape of the HOMO; cytosine and thymine have large amplitudes at the nitrogen site attached to the sugar. As a result of this hybridization, the HOMO energy values are now $\epsilon'_G = 0.0$ eV (reference), $\epsilon'_A = -0.31$ eV, $\epsilon'_C = -0.42$ eV, and $\epsilon'_T = -0.77$ eV. In comparison with the corresponding values presented above, this gives on-site energies that are closer to each other. The effect of this difference on the localization length will be commented on below.

The secular equation for the double-stranded DNA system is

$$T_{i-1}^{\dagger}\psi_{i-1} + (H_i - EI)\psi_i + T_i\psi_{i+1} = 0 \qquad (2)$$

where $T_i$ is a 2 × 2 matrix describing the intermolecular hopping interaction between base pair i and base pair i + 1 and $H_i$ is the (diagonal) Hamiltonian matrix containing elements $\epsilon_i^{1,2}$. The coefficients of the wave function in base pair i at energy E are described by $\psi_i$. From eq 2, we define the transfer matrix, which maps $(\psi_i, \psi_{i-1})$ onto $(\psi_{i+1}, \psi_i)$, as

$$\tau_i(E) = \begin{pmatrix} T_i^{-1}(EI - H_i) & -T_i^{-1}T_{i-1}^{\dagger} \\ I & 0 \end{pmatrix} \qquad (3)$$
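As an illustration of eqs 2 and 3, the following Python sketch assembles $H_i$, $T_i$, and $\tau_i(E)$ for a short stretch of base pairs. It is a minimal sketch under stated assumptions, not the implementation used for the results reported here: the function names are hypothetical, only the G-G hopping energies and the on-site energies are taken from the text, every other hopping value is an assumed placeholder chosen inside the quoted ranges, and the dependence of the real hopping parameters on strand direction is ignored.

```python
import numpy as np

# HOMO on-site energies in eV quoted in the text (guanine is the reference).
ONSITE = {"G": 0.0, "A": -0.33, "C": -0.54, "T": -0.91}
COMPLEMENT = {"G": "C", "C": "G", "A": "T", "T": "A"}

# Hopping energies in eV. Only the G-G values come from the text; the
# defaults are ASSUMED placeholders inside the quoted ranges.
INTRA = {("G", "G"): 0.084}      # t^{j,j}_{i,i+1}
INTER_12 = {("G", "G"): 0.019}   # t^{1,2}_{i,i+1}
INTER_21 = {("G", "G"): 0.043}   # t^{2,1}_{i,i+1}
DEFAULT_INTRA = 0.05
DEFAULT_INTER = 0.01


def onsite_matrix(base):
    """Diagonal H_i for a base pair whose strand-1 base is `base`."""
    return np.diag([ONSITE[base], ONSITE[COMPLEMENT[base]]])


def hopping_matrix(base, next_base):
    """T_i: rows index the strand at pair i, columns the strand at pair i+1."""
    a1, a2 = base, COMPLEMENT[base]
    b1, b2 = next_base, COMPLEMENT[next_base]
    return np.array([
        [INTRA.get((a1, b1), DEFAULT_INTRA), INTER_12.get((a1, b2), DEFAULT_INTER)],
        [INTER_21.get((a2, b1), DEFAULT_INTER), INTRA.get((a2, b2), DEFAULT_INTRA)],
    ])


def transfer_matrix(energy, h_i, t_i, t_prev):
    """tau_i(E) of eq 3, mapping (psi_i, psi_{i-1}) to (psi_{i+1}, psi_i)."""
    t_inv = np.linalg.inv(t_i)
    identity = np.eye(2)
    top = np.hstack([t_inv @ (energy * identity - h_i), -t_inv @ t_prev.conj().T])
    bottom = np.hstack([identity, np.zeros((2, 2))])
    return np.vstack([top, bottom])


# Example: strand 1 reads G, G, C; build tau for the middle base pair.
strand1 = "GGC"
energy = -0.05
h_mid = onsite_matrix(strand1[1])
t_prev = hopping_matrix(strand1[0], strand1[1])
t_mid = hopping_matrix(strand1[1], strand1[2])
tau = transfer_matrix(energy, h_mid, t_mid, t_prev)
```

Applying `tau` to any stacked vector $(\psi_i, \psi_{i-1})$ returns a $\psi_{i+1}$ that satisfies eq 2 by construction, which is a convenient consistency check on the signs and ordering of the blocks.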

The evolution of the wave functions along the 1D system is described by the product of the transfer matrices. This product contains information about the exponential decay of the wave functions in the (disordered) system. We use the transfer matrix method to calculate the localization length in DNA. The localization length, λ, is defined as23

$$\lambda = \frac{1}{\min(\gamma_i)} \qquad (4)$$

where $\gamma_i$ is the Lyapunov characteristic exponent that we calculated numerically by successive multiplications of transfer matrices:

$$\sum_{i=1}^{4} \gamma_i(E) = \lim_{n\to\infty} \frac{\ln|\tau_n \tau_{n-1} \cdots \tau_1 u|}{n} \qquad (5)$$

where u in the case of the double-stranded system is a matrix composed of four linearly independent vectors of dimension 4.24 The resulting values of the localization lengths all depend on the sequence of the base pairs. We calculate here the localization properties of computer-generated G-C and T-A sequences at different base pair concentrations both for completely random sequences and for different short-range-correlated distributions. The purpose of introducing short-range correlation is to study the effect of increasing sequences of identical bases. For the values of the on-site energies considered here, this is the only type of correlation that can affect the localization length (see below). The correlated sequence starts with a completely random sequence of base pairs. For each base pair, we generate a number that controls how many identical pairs will appear in a row. The length of this subsequence is generated from a normal distribution with standard deviation σ and expectation value 1. Only numbers larger than or equal to 1 in this distribution are used. For comparison, we have also calculated the localization length of different known DNA sequences (e.g., the human chromosome 22 (NT_011520) and λ-DNA (NC_001416)).25

Figure 1. Localization lengths (in units of the base-pair separation distance) as a function of energy for G-C DNA with (dashed line) and without (solid line) interstrand coupling.

Figure 2. Localization lengths (in units of the base-pair separation distance) as a function of energy for G-C DNA for a completely random rectangular distribution (solid line) and with different σ values: σ = 1 (dashed line), σ = 8 (dotted line).

First, we study the effect of interstrand hopping. The localization length of a disordered G-C sequence is calculated. The appearance of G and C along a single strand is completely random in this sequence. The result is shown in Figure 1. Without the interstrand interactions included, the electronic system consists of two uncoupled single strands with a random distribution of on-site energies $\epsilon_G = 0.0$ eV and $\epsilon_C = -0.54$ eV. Because the hopping interactions are small compared to this difference, the electronic states become very localized. With the interstrand interaction included, there is always the possibility of hopping from one G (or C) site to another. The localization length in the G and C bands is therefore increased significantly; the peak in the G band reaches λ ≈ 7 base pairs. The localization is in this case caused primarily by disorder in the hopping interactions; the intrastrand G-G hopping energy is $t_{i,i+1}^{j,j} = 0.084$ eV, and the two interstrand G-G hopping energies are $t_{i,i+1}^{1,2} = 0.019$ eV and $t_{i,i+1}^{2,1} = 0.043$ eV.20,21 These values are considerably larger than the corresponding C-C hopping energies, which explains the shorter localization lengths in the C band. As discussed above, there is also a small but nonzero hybridization of the C molecule with the neighboring sugar group. This hybridization shifts the energy of the HOMO level of C approximately 0.12 eV toward the HOMO level of G.
However, only a very small increase in the localization length appears as a result of the fact that the G and C HOMO levels in this case appear to be less separated. The peak in the G band now reaches λ ≈ 8 base pairs instead of λ ≈ 7. For the results presented below, this difference is even smaller, and we will neglect this effect in the rest of this work.
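The localization length defined in eqs 4 and 5 can be estimated numerically by propagating a set of orthonormal vectors through the transfer matrices and re-orthonormalizing at each step. The Python sketch below uses a standard QR-based scheme for this; it is a generic illustration and not necessarily the procedure of ref 24. The function names are hypothetical, and reading min(γ_i) in eq 4 as the smallest positive exponent is an assumption made here.

```python
import numpy as np


def lyapunov_exponents(matrices):
    """Estimate the Lyapunov exponents of an ordered matrix product.

    `matrices` holds tau_1, tau_2, ..., tau_n in the order they are applied.
    """
    matrices = list(matrices)
    dim = matrices[0].shape[0]
    q = np.eye(dim)                   # u: linearly independent start vectors
    log_growth = np.zeros(dim)
    for tau in matrices:
        # QR keeps the propagated vectors orthonormal and avoids overflow;
        # |diag(R)| records how much each direction was stretched.
        q, r = np.linalg.qr(tau @ q)
        log_growth += np.log(np.abs(np.diag(r)))
    return np.sort(log_growth / len(matrices))[::-1]


def localization_length(exponents, tol=1e-12):
    """lambda = 1 / min(gamma_i), using the smallest positive exponent."""
    positive = exponents[exponents > tol]
    return np.inf if positive.size == 0 else 1.0 / positive.min()


# Toy check with a constant matrix whose exponents are known: +ln 2 and -ln 2.
toy = [np.diag([2.0, 0.5])] * 200
gammas = lyapunov_exponents(toy)
length = localization_length(gammas)
```

For the double-stranded model, `matrices` would be the 4 × 4 transfer matrices of eq 3 evaluated at a fixed energy along a long sequence, and the returned length is in units of the base-pair separation.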

Figure 2 shows the localization length for G-C DNA generated randomly but with an increased probability that a G(C) is followed by a G(C). We generate two types of DNA sequences with different standard deviations (σ = 1 and 8, see above), where larger values of σ mean longer uniform sequences. The maximum localization length for σ = 8 reaches approximately 50 base pairs. The average length of a uniform G-C sequence is in this case 14 base pairs. In these calculations, G and C appear with equal probability in each strand. If the concentration of G(C) in strand 1(2) is increased, then the amount of disorder is reduced and the localization length consequently increases. With σ = 8 and 80% G in strand 1, the maximal localization length is approximately 70 base pairs. The average length of a uniform G-C sequence is in this case 35 base pairs. As shown in Figure 2, the localization length decreases rapidly with increasing distance from the center of the G band. Thus, most charge carriers in this band are very localized. This result points to the fact that electron localization will occur even for synthetic structures with quite long sequences of uniform stacking. This is in agreement with experimental observations of bandlike transport through a maximum of 30-60 base pairs of poly(G)-poly(C) DNA.5,7

Calculations of the localization lengths of A-T DNA do not give localization lengths that are as long as those in G-C DNA. With σ = 8, the electronic states at the center of the A and T bands are extended over approximately 12 base pairs. The shorter length in these bands compared to that in the G band occurs because of the smaller intra- and interstrand hopping energies between two adenine and two thymine molecules, respectively. The localization lengths in a DNA sequence containing all four bases are considerably shorter than in G-C DNA because the disorder in on-site energies is increased.
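The short-range-correlated G-C sequences described above can be generated along the following lines. This Python sketch is one possible reading of the procedure, with several assumptions: run lengths are drawn from a normal distribution with mean 1 and standard deviation σ, draws below 1 are rejected, accepted draws are truncated to integers (the text does not say how they are discretized), and all function and parameter names are hypothetical.

```python
import numpy as np


def run_length(rng, sigma):
    """Draw a run length from N(1, sigma), keeping only draws >= 1."""
    while True:
        draw = rng.normal(loc=1.0, scale=sigma)
        if draw >= 1.0:
            return int(draw)  # ASSUMPTION: truncate to an integer length


def correlated_strand(n_pairs, sigma, p_g=0.5, seed=0):
    """Strand-1 sequence of G and C with short-range correlation."""
    rng = np.random.default_rng(seed)
    bases = []
    while len(bases) < n_pairs:
        # Pick the next base at random, then repeat it for one run.
        base = "G" if rng.random() < p_g else "C"
        bases.extend(base * run_length(rng, sigma))
    return "".join(bases[:n_pairs])


def mean_uniform_length(strand):
    """Average length of maximal runs of identical bases."""
    changes = sum(a != b for a, b in zip(strand, strand[1:]))
    return len(strand) / (changes + 1)


weak = correlated_strand(20000, sigma=1.0)
strong = correlated_strand(20000, sigma=8.0)
```

Because two consecutive runs can carry the same base, the average uniform length returned by `mean_uniform_length` is longer than the average drawn run length; raising `p_g` above 0.5 mimics the increased G concentration in strand 1 discussed in the text.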
Figure 3 shows the maximum localization lengths in the G band for different concentrations of G-C pairs. In all of these calculations, the G-C (T-A) and C-G (A-T) ordering appear with equal probability. A G-C concentration of 0.5 (50%) corresponds to equal concentrations of G-C and T-A base pairs.

Figure 3. Maximum G-band localization lengths (in units of the base-pair separation distance) for different G-C (T-A) concentrations for a completely random rectangular distribution (solid line) and with different standard deviations when generating the sequence: σ = 1 (dashed line), σ = 8 (dotted line).

We note that the localization lengths are very short in this regime. Below 90% concentration of G-C pairs, the localization lengths are shorter than 5, and it is not until 95-100% that we see a significant increase in the localization length. DNA sequences containing all four bases will therefore act as insulators. For example, λ-DNA (NC_001416), which has a G-C concentration of approximately 50%,25 has been reported to show insulating behavior.9 Another example is the human chromosome 22 (NT_011520), which has a G-C concentration of approximately 47%.25 Our calculated localization lengths in the G band of this system do not even reach the separation length between two adjacent base pairs. This behavior is a natural consequence of the large variations in the individual on-site energies of the four bases and the different hopping energies between the different base combinations. Therefore, the long-range correlation that exists in human chromosome 22 has no effect on the electronic properties of this system, a result that is in disagreement with the results presented in ref 18. We conclude that this difference is due to the fact that we use values of the parameters based on ab initio results, whereas the parameters used in ref 18 are rather far from these values.

As can be seen in Figure 3, the localization lengths for the completely random sequence exceed those of the correlated sequences below a G-C concentration of about 87%. In this case, correlation results in longer potential (A-T) barriers between the (short) uniform G-C sequences. In the completely random sequence, the barriers are in general shorter, which allows for an extension of the orbitals across such barriers. As the concentration increases, the A-T barriers in the correlated sequences become shorter, and the localization lengths in the G band become substantially longer than the average length of a uniform G-C sequence.

In conclusion, we have shown that for a DNA sequence containing all four bases the electronic state is localized to very few base molecules. In G-C DNA, the localization length can reach 50 base pairs if the G(C) base appears in groups rather than completely randomly. We have also shown that even though human chromosome 22 has a long-range correlation of the distribution of bases, the localization length is still short; we attribute this to our more detailed description of the interaction energies compared with that presented in ref 18.

Acknowledgment. We acknowledge financial support from the Swedish Research Council. The National Supercomputer Center (NSC) is gratefully acknowledged for allowing us to use their computer facilities.

References
(1) Dandliker, P. J.; Núñes, M. E.; Barton, J. K. Biochemistry 1998, 37, 6491.
(2) Braun, E.; Eichen, Y.; Sivan, U.; Ben-Yoseph, G. Nature 1998, 391, 775.
(3) Ben-Jacob, E.; Hermon, Z.; Caspi, S. Phys. Lett. A 1999, 263, 199.
(4) Fink, H. W.; Schönenberger, C. Nature 1999, 398, 407.
(5) Porath, E.; Bezryadin, A.; Vries, S. D.; Dekker, C. Nature 2000, 403, 635.
(6) Yoo, K. H.; Ha, D. H.; Lee, J. O.; Park, J. W.; Kim, J.; Kim, J. J.; Lee, H. Y.; Kawai, T.; Choi, H. Y. Phys. Rev. Lett. 2001, 87, 198102.
(7) Hwang, J. S.; Kong, K. J.; Ahn, D.; Lee, G. S.; Ahn, D. J.; Hwang, S. W. Appl. Phys. Lett. 2002, 81, 1134.
(8) de Pablo, P. J.; Moreno-Herrero, F.; Colchero, J.; Herrero, J. G.; Herrero, P.; Baró, A. M.; Ordejón, P.; Soler, J. M.; Artacho, E. Phys. Rev. Lett. 2000, 85, 4992.
(9) Zhang, Y.; Austin, R. H.; Kraeft, J.; Cox, E. C.; Ong, N. P. Phys. Rev. Lett. 2002, 89, 198102.
(10) Hjort, M.; Stafström, S. Phys. Rev. Lett. 2001, 87, 228101.
(11) Boudaïffa, B.; Cloutier, P.; Hunting, D.; Huels, M. A.; Sanche, L. Science 2000, 287, 1658.
(12) Gao, H.; Kong, Y.; Cui, D.; Ozkan, C. S. Nano Lett. 2003, 3, 471.
(13) Basko, D. M.; Conwell, E. M. Phys. Rev. Lett. 2002, 88, 098102.
(14) Lewis, F. D.; Liu, X.; Liu, J.; Miler, S. E.; Hayes, R. T.; Wsiewski, M. R. Nature 2000, 406, 51.
(15) Giese, B.; Amaudrut, J.; Köhler, A.-K.; Spormann, M.; Wessely, S. Nature 2001, 412, 318.
(16) Bloch, F. Z. Physik 1928, 52, 555.
(17) Anderson, P. W. Phys. Rev. 1958, 109, 1492.
(18) Carpena, P.; Bernaola-Galván, P.; Ivanov, P. C.; Stanley, H. E. Nature 2002, 418, 955.
(19) Holste, E.; Grosse, I.; Herzel, H. Phys. Rev. E 2001, 64, 041917.
(20) Voityuk, A. A.; Rösch, N.; Bixon, M.; Jortner, J. J. Phys. Chem. B 2000, 104, 9740.
(21) Voityuk, A. A.; Bixon, M.; Rösch, N. J. Chem. Phys. 2001, 114, 5614.
(22) Cuniberti, G.; Craco, L.; Porath, D.; Dekker, C. Phys. Rev. B 2002, 65, 241314(R).
(23) Pichard, J. L.; Sarma, G. J. Phys. C 1981, 14, 127.
(24) Benettin, G.; Galgani, L. In Intrinsic Stochasticity in Plasmas; Laval, G., Grésillon, D., Eds.; Editions de Physique: Orsay, France, 1979; p 93.
(25) National Center for Biotechnology Information. http://www.ncbi.nlm.nih.gov/.

NL034631D
