
Article

QM and QM/MM Studies on Excited-State Relaxation Mechanisms of Unnatural Bases in Vacuo and Base Pair in DNA Qian Wang, Xiao-Ying Xie, Juan Han, and Ganglong Cui J. Phys. Chem. B, Just Accepted Manuscript • DOI: 10.1021/acs.jpcb.7b09046 • Publication Date (Web): 30 Oct 2017 Downloaded from http://pubs.acs.org on November 13, 2017


The Journal of Physical Chemistry B is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


The Journal of Physical Chemistry

QM and QM/MM Studies on Excited-State Relaxation Mechanisms of Unnatural Bases in Vacuo and Base Pair in DNA Qian Wang, Xiao-Ying Xie, Juan Han,∗ and Ganglong Cui∗ Key Laboratory of Theoretical and Computational Photochemistry, Ministry of Education, College of Chemistry, Beijing Normal University, Beijing 100875, China E-mail: [email protected]; [email protected]

Abstract

A semisynthetic alphabet can potentially increase the genetic information stored in DNA through the formation of unusual base pairs such as d5SICS:dNaM. However, recent experiments show that near-visible-light irradiation of the d5SICS and dNaM chromophores could lead to genetic mutations and damage. Until now, their photophysical mechanisms have remained elusive. Herein we have employed MS-CASPT2//CASSCF and QM(MS-CASPT2//CASSCF)/MM methods to explore the spectroscopic properties and excited-state relaxation mechanisms of d5SICS and dNaM in vacuo and of d5SICS:dNaM in DNA. We have found that (1) the S2 state of d5SICS, the S1 state of dNaM, and the S2 state of d5SICS:dNaM are initially populated upon near-visible-light irradiation; and (2) for d5SICS and d5SICS:dNaM, there are several parallel relaxation pathways that populate the lowest triplet state, whereas for dNaM a single main relaxation pathway is uncovered. Moreover, we have found that the excited-state relaxation mechanism of d5SICS:dNaM in DNA is similar to that of the isolated d5SICS chromophore. These mechanistic insights contribute to the understanding of the photophysics and photochemistry of unusual base pairs and to the design of better semisynthetic genetic alphabets.

∗ To whom correspondence should be addressed


Introduction

The natural genetic alphabet is composed of four letters, usually referred to as G, C, A, and T. Their selective pairing into two base pairs, i.e. A:T and C:G, through complementary hydrogen-bond formation underlies the storage and retrieval of virtually all biological information. This alphabet is essentially conserved throughout nature. 1 To expand the alphabet, significant effort has been dedicated to developing unnatural base pairs (UBPs) between two synthetic nucleotides, because semisynthetic organisms that stably harbor these UBPs in DNA could store and potentially retrieve the increased information. 2–12 In the past decades, Romesberg and coworkers have developed a family of predominantly hydrophobic UBPs, among which the UBP between dNaM and d5SICS is a particularly promising example. 13–19 This dNaM-d5SICS UBP has been demonstrated to be well replicated by a variety of DNA polymerases in vitro. Very recently, they have used genetic and chemical approaches to optimize different components of semisynthetic organisms, eventually obtaining organisms that grow robustly and are capable of virtually unrestricted storage of increased information. 20

However, it is not known whether these UBPs are as photostable as the natural nucleobases. 21–24 In 2016, Pollum et al. first explored the photochemical properties of d5SICS and dNaM in aqueous and acetonitrile solutions. 25 They found that near-visible excitation of d5SICS and dNaM results in efficient population of the lowest excited triplet state in high yield (d5SICS: ca. 0.94; dNaM: ca. 0.65 in acetonitrile). In addition, photoactivation of these long-lived triplet states was shown to photosensitize cells and generate reactive oxygen species, i.e. singlet oxygen, which can cause genetic mutation and DNA photodamage (quantum yields of singlet oxygen: d5SICS: ca. 0.42; dNaM: ca. 0.23 in acetonitrile). Recently, Ashwood et al. employed steady-state absorption and emission spectroscopies combined with femtosecond broadband transient absorption spectroscopy to explore the excited-state relaxation pathways. 26 However, the detailed photophysical mechanisms of d5SICS and dNaM remain elusive.

Computationally, Bhattacharyya and Datta have very recently employed a series of electronic structure calculations and trajectory-based surface-hopping dynamics to explore the vertical excitation energies of d5SICS and dNaM and their excited-state decay pathways leading to the population of the lowest triplet state. 27 They proposed that both unnatural bases, i.e. d5SICS and dNaM, are first excited into the S2 state, followed by an S2 → S1 internal conversion. Then, an S1 → T2 intersystem crossing takes the system into the triplet manifold. Finally, internal conversion within the triplet manifold leads to the formation of the lowest triplet state. It is worth noting that all of those calculations were performed in the gas phase, without considering solvent effects or the DNA surroundings (except for the vertical excitation energies, which were also computed with the TD-DFT/PCM model). To fill this gap, in this work we have for the first time employed the quantum mechanics/molecular mechanics (QM/MM) method combined with the MS-CASPT2//CASSCF method to explore the photophysical mechanism of the unnatural d5SICS:dNaM base pair in realistic DNA surroundings. Spectroscopic properties, minima, intersection structures, and excited-state decay paths in and between S0, T1(³ππ*), T2(³nπ*), S1(¹nπ*), and S2(¹ππ*) are systematically studied. Moreover, to explore the effects of the DNA surroundings on geometric and electronic structures and on excited-state decay pathways, we have also employed the same computational method to study the corresponding photophysics of the isolated d5SICS and dNaM chromophores.

Figure 1: The unusual d5SICS (blue):dNaM (red) base pair incorporated in DNA. Also shown is the QM/MM partition used in the calculations. The QM region includes both d5SICS and dNaM chromophores.


Computational Details

System Setup

The initial double-stranded DNA structure with the 5’-CACAATTCC-3’:3’-GTGTTAAGG-5’ sequence is constructed using the NAB module in the AMBER2015 package. 28 The dNaM and d5SICS chromophores are then attached to the backbone by replacing the fifth A:T base pair using the XLEAP module. This modified DNA structure is solvated in a 12×12×12 Å cubic water box (4002 water molecules). Then, 16 sodium ions are added to neutralize the system (12605 atoms in total). Subsequently, the ions and water molecules are first minimized; then, the entire system is minimized (2500 steps), heated (20 ps), and equilibrated (1 ns, NPT ensemble at 300 K). In the MD simulations, the force-field parameters of d5SICS and dNaM are generated using the generalized Amber force field (GAFF); 29 the nucleic acids are described with the Amber ff99 force field; 30 water molecules are described using the TIP3P model. 31 The Andersen thermostat 32 and the Berendsen barostat 33 are used to control temperature and pressure, and periodic boundary conditions are applied. A cutoff of 10 Å is employed to truncate the nonbonded interactions. The final snapshot from the MD simulations is chosen as the starting structure for the subsequent QM/MM electronic structure calculations.

QM and QM/MM Methods

All electronic structure calculations for the d5SICS:dNaM base pair in the DNA surroundings are carried out at the QM/MM level. 34–37 The QM subsystem includes the d5SICS and dNaM chromophores (21 and 22 atoms including the link atoms, respectively), while the MM subsystem consists of the remaining DNA structure and all ions and water molecules (12562 atoms; see Fig. 1). The QM region is treated using the complete-active-space self-consistent field (CASSCF) method and the multi-state complete-active-space second-order perturbation (MS-CASPT2) approach; 38,39 the MM subsystem is treated using the Amber ff99 force field (DNA) 30 and the TIP3P model (waters). 31 The QM-MM boundary is treated by the hydrogen link-atom scheme. 35 The electrostatic embedding scheme is adopted in our QM/MM calculations. 40 In the QM/MM calculations, the atoms within 5 Å of any atom of the fourth, fifth, and sixth base pairs of the DNA structure are allowed to move (1019 atoms). The other atoms are kept frozen at their positions from the end of the 1 ns MD simulation.

In the CASSCF calculations for the isolated d5SICS and dNaM chromophores, an active space of 10 electrons in 8 orbitals is used, while in the QM(CASSCF)/MM calculations for the d5SICS:dNaM base pair in DNA, a larger active space of 12 electrons in 10 orbitals is used (see Fig. S1). In all MS-CASPT2 and QM(MS-CASPT2)/MM calculations, 38,39 the same active space of 14 electrons in 12 orbitals is used (see Fig. S1); the Cholesky decomposition technique with unbiased auxiliary basis sets is used for accurate two-electron integral approximations; 41 the imaginary shift technique (0.2 a.u.) is used to avoid the intruder-state issue; 42 and the ionization potential electron affinity (IPEA) shift is set to zero. 43 At the same computational level, spin-orbit coupling matrix elements are calculated with the atomic mean-field approximation, 44,45 based on which the effective spin-orbit coupling $\langle \Psi_I | H^{SO}_{\mathrm{eff}} | \Psi_J \rangle$ reported in the text is calculated as

$$
\langle \Psi_I | H^{SO}_{\mathrm{eff}} | \Psi_J \rangle = \sqrt{\frac{\langle \Psi_I | H^{SO}_{x} | \Psi_J \rangle^{2} + \langle \Psi_I | H^{SO}_{y} | \Psi_J \rangle^{2} + \langle \Psi_I | H^{SO}_{z} | \Psi_J \rangle^{2}}{3}} \tag{1}
$$

in which $\Psi_I$ and $\Psi_J$ are the electronic wavefunctions of the two electronic states involved and $H^{SO}_{x}$, $H^{SO}_{y}$, and $H^{SO}_{z}$ are the x, y, and z components of the spin-orbit operator. The 6-31G* basis set is used in all calculations in this work. 46,47 All QM and QM/MM calculations are carried out using the MOLCAS8.0 package 48–51 interfaced with the TINKER6.3.2 package. 52
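As a minimal sketch of Eq. (1), the effective coupling is the root mean square of the three Cartesian spin-orbit matrix elements. The function name and the sample component values below are illustrative assumptions, not quantities taken from this work; taking the modulus of each component, so that complex matrix elements are handled, is likewise an assumption.

```python
import math


def effective_soc(h_x, h_y, h_z):
    """Root mean square of the x, y, z spin-orbit matrix elements (Eq. 1).

    The modulus is taken before squaring so that complex-valued
    matrix elements can be passed in (an assumption of this sketch).
    """
    return math.sqrt((abs(h_x) ** 2 + abs(h_y) ** 2 + abs(h_z) ** 2) / 3.0)


# Hypothetical component values in cm^-1, for illustration only.
print(effective_soc(30.0, 40.0, 50.0))
```

When the three components are equal in magnitude, the effective coupling reduces to that common magnitude, which is a quick sanity check on the normalization by 3.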

Results and Discussion

d5SICS

Vertical excitation energies provide important information for understanding the electronic structure in the Franck-Condon region; thus, we have first explored these excited-state properties of d5SICS at its optimized S0 structure (see Fig. 2). The lowest four electronically excited singlet and triplet states are T1(³ππ*), T2(³nπ*), S1(¹nπ*), and S2(¹ππ*). The vertical excitation energies to these four excited states at the Franck-Condon point are computed to be 59.1, 69.2, 70.6, and 76.5 kcal/mol at the MS-CASPT2//CASSCF level, respectively. The corresponding oscillator strengths of the S1(¹nπ*) and S2(¹ππ*) states are computed to be 0.00004 and 0.2569, respectively, which shows that the former is a very weakly absorbing state, whereas the latter is a spectroscopically bright excited singlet state. Electronic structure analysis shows that the S1(¹nπ*) state mainly originates from the HOMO-1 → LUMO electronic configuration with a weight of 0.70, in which HOMO-1 corresponds to the lone-pair molecular orbital of the S atom and LUMO is a delocalized antibonding π* molecular orbital; the S2(¹ππ*) state stems from the HOMO → LUMO electronic configuration with a weight of 0.64, in which HOMO corresponds to the π molecular orbital of the S atom (see panel a of Fig. 3). Experimentally, the d5SICS chromophore has an absorption maximum at 365 and 371 nm [78.3 and 77.1 kcal/mol] in aqueous and acetonitrile solution, respectively, 22 which is close to our MS-CASPT2 computed S0 → S2 vertical excitation energy of 76.5 kcal/mol at the Franck-Condon point in the gas phase. In comparison, previous MS-CASPT2 and TD-DFT/PCM calculations give higher S0 → S2 vertical excitation energies, about 97.5 and 84.4 kcal/mol, respectively. 26,27

Figure 2: CASSCF optimized minimum-energy structures of the isolated d5SICS chromophore in the S0, T1(³ππ*), T2(³nπ*), S1(¹nπ*), and S2(¹ππ*) states. Also shown are selected bond lengths (in Å). The chosen atom numbering scheme is indicated.

Figure 3: MS-CASPT2 computed molecular orbitals relevant to the vertical electronic excitations to the (top) S1(¹nπ*) and (bottom) S2(¹ππ*) states of the isolated d5SICS chromophore (left); to the S1(¹ππ*) state of the isolated dNaM chromophore (right). Also shown are the weights of the main electronic configurations involved.

The minimum-energy structures of d5SICS in the lowest four electronically excited states, i.e. T1(³ππ*), T2(³nπ*), S1(¹nπ*), and S2(¹ππ*), have been optimized at the CASSCF level without any geometry restraints. All these optimized structures are essentially planar (see Fig. 2). Fig. 4 shows the bond-length differences of the T1(³ππ*), T2(³nπ*), S1(¹nπ*), and S2(¹ππ*) minima relative to the S0 minimum. In the bond labels below, S1 and C2 denote atoms in the numbering scheme of Fig. 2, not electronic states. In the S2(¹ππ*) minimum, the S1-C2 and C2-C7 bond lengths change by more than 0.1 Å and the C4-C5, C5-C6, C6-C7, and C9-C10 bond lengths change by more than 0.05 Å; the largest change is in the S1-C2 bond (ca. 0.13 Å). In comparison, the structural changes of the T1(³ππ*), T2(³nπ*), and S1(¹nπ*) minima relative to the S0 minimum are similar to each other but differ from those of the S2(¹ππ*) minimum. Specifically, the S1-C2, C2-C7, C10-C11, and C6-C11 bond lengths vary remarkably; however, only the S1-C2 bond length increases by more than 0.1 Å (see Fig. 4). The potential energies of the T1(³ππ*), T2(³nπ*), S1(¹nπ*), and S2(¹ππ*) minima relative to that of the S0 minimum are calculated to be 54.5, 60.6, 63.3, and 70.9 kcal/mol at the MS-CASPT2//CASSCF level, respectively.

Figure 4: CASSCF computed bond-length differences (in Å) of the T1(³ππ*), T2(³nπ*), S1(¹nπ*), and S2(¹ππ*) minima of the isolated d5SICS chromophore relative to the counterparts of the S0 minimum.

Figure 5: MS-CASPT2 computed linearly interpolated internal coordinate paths (in kcal/mol) connecting (A) the S2(¹ππ*) and S1(¹nπ*) minima, (B) the S1(¹nπ*) and T1(³ππ*) minima, (C) the S2(¹ππ*) and T2(³nπ*) minima, and (D) the T2(³nπ*) and T1(³ππ*) minima.

To explain the experimentally observed high triplet quantum yield of d5SICS, 25 we have explored the possible nonadiabatic excited-state decay paths from the initially populated S2(¹ππ*) state to the T1(³ππ*) state. At the S2(¹ππ*) minimum, the S2(¹ππ*), T2(³nπ*), and S1(¹nπ*) states are close to each other, 70.9 vs. 69.1 vs. 69.7 kcal/mol at the MS-CASPT2 level, as shown in panels a and c of Fig. 5. Thus, the internal conversion from the S2(¹ππ*) to the S1(¹nπ*) state is efficient due to the very small energy gap; similarly, the intersystem crossing from the S2(¹ππ*) to the T2(³nπ*) state is also fast because of the very large S2/T2 spin-orbit coupling (83.0 cm⁻¹ at the MS-CASPT2 level; see Table S5). Once the S1(¹nπ*) state is reached from the S2(¹ππ*) state, the system can further hop to the lowest T1(³ππ*) state quickly through the S1(¹nπ*) → T1(³ππ*) intersystem crossing because of the small S1-T1 energy gap (see panel b of Fig. 5) and the very large S1/T1 spin-orbit coupling (90.1 cm⁻¹ at the MS-CASPT2 level; see Table S5). Alternatively, when the T2(³nπ*) state is populated from the S2(¹ππ*) state, the system can further hop to the T1(³ππ*) state via vibronic coupling, owing to the small energy gap in an extended region (see panel d of Fig. 5). In addition, there is some probability that the system hops to the T2(³nπ*) state from the S1(¹nπ*) state because both electronic states are close to each other in an extended region (ca. 2.0 kcal/mol; see panel b of Fig. 5). The T2 system then decays to the T1 state through a T2 → T1 internal conversion. Nonetheless, one should note that this S1 → T2 intersystem crossing should be inefficient due to the very small spin-orbit coupling, because these two electronic states share similar electronic structure character (this complies with the classical El-Sayed rule 53,54). Taken together, these nonadiabatic excited-state decay pathways are mainly responsible for the efficient population of the T1(³ππ*) state.
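The El-Sayed argument used for d5SICS can be sketched as a simple qualitative check: an intersystem crossing is expected to carry a sizable spin-orbit coupling only when the orbital character changes between the two states. The state characters below are those assigned in the text; the dictionary layout and the function name are hypothetical, and the check is only a rule of thumb that does not replace the computed couplings.

```python
# Spin multiplicity and orbital character of the d5SICS states, as assigned
# in the text. The data layout is an illustrative assumption.
STATES = {
    "S1": ("singlet", "n-pi*"),
    "S2": ("singlet", "pi-pi*"),
    "T1": ("triplet", "pi-pi*"),
    "T2": ("triplet", "n-pi*"),
}


def el_sayed_favored(initial, final):
    """Return True if the intersystem crossing changes the orbital character."""
    mult_i, char_i = STATES[initial]
    mult_f, char_f = STATES[final]
    if mult_i == mult_f:
        raise ValueError("not an intersystem crossing: same spin multiplicity")
    return char_i != char_f


for start, end in [("S2", "T2"), ("S1", "T1"), ("S1", "T2")]:
    verdict = "favored" if el_sayed_favored(start, end) else "disfavored"
    print(f"{start} -> {end}: {verdict}")
```

In this sketch the S2 → T2 and S1 → T1 crossings come out as favored and S1 → T2 as disfavored, in line with the large and small couplings discussed above.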

dNaM

Different from the situation of d5SICS, the lowest three electronically excited singlet and triplet states, i.e. T1, T2, and S1, at the Franck-Condon point (the S0 minimum) are all of ππ* character. At the MS-CASPT2//CASSCF level, the vertical excitation energies to these three excited states are calculated to be 67.9, 85.5, and 90.8 kcal/mol, respectively, which are much higher than those of d5SICS (see above). Moreover, the S1(¹ππ*) state is also a spectroscopically bright state, with a comparable oscillator strength of 0.1 for the S0 → S1 electronic transition at the MS-CASPT2 level. Further examination of the electronic structure reveals that the S1(¹ππ*) state stems from a combination of electronic configurations, HOMO-4 → LUMO (weight: 0.27) and HOMO-2 → LUMO+1 (weight: 0.25), as shown in panel b of Fig. 3. Experimentally, the absorption maximum of the dNaM chromophore is blue-shifted compared with that of d5SICS and is measured to be 325 nm [88.0 kcal/mol] in aqueous and acetonitrile solution. 25
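The bracketed energies quoted alongside experimental wavelengths follow from E = hcN_A/λ. A minimal sketch of this conversion is given here, using the exact SI values of the constants and the thermochemical kilocalorie; the function name is an illustrative assumption.

```python
PLANCK = 6.62607015e-34       # Planck constant, J s
LIGHT_SPEED = 2.99792458e8    # speed of light, m / s
AVOGADRO = 6.02214076e23      # Avogadro constant, 1 / mol
JOULE_PER_KCAL = 4184.0       # thermochemical kilocalorie


def nm_to_kcal_per_mol(wavelength_nm):
    """Convert a photon wavelength in nm to a molar energy in kcal/mol."""
    wavelength_m = wavelength_nm * 1e-9
    energy_j_per_mol = PLANCK * LIGHT_SPEED * AVOGADRO / wavelength_m
    return energy_j_per_mol / JOULE_PER_KCAL


for nm in (325, 365, 371):
    print(nm, "nm ->", round(nm_to_kcal_per_mol(nm), 1), "kcal/mol")
```

For 325 nm this gives about 88.0 kcal/mol, and for 365 and 371 nm about 78.3 and 77.1 kcal/mol, matching the bracketed values quoted for dNaM and d5SICS.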


Figure 6: CASSCF optimized minimum-energy structures of the isolated dNaM chromophore in the S0, S1(¹ππ*), T2(³ππ*), and T1(³ππ*) states. Also shown are selected bond lengths (in Å). The chosen atom numbering scheme is indicated.


Figure 7: CASSCF computed bond-length differences (in Å) of the T1(³ππ*), T2(³ππ*), and S1(¹ππ*) minima of the isolated dNaM chromophore relative to the counterparts of the S0 minimum.

On the basis of our present results, the experimental absorption peak of dNaM at 325 nm is assigned to the S0 → S1 vertical excitation, whose energy is estimated to be 90.8 kcal/mol at the MS-CASPT2 level. The corresponding S0 → S1 vertical excitation energy is computed to be 98.9 and 97.5 kcal/mol in previous MS-CASPT2 and TD-DFT calculations, respectively. 26,27

In addition to the S0 minimum, the minimum-energy structures in the S1, T1, and T2 states have been optimized at the CASSCF level and are collected in Fig. 6. Compared with the S0 minimum, in the S1(¹ππ*) minimum the C6-C7 bond length changes most significantly, increasing from 1.416 to 1.495 Å; the C4-C5 and C11-C12 bond lengths also vary considerably (see Fig. 7). In the T2(³ππ*) minimum, the C4-C5, C5-C6, and C6-C7 bond lengths change remarkably (e.g. 0.059 Å for C4-C5; 0.069 Å for C5-C6; see Fig. 7). By contrast, in the T1(³ππ*) minimum, the C3-C4, C4-C5, C8-C3, C9-C10, C10-C11, and C11-C12 bond lengths change by more than 0.05 Å. Energetically, the potential energies of the T1, T2, and S1 minima relative to that of the S0 minimum are calculated to be 59.0, 83.0, and 86.3 kcal/mol at the MS-CASPT2 level, respectively.


Figure 8: MS-CASPT2 computed linearly interpolated internal coordinate paths (in kcal/mol) of the isolated dNaM chromophore connecting (A) the S1(¹ππ*) and T2(³ππ*) minima, and (B) the T2(³ππ*) and T1(³ππ*) minima.

The photophysics of dNaM upon excitation to the S1 state is very different from that of d5SICS. From the S1 state, the dNaM system will first hop to the T2 state because of the small S1-T2 energy gap in an extended region (ca. 4.0 kcal/mol at the MS-CASPT2 level; see the linearly interpolated internal coordinate (LIIC) path connecting the S1 and T2 minima in panel a of Fig. 8). However, this S1 → T2 intersystem crossing may not be very efficient due to the very small S1/T2 spin-orbit coupling at the S1(¹ππ*) minimum. This small magnitude is consistent with the El-Sayed rule, which states that spin-orbit couplings between states of similar electronic character are small. 53,54 After the T2 state is reached, the internal conversion from the T2 to the T1 state should also be inefficient due to the large energy gap. As indicated by the LIIC path connecting the T2(³ππ*) and T1(³ππ*) minima in panel b of Fig. 8, the smallest energy gap along this path is computed at the MS-CASPT2 level to be 15.3 kcal/mol at the T2(³ππ*) minimum. Therefore, it is reasonable to expect that the population of the T1 state in dNaM is not as efficient as that in d5SICS (however, the population of the T2 state should be considerable due to the small energy gap between S1 and T2). This is also consistent with recent experiments 26 in which a very high fluorescence quantum yield, ca. 0.46 [0.09], is observed for dNaM in acetonitrile [PBS] solution; in stark contrast, the fluorescence quantum yield of d5SICS in both solutions is estimated to be around 10⁻⁵.


Figure 9: MS-CASPT2 computed molecular orbitals relevant to the vertical electronic excitations to the (top) S1(¹nπ*) and (bottom) S2(¹ππ*) states of the d5SICS:dNaM base pair in DNA. Also shown are the weights of the main electronic configurations involved.

Figure 10: CASSCF optimized minimum-energy structures of the d5SICS:dNaM base pair in DNA in the S0, T1(³ππ*), T2(³nπ*), S1(¹nπ*), and S2(¹ππ*) states. Also shown are selected bond lengths (in Å). The chosen atom numbering scheme is indicated.


d5SICS:dNaM in DNA

In addition to the isolated d5SICS and dNaM chromophores, the excited-state properties and the relaxation channels to the lowest triplet state of the d5SICS:dNaM base pair in DNA have been explored at the QM/MM level, in which both chromophores are treated simultaneously at the MS-CASPT2//CASSCF level (see Computational Details). Interestingly, at the S0 minimum, the lowest four electronically excited singlet and triplet states, i.e. T1(³ππ*), T2(³nπ*), S1(¹nπ*), and S2(¹ππ*), originate nearly exclusively from local electronic excitations of the d5SICS chromophore, without visible contribution from the dNaM chromophore. At the QM(MS-CASPT2)/MM level, the vertical excitation energies to T1(³ππ*), T2(³nπ*), S1(¹nπ*), and S2(¹ππ*) are estimated to be 65.7, 70.0, 71.4, and 84.7 kcal/mol, respectively. In comparison with those of the isolated d5SICS chromophore in vacuo, the vertical excitation energies to the T1(³ππ*) and S2(¹ππ*) states change considerably: for T1(³ππ*), from 59.1 kcal/mol in vacuo to 65.7 kcal/mol in DNA; for S2(¹ππ*), from 76.5 kcal/mol in vacuo to 84.7 kcal/mol in DNA. By contrast, the vertical excitation energies to the T2(³nπ*) and S1(¹nπ*) states are nearly unchanged, e.g. 70.6 kcal/mol in vacuo vs. 71.4 kcal/mol in DNA for S1(¹nπ*). Further analysis shows that the electronic structures of these four electronically excited states of d5SICS:dNaM in DNA are similar to those of d5SICS in vacuo (see Fig. 9).

The minima in the T1(³ππ*), T2(³nπ*), S1(¹nπ*), and S2(¹ππ*) states of d5SICS:dNaM in DNA are separately optimized at the QM(CASSCF)/MM level, and their schematic structures are collected in Fig. 10. Fig. 11 shows the bond-length differences of the T1(³ππ*), T2(³nπ*), S1(¹nπ*), and S2(¹ππ*) minima relative to the S0 minimum. These structural changes are located exclusively within the d5SICS chromophore, and the contribution of the dNaM chromophore is negligible (see panel b of Fig. 11). This is consistent with the electronic structure character of these electronically excited singlet and triplet states (see above). Compared with the S0 minimum, in the S2(¹ππ*) minimum the S1-C2, C6-C7, C2-C7, and C7-C8 bond lengths change considerably (in particular, the S1-C2 and C2-C7 bonds, by more than 0.1 Å); in the T1(³ππ*), T2(³nπ*), and S1(¹nπ*) minima, the S1-C2 bond length shows the largest change and the other bond lengths vary modestly.
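The environmental shifts discussed above can be tabulated directly from the quoted vertical excitation energies. The sketch below only subtracts the in vacuo values of d5SICS from the values for the base pair in DNA; the variable and function names are illustrative assumptions.

```python
# Vertical excitation energies in kcal/mol, as quoted in the text.
IN_VACUO = {"T1": 59.1, "T2": 69.2, "S1": 70.6, "S2": 76.5}   # isolated d5SICS
IN_DNA = {"T1": 65.7, "T2": 70.0, "S1": 71.4, "S2": 84.7}     # d5SICS:dNaM in DNA


def environment_shifts(isolated, embedded):
    """Shift of each vertical excitation energy on going from vacuum to DNA."""
    return {state: round(embedded[state] - isolated[state], 1) for state in isolated}


shifts = environment_shifts(IN_VACUO, IN_DNA)
for state, shift in shifts.items():
    print(f"{state}: {shift:+.1f} kcal/mol")
```

The two ππ* states (T1 and S2) shift by 6.6 and 8.2 kcal/mol, whereas the two nπ* states (T2 and S1) shift by only 0.8 kcal/mol each.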


Figure 11: CASSCF computed bond-length differences (in Å) of the T1 (3 ππ ∗ ), T2 (3 nπ ∗ ), S1 (1 nπ ∗ ), and S2 (1 ππ ∗ ) minima of the d5SICS:dNaM base pair in DNA relative to the counterparts of the S0 minimum. On the energetic side, the potential energies of the T1 (3 ππ ∗ ), T2 (3 nπ ∗ ), S1 (1 nπ ∗ ), and S2 (1 ππ ∗ ) minima relative to that of the S0 minimum are calculated to be 60.7, 64.7, 64.9, and 70.6 kcal/mol at QM(MS-CASPT2)/MM level, respectively. On the basis of the MS-CASPT2//CASSCF computed linearly interpolated internal coordinate paths, as shown in Fig. 12, it can be found that the excited-state relaxation pathways of d5SICS:dNaM in DNA that populate the lowest T1 (3 ππ ∗ ) triplet state from the initially populated S2 (1 ππ ∗ ) state is similar to those of the isolated d5SICS chromophore. There also exist three main excited-state decay paths. The first one is from S2 (1 ππ ∗ ) via S1 (1 nπ ∗ ) to T1 (3 ππ ∗ ); the second one is from S2 (1 ππ ∗ ) via T2 (3 nπ ∗ ) to T1 (3 ππ ∗ ). In these paths, the involved intersystem crossing processes are allowed by the classical EI-Sayed rule 53,54 and are thus associated with the large spin-orbit couplings, which are computed to be 92.1 cm−1 for the S1 /T1 coupling at the S1 minimum and 84.2 cm−1 for the S2 /T2 coupling at the S2 minimum. In the third one, the S2 (1 ππ ∗ ) system first decays to the S1 (1 nπ ∗ ) state, which is followed by an S1 → T2 intersystem crossing to the T2 state with the nπ ∗ character. In the end, the T1 state is reached as a result of the T2 → T1 internal conversion. In comparison with the former two pathways, in the latter one, the relevant intersystem crossing process from S1 to T2 should be not efficient due to the EI-Sayed rule. 53,54 Thereby, in our MS-CASPT2 calculations, we can see a 16 ACS Paragon Plus Environment

Page 16 of 29

Page 17 of 29

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

Figure 12: MS-CASPT2 computed linearly interpolated internal coordinate paths (in kcal/mol) of the d5SICS:dNaM base pair in DNA connecting (A) the S2 (1 ππ ∗ ) and S1 (1 nπ ∗ ) minima, (B) the S1 (1 ππ ∗ ) and T1 (3 ππ ∗ ) minima, (C) the S2 (1 ππ ∗ ) and T2 (3 nπ ∗ ) minima, and (D) the T2 (3 nπ ∗ ) and T1 (3 ππ ∗ ) minima.


very small S1/T2 coupling at the S1 minimum (6.8 cm⁻¹ at the MS-CASPT2 level; see Table S5). Although the isolated d5SICS chromophore and the d5SICS:dNaM base pair share similar photophysical pathways, some notable differences remain (see below).
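As an illustration of what a linearly interpolated internal coordinate (LIIC) path is, the following minimal sketch generates intermediate geometries between two minima by linear interpolation of internal coordinates. It is not the procedure used in this work; the function names (`wrap_angle`, `liic_path`) and the three example coordinates are hypothetical, and the energy evaluation at each point (MS-CASPT2 in the present study) is left to an external electronic structure program.

```python
def wrap_angle(delta_deg):
    """Map an angular difference onto the interval [-180, 180) degrees."""
    return (delta_deg + 180.0) % 360.0 - 180.0


def liic_path(q_start, q_end, is_dihedral, n_points=11):
    """Return n_points geometries interpolated linearly between two structures.

    q_start, q_end: internal coordinates (bond lengths in angstrom, angles in degrees).
    is_dihedral:    one flag per coordinate; dihedrals follow the shortest arc.
    """
    if n_points < 2:
        raise ValueError("an LIIC path needs at least the two end points")
    path = []
    for k in range(n_points):
        t = k / (n_points - 1)  # interpolation parameter running from 0 to 1
        point = []
        for a, b, dihedral in zip(q_start, q_end, is_dihedral):
            delta = wrap_angle(b - a) if dihedral else (b - a)
            point.append(a + t * delta)
        path.append(point)
    return path


# Hypothetical coordinates (one bond, one angle, one dihedral), NOT taken from the paper.
q_s2_min = [1.40, 120.0, 170.0]
q_s1_min = [1.45, 118.0, -175.0]
path = liic_path(q_s2_min, q_s1_min, [False, False, True], n_points=6)
for point in path:
    print(["%.3f" % value for value in point])
```

The dihedral is interpolated through 180° rather than through 0°, which avoids an artificial large-amplitude rotation when the two end points lie on opposite sides of the ±180° boundary.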

Figure 13: Excited-state relaxation pathways of (left) the isolated d5SICS chromophore and the d5SICS:dNaM base pair in DNA, and (right) the isolated dNaM chromophore, as derived from the present static electronic structure calculations. See text for discussion.

Correlation with Previous Works

Spectroscopically, the d5SICS chromophore has absorption maxima at 365 and 371 nm [78.3 and 77.1 kcal/mol] in aqueous and acetonitrile solution, respectively, which are close to our MS-CASPT2 computed S0 → S2 vertical excitation energy of 76.5 kcal/mol at the Franck-Condon point in the gas phase. In comparison, previous MS-CASPT2 and TD-DFT/PCM calculations give higher S0 → S2 vertical excitation energies of about 97.5 and 84.4 kcal/mol, respectively. 26,27 The absorption maximum of the dNaM chromophore is blue-shifted compared with that of the d5SICS one


and is measured to be 325 nm [88.0 kcal/mol] in aqueous and acetonitrile solution. On the basis of our results, this absorption peak is assigned to the S0 → S1 vertical excitation at the Franck-Condon point, whose energy is estimated to be 90.8 kcal/mol at the MS-CASPT2 level in the gas phase. The corresponding S0 → S1 vertical excitation energy is computed to be 98.9 and 97.5 kcal/mol in previous MS-CASPT2 and TD-DFT calculations, respectively. 26,27

It is helpful to discuss the initially populated excited singlet state because different starting singlet states generally lead to distinct excited-state decay dynamics. One of the major objectives of recent experimental studies on the d5SICS and dNaM chromophores is to explore their photostability upon irradiation with near-visible light, so 390 and 325 nm [73.3 and 88.0 kcal/mol] excitations are used for the d5SICS and dNaM chromophores in experiments, respectively. 25 In this situation, the S2 state of d5SICS and the S1 state of dNaM are populated with the largest transition probability. Experimentally, it has been observed that the d5SICS and dNaM chromophores have high quantum yields for the formation of the triplet state and singlet oxygen species. 25 This implies that efficient nonadiabatic relaxation pathways are available for populating the triplet state from the initially populated excited singlet state. Because d5SICS and dNaM have different initially populated excited singlet states, i.e., S2 for the former vs. S1 for the latter, their excited-state relaxation dynamics should differ from each other.

On the basis of our computational results, the main relaxation pathways of the d5SICS chromophore that populate the lowest triplet state are shown in panel a of Fig. 13. In the Franck-Condon region, the S2 state is a spectroscopically bright ¹ππ* singlet state. Once the S2 system relaxes to its nearby S2 (¹ππ*) minimum, there exist two efficient nonadiabatic decay channels.
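The wavelength-to-energy conversions quoted in this section follow from E = hcN_A/λ. The short sketch below reproduces the bracketed values from the wavelengths given in the text; the function name `nm_to_kcal_per_mol` is our own, and the physical constants are the exact SI defining values together with the thermochemical calorie (1 kcal = 4184 J).

```python
PLANCK_J_S = 6.62607015e-34       # Planck constant, J s
LIGHT_SPEED_M_S = 299792458.0     # speed of light, m/s
AVOGADRO_PER_MOL = 6.02214076e23  # Avogadro constant, 1/mol
JOULE_PER_KCAL = 4184.0           # thermochemical kilocalorie


def nm_to_kcal_per_mol(wavelength_nm):
    """Convert a photon wavelength in nm to a molar energy in kcal/mol."""
    wavelength_m = wavelength_nm * 1.0e-9
    energy_j_per_mol = PLANCK_J_S * LIGHT_SPEED_M_S * AVOGADRO_PER_MOL / wavelength_m
    return energy_j_per_mol / JOULE_PER_KCAL


# Wavelengths quoted in the text for d5SICS (365, 371, 390 nm) and dNaM (325 nm).
for wavelength in (365.0, 371.0, 390.0, 325.0):
    print("%.0f nm = %.1f kcal/mol" % (wavelength, nm_to_kcal_per_mol(wavelength)))
```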
In the first pathway, the S2 system decays to the S1 (¹nπ*) state via an efficient internal conversion process in the Franck-Condon region, where S2 and S1 are separated by a very small energy gap, ca. 1.2 kcal/mol at the MS-CASPT2 level (see panel a of Fig. 5). The S1 system then reaches the T1 (³ππ*) state through an efficient S1 → T1 intersystem crossing process. This process is significantly enhanced by the large S1/T1 spin-orbit coupling (90.1 cm⁻¹ at the MS-CASPT2 level; see Table S5). In the second pathway, the initially populated S2 (¹ππ*) state first hops to the T2 (³nπ*) state via the S2 → T2 intersystem crossing process in the


Franck-Condon region (see panel c of Fig. 5). This radiationless transition is assisted by the large S2/T2 spin-orbit coupling, 83.0 cm⁻¹ at the MS-CASPT2 level at the S2 minimum (see Table S5). In the T2 (³nπ*) state, there is a large probability for the system to decay to the T1 state through internal conversion because of the small energy gap (5.1 kcal/mol; see panel d of Fig. 5). A previous theoretical study suggested that the S1 state should first hop to the T2 state, followed by the T2 → T1 internal conversion. This pathway is also available in our calculations. However, even though S1 (¹nπ*) and T2 (³nπ*) are energetically close to each other in panel b of Fig. 5, the corresponding S1/T2 spin-orbit coupling at the present MS-CASPT2//CASSCF computational level is very small because these two electronic states share similar electronic structure character, in accordance with the classical El-Sayed rule. 53,54

In contrast to the d5SICS chromophore, the S1 (¹ππ*) state of dNaM is first populated in the Franck-Condon region upon 325 nm excitation. In the S1 (¹ππ*) state, the system can decay to the T2 (³ππ*) state via the S1 → T2 intersystem crossing process. This process benefits from the small energy gap between S1 and T2 over an extended region, as shown in panel a of Fig. 8. However, it should not be as efficient as the intersystem crossings in d5SICS because of the very small S1/T2 spin-orbit coupling at the S1 minimum (see Table S6). This could explain why the quantum yield for the formation of triplet states in dNaM is smaller than that in d5SICS, ca. 0.28 vs. ca. 0.85 in aqueous solution. 25 In the T2 state, the system will survive for a relatively long time because there is a very large energy gap between T2 and T1 along the LIIC path connecting the T2 and T1 minima (see panel b of Fig. 8). Finally, the T1 state can be reached by subsequent internal conversion within the triplet manifold.
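The qualitative El-Sayed arguments used above can be summarized in a few lines of code. The sketch below only encodes the selection rule (intersystem crossing is favored when the orbital character changes between nπ* and ππ*); it does not compute spin-orbit couplings, and the function name `el_sayed_allowed` and the dictionary layout are illustrative assumptions. The state characters are those assigned in the text for the gas-phase chromophores.

```python
VALID_CHARACTERS = {"npi*", "pipi*"}


def el_sayed_allowed(singlet_character, triplet_character):
    """Return True if the singlet -> triplet crossing changes orbital character."""
    for character in (singlet_character, triplet_character):
        if character not in VALID_CHARACTERS:
            raise ValueError("unknown state character: %r" % character)
    return singlet_character != triplet_character


# (singlet character, triplet character) as assigned in the text.
GAS_PHASE_ISC = {
    "d5SICS S1 -> T1": ("npi*", "pipi*"),
    "d5SICS S2 -> T2": ("pipi*", "npi*"),
    "d5SICS S1 -> T2": ("npi*", "npi*"),
    "dNaM S1 -> T2": ("pipi*", "pipi*"),
}

for label, (singlet, triplet) in GAS_PHASE_ISC.items():
    verdict = "allowed" if el_sayed_allowed(singlet, triplet) else "forbidden"
    print("%-16s El-Sayed %s" % (label, verdict))
```

The two crossings flagged as allowed are the ones for which large spin-orbit couplings (90.1 and 83.0 cm⁻¹) are reported above, whereas the two forbidden ones correspond to the very small S1/T2 couplings in d5SICS and dNaM.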
Our present results are also consistent with recent experiments in which high quantum yields for the formation of triplet states as well as for the formation of singlet oxygen species are observed. 25 In previous computations, the S1 state of dNaM was suggested to first hop to the T3 state because the T3 state was predicted to be lower than the S1 state at the Franck-Condon point (S1 at 97.5 vs. T3 at 92.7 kcal/mol at the TD-PBE0 level). 27 In our present MS-CASPT2//CASSCF calculations, the T3 state is found to be higher in energy than the S1 state both at the Franck-Condon point and at their minimum-energy structures (see Tables S2 and S4). In fact, previous EOM-CCSD calculations also predict the T3 state


a little higher than the S1 state (S1 at 4.42 vs. T3 at 4.51 eV in Table S3 of the original work). 27 One should note that although different methods give different state orderings, the energy gap between S1 and T3 remains small. Importantly, both the TD-DFT and MS-CASPT2 methods predict that these two states are of ππ* character, so the spin-orbit coupling between them is small according to the classical El-Sayed rule. 53,54

The most interesting finding in this work is that the d5SICS:dNaM base pair has spectroscopic properties similar to those of the isolated d5SICS chromophore. The geometric and electronic structure changes mainly take place within the d5SICS chromophore; in contrast, there are negligible structural changes in the dNaM chromophore in the different excited states (see panel b of Fig. 11). This can be easily understood by taking into account that the base pairing is maintained only by the hydrophobic interaction, unlike the situation in natural base pairs such as AT and CG, in which the base pairing is enhanced by hydrogen-bonding interactions. Moreover, one can find that the DNA surroundings have only a small influence on the spectroscopic properties of d5SICS. For example, the vertical excitation energies to the S2 (¹ππ*) and T1 (³ππ*) states are slightly increased compared with those in the gas phase.

Regarding the excited-state relaxation mechanism, the d5SICS:dNaM base pair in DNA shares decay pathways analogous to those of the isolated d5SICS chromophore without the base-pairing dNaM chromophore (see panel a of Fig. 13). In the first pathway, the S2 system first decays to the S1 state. This internal conversion process becomes a little slower in DNA than in the gas phase because of the larger energy gap between S2 and S1 (ca. 5 kcal/mol in DNA; see panel a of Fig. 12). However, the subsequent S1 → T1 intersystem crossing process is accelerated in DNA because of the smaller energy gap between S1 and T1 (3.2 kcal/mol in DNA vs.
7.7 kcal/mol in the gas phase). One should note that the S1/T1 spin-orbit coupling changes only slightly from the gas phase to the DNA surroundings (90.1 vs. 92.1 cm⁻¹; see Table S5). In the second pathway, the first-step S2 → T2 intersystem crossing process becomes slower in DNA than in the gas phase because of the increased S2-T2 energy gap (4.6 kcal/mol in DNA vs. 1.8 kcal/mol in the gas phase). This slowdown is counterbalanced by the subsequent internal conversion from the T2 to the T1 state, which benefits from the smaller energy gap in DNA. Similarly, in the third pathway, the first step, i.e., the S2 → S1 internal conversion, becomes slower in DNA,


which is offset by the accelerated second step, i.e., the S1 → T2 intersystem crossing process. However, whether in DNA or in the gas phase, the S1/T2 spin-orbit coupling is very small because both states share the same electronic structure character, i.e., they arise from the same n → π* electronic excitation but with different spin orientations. This is consistent with the classical El-Sayed rule. 53,54
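The gas-phase versus DNA comparison made above rests on a simple energy-gap argument: a larger gap between the two states is taken to slow a radiationless step, and a smaller gap to accelerate it. The sketch below tabulates the gaps quoted in the text and applies that heuristic. It is a bookkeeping aid only, not a rate calculation; the names `STEPS` and `gap_trend` are hypothetical, and the DNA value of 5.0 kcal/mol for S2 → S1 stands in for the "ca. 5 kcal/mol" given in the text.

```python
# (step, gap in gas phase, gap in DNA), all in kcal/mol, as quoted in the text.
STEPS = [
    ("S2 -> S1 internal conversion", 1.2, 5.0),   # DNA value is "ca. 5" in the text
    ("S1 -> T1 intersystem crossing", 7.7, 3.2),
    ("S2 -> T2 intersystem crossing", 1.8, 4.6),
]


def gap_trend(gap_gas, gap_dna):
    """Qualitative energy-gap heuristic: a larger gap implies a slower step."""
    if gap_dna > gap_gas:
        return "slower in DNA"
    if gap_dna < gap_gas:
        return "faster in DNA"
    return "unchanged"


trends = {}
for step, gap_gas, gap_dna in STEPS:
    trends[step] = gap_trend(gap_gas, gap_dna)
    print("%-30s %4.1f -> %4.1f kcal/mol  %s" % (step, gap_gas, gap_dna, trends[step]))
```

Because the S1/T1 and S2/T2 spin-orbit couplings are nearly unchanged between the two environments (90.1 vs. 92.1 cm⁻¹ and 83.0 vs. 84.2 cm⁻¹), the energy gaps are the main quantity that distinguishes the two cases in this comparison.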

Conclusions

In this work we have employed the MS-CASPT2//CASSCF and QM(MS-CASPT2//CASSCF)/MM methods to systematically explore the spectroscopic properties, geometric and electronic structures, and excited-state relaxation pathways of the isolated d5SICS and dNaM chromophores in the gas phase, and of the d5SICS:dNaM base pair in DNA. 25 On the basis of our present results, we have found that upon near-visible light excitation, the S2 state of d5SICS, the S1 state of dNaM, and the S2 state of d5SICS:dNaM are first populated. Mechanistically, there are three comparable radiationless relaxation pathways that populate the lowest triplet state of the isolated d5SICS chromophore and of the d5SICS:dNaM base pair in DNA; for dNaM, however, only one main nonradiative relaxation pathway is found to take the S1 system into the triplet manifold. Moreover, we have found that the excited-state relaxation mechanism of d5SICS:dNaM in DNA is similar to that of the isolated d5SICS chromophore in the gas phase. In DNA, the dNaM chromophore acts as a spectator and does not contribute visibly to the spectroscopic properties, geometric and electronic structures, and excited-state relaxation pathways of d5SICS:dNaM. These new mechanistic insights help in understanding the photophysics and photochemistry of semisynthetic base pairs and in designing better ones with high photostability.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (21522302, 21520102005, and 21421003); G.C. is also grateful for financial support from the "Fundamental Research Funds for Central Universities".


Author Information

Corresponding Author
Tel: +86-010-58806770 (G.C.)

Supporting Information

Active orbitals in CASSCF and MS-CASPT2 computations, additional figures and tables, and Cartesian coordinates of all optimized structures.

References

(1) Watson, J. D.; Crick, F. H. C. Molecular Structure of Nucleic Acids: A Structure for Deoxyribose Nucleic Acid. Nature 1953, 171, 737–738.
(2) Leconte, A. M.; Hwang, G. T.; Matsuda, S.; Capek, P.; Hari, Y.; Romesberg, F. E. Discovery, Characterization, and Optimization of an Unnatural Base Pair for Expansion of the Genetic Alphabet. J. Am. Chem. Soc. 2008, 130, 2336–2343.
(3) Malyshev, D. A.; Seo, Y. J.; Ordoukhanian, P.; Romesberg, F. E. PCR with an Expanded Genetic Alphabet. J. Am. Chem. Soc. 2009, 131, 14620–14621.
(4) Seo, Y. J.; Romesberg, F. E. Major Groove Derivatization of an Unnatural Base Pair. ChemBioChem 2009, 10, 2394–2400.
(5) Seo, Y. J.; Hwang, G. T.; Ordoukhanian, P.; Romesberg, F. E. Optimization of an Unnatural Base Pair toward Natural-Like Replication. J. Am. Chem. Soc. 2009, 131, 3246–3252.
(6) Seo, Y. J.; Matsuda, S.; Romesberg, F. E. Transcription of an Expanded Genetic Alphabet. J. Am. Chem. Soc. 2009, 131, 5046–5047.


(7) Malyshev, D. A.; Pfaff, D. A.; Ippoliti, S. I.; Hwang, G. T.; Dwyer, T. J.; Romesberg, F. E. Solution Structure, Mechanism of Replication, and Optimization of an Unnatural Base Pair. Chem. Eur. J. 2010, 16, 12650–12659.
(8) Seo, Y. J.; Malyshev, D. A.; Lavergne, T.; Ordoukhanian, P.; Romesberg, F. E. Site-Specific Labeling of DNA and RNA Using an Efficiently Replicated and Transcribed Class of Unnatural Base Pairs. J. Am. Chem. Soc. 2011, 133, 19878–19888.
(9) Lavergne, T.; Malyshev, D. A.; Romesberg, F. E. Major Groove Substituents and Polymerase Recognition of a Class of Predominantly Hydrophobic Unnatural Base Pairs. Chem. Eur. J. 2012, 18, 1231–1239.
(10) Malyshev, D. A.; Dhami, K.; Quach, H. T.; Lavergne, T.; Ordoukhanian, P.; Torkamani, A.; Romesberg, F. E. Efficient and Sequence-Independent Replication of DNA Containing a Third Base Pair Establishes a Functional Six-Letter Genetic Alphabet. Proc. Natl. Acad. Sci. USA 2012, 109, 12005–12010.
(11) Craig, J. M.; Laszlo, A. H.; Derrington, I. M.; Ross, B. C.; Brinkerhoff, H.; Nova, I. C.; Doering, K.; Tickman, B. I.; Svet, M. T.; Gundlach, J. H. Direct Detection of Unnatural DNA Nucleotides dNaM and d5SICS Using the MspA Nanopore. PLOS ONE 2015, 10, e0143253.
(12) Riedl, J.; Ding, Y.; Fleming, A. M.; Burrows, C. J. Identification of DNA Lesions Using a Third Base Pair for Amplification and Nanopore Sequencing. Nat. Commun. 2015, 6, 8807.
(13) Dhami, K.; Malyshev, D. A.; Ordoukhanian, P.; Kubelka, T.; Hocek, M.; Romesberg, F. E. Systematic Exploration of a Class of Hydrophobic Unnatural Base Pairs Yields Multiple New Candidates for the Expansion of the Genetic Alphabet. Nucleic Acids Res. 2014, 42, 10235–10244.
(14) Malyshev, D. A.; Dhami, K.; Lavergne, T.; Chen, T. J.; Dai, N.; Foster, J. M.; Corrêa, I. R.; Romesberg, F. E. A Semi-Synthetic Organism with an Expanded Genetic Alphabet. Nature 2014, 509, 385–388.


(15) Malyshev, D. A.; Romesberg, F. E. The Expanded Genetic Alphabet. Angew. Chem. Int. Ed. 2015, 54, 11930–11944.
(16) Betz, K.; Malyshev, D. A.; Lavergne, T.; Welte, W.; Diederichs, K.; Romesberg, F. E.; Marx, A. Structural Insights into DNA Replication without Hydrogen Bonds. J. Am. Chem. Soc. 2013, 135, 18637–18643.
(17) Kimoto, M.; Yamashige, R.; Matsunaga, K.-i.; Yokoyama, S.; Hirao, I. Generation of High-Affinity DNA Aptamers Using an Expanded Genetic Alphabet. Nat. Biotechnol. 2013, 31, 453–457.
(18) Lavergne, T.; Degardin, M.; Malyshev, D. A.; Quach, H. T.; Dhami, K.; Ordoukhanian, P.; Romesberg, F. E. Expanding the Scope of Replicable Unnatural DNA: Stepwise Optimization of a Predominantly Hydrophobic Base Pair. J. Am. Chem. Soc. 2013, 135, 5408–5419.
(19) Li, Z. T.; Lavergne, T.; Malyshev, D. A.; Zimmermann, J.; Adhikary, R.; Dhami, K.; Ordoukhanian, P.; Sun, Z.; Xiang, J.; Romesberg, F. E. Site-Specifically Arraying Small Molecules or Proteins on DNA Using An Expanded Genetic Alphabet. Chem. Eur. J. 2013, 19, 14205–14209.
(20) Zhang, Y. K.; Lamb, B. M.; Feldman, A. W.; Zhou, A. X.; Lavergne, T.; Li, L. J.; Romesberg, F. E. A Semisynthetic Organism Engineered for the Stable Expansion of the Genetic Alphabet. Proc. Natl. Acad. Sci. USA 2017, 114, 1317–1322.
(21) Middleton, C. T.; de La Harpe, K.; Su, C.; Law, Y. K.; Crespo-Hernández, C. E.; Kohler, B. DNA Excited-State Dynamics: From Single Bases to the Double Helix. Annu. Rev. Phys. Chem. 2009, 60, 217–239.
(22) Pollum, M.; Martínez-Fernández, L.; Crespo-Hernández, C. E. Photochemistry of Nucleic Acid Bases and Their Thio- and Aza-Analogues in Solution; Top. Curr. Chem.; Springer International Publishing, 2014; Vol. 355; pp 245–327.
(23) Schreier, W. J.; Gilch, P.; Zinth, W. Early Events of DNA Photodamage. Annu. Rev. Phys. Chem. 2015, 66, 497–519.


(24) Improta, R.; Santoro, F.; Blancafort, L. Quantum Mechanical Studies on the Photophysics and the Photochemistry of Nucleic Acids and Nucleobases. Chem. Rev. 2016, 116, 3540–3593.
(25) Pollum, M.; Ashwood, B.; Jockusch, S.; Lam, M.; Crespo-Hernández, C. E. Unintended Consequences of Expanding the Genetic Alphabet. J. Am. Chem. Soc. 2016, 138, 11457–11460.
(26) Ashwood, B.; Pollum, M.; Crespo-Hernández, C. E. Can a Six-Letter Alphabet Increase the Likelihood of Photochemical Assault to the Genetic Code? Chem. Eur. J. 2016, 22, 16648–16656.
(27) Bhattacharyya, K.; Datta, A. Visible Light Mediated Excited State Relaxation in Semi-Synthetic Genetic Alphabet: d5SICS and dNaM. Chem. Eur. J. 2017, 23, 1–6.
(28) Case, D. A.; Berryman, J. T.; Betz, R. M.; Cerutti, D. S.; Cheatham, T. E.; Darden, T. A.; Duke, R. E.; Giese, T. J.; Gohlke, H.; Goetz, A. W. et al. AMBER 2015. University of California, San Francisco, 2015.
(29) Wang, J. M.; Wolf, R. M.; Caldwell, J. W.; Kollman, P. A.; Case, D. A. Development and Testing of a General Amber Force Field. J. Comput. Chem. 2004, 25, 1157–1174.
(30) Cheatham, T. E.; Cieplak, P.; Kollman, P. A Modified Version of the Cornell et al. Force Field with Improved Sugar Pucker Phases and Helical Repeat. J. Biomol. Struct. Dyn. 1999, 16, 845–862.
(31) Jorgensen, W. L.; Chandrasekhar, J.; Madura, J. D.; Impey, R. W.; Klein, M. L. Comparison of Simple Potential Functions for Simulating Liquid Water. J. Chem. Phys. 1983, 79, 926–935.
(32) Andersen, H. C. Molecular Dynamics Simulations at Constant Pressure and/or Temperature. J. Chem. Phys. 1980, 72, 2384–2393.
(33) Berendsen, H. J. C.; Postma, J. P. M.; van Gunsteren, W. F.; DiNola, A.; Haak, J. R. Molecular Dynamics with Coupling to an External Bath. J. Chem. Phys. 1984, 81, 3684–3690.
(34) Warshel, A.; Levitt, M. Theoretical Studies of Enzymic Reactions - Dielectric, Electrostatic and Steric Stabilization of Carbonium-Ion in Reaction of Lysozyme. J. Mol. Biol. 1976, 103, 227–249.

(35) Field, M. J.; Bash, P. A.; Karplus, M. A Combined Quantum Mechanical and Molecular Mechanical Potential for Molecular Dynamics Simulations. J. Comput. Chem. 1990, 11, 700–733.
(36) Hu, H.; Yang, W. Free Energies of Chemical Reactions in Solution and in Enzymes with Ab Initio Quantum Mechanics/Molecular Mechanics Methods. Annu. Rev. Phys. Chem. 2008, 59, 573–601.
(37) Senn, H. M.; Thiel, W. QM/MM Methods for Biomolecular Systems. Angew. Chem. Int. Ed. 2009, 48, 1198–1229.
(38) Andersson, K.; Malmqvist, P.-Å.; Roos, B. O.; Sadlej, A. J.; Wolinski, K. Second-Order Perturbation Theory with a CASSCF Reference Function. J. Phys. Chem. 1990, 94, 5483–5488.
(39) Andersson, K.; Malmqvist, P.-Å.; Roos, B. O. Second-Order Perturbation Theory with a Complete Active Space Self-Consistent Field Reference Function. J. Chem. Phys. 1992, 96, 1218–1226.
(40) Bakowies, D.; Thiel, W. Hybrid Models for Combined Quantum Mechanical and Molecular Mechanical Approaches. J. Phys. Chem. B 1996, 100, 10580–10594.
(41) Aquilante, F.; Lindh, R.; Pedersen, T. B. Unbiased Auxiliary Basis Sets for Accurate Two-Electron Integral Approximations. J. Chem. Phys. 2007, 127, 114107.
(42) Forsberg, N.; Malmqvist, P.-Å. Multiconfiguration Perturbation Theory with Imaginary Level Shift. Chem. Phys. Lett. 1997, 274, 196–204.
(43) Ghigo, G.; Roos, B. O.; Malmqvist, P.-Å. A Modified Definition of the Zeroth-Order Hamiltonian in Multiconfigurational Perturbation Theory (CASPT2). Chem. Phys. Lett. 2004, 396, 142–149.
(44) Heß, B. A.; Marian, C. M.; Wahlgren, U.; Gropen, O. A Mean-Field Spin-Orbit Method Applicable to Correlated Wavefunctions. Chem. Phys. Lett. 1996, 251, 365–371.
(45) Marian, C. M.; Wahlgren, U. A New Mean-Field and ECP-based Spin-Orbit Method. Applications to Pt and PtH. Chem. Phys. Lett. 1996, 251, 357–364.


(46) Ditchfield, R.; Hehre, W. J.; Pople, J. A. Self-Consistent Molecular-Orbital Methods. IX. An Extended Gaussian-Type Basis for Molecular-Orbital Studies of Organic Molecules. J. Chem. Phys. 1971, 54, 724–728.
(47) Francl, M. M.; Pietro, W. J.; Hehre, W. J.; Binkley, J. S.; Gordon, M. S.; DeFrees, D. J.; Pople, J. A. Self-Consistent Molecular Orbital Methods. XXIII. A Polarization-Type Basis Set for Second-Row Elements. J. Chem. Phys. 1982, 77, 3654–3665.
(48) Karlström, G.; Lindh, R.; Malmqvist, P.-Å.; Roos, B. O.; Ryde, U.; Veryazov, V.; Widmark, P.-O.; Cossi, M.; Schimmelpfennig, B.; Neogrady, P. et al. MOLCAS: A Program Package for Computational Chemistry. Comput. Mater. Sci. 2003, 28, 222–239.
(49) Aquilante, F.; De Vico, L.; Ferré, N.; Ghigo, G.; Malmqvist, P.-Å.; Neogrády, P.; Pedersen, T. B.; Pitoňák, M.; Reiher, M.; Roos, B. O. et al. MOLCAS 7: The Next Generation. J. Comput. Chem. 2010, 31, 224–247.
(50) Chang, X.-P.; Xie, X.-Y.; Lin, S.-Y.; Cui, G. L. QM/MM Study on Mechanistic Photophysics of Alloxazine Chromophore in Aqueous Solution. J. Phys. Chem. A 2016, 120, 6129–6136.
(51) Xie, B.-B.; Wang, Q.; Guo, W.-W.; Cui, G. L. Excited-State Decay Mechanism of 2,4-Dithiothymine in Gas Phase, Microsolvated Surrounding, and Aqueous Solution. Phys. Chem. Chem. Phys. 2017, 19, 7689–7698.
(52) Ponder, J. W.; Richards, F. M. An Efficient Newton-Like Method for Molecular Mechanics Energy Minimization of Large Molecules. J. Comput. Chem. 1987, 8, 1016–1024.
(53) El-Sayed, M. A. Spin-Orbit Coupling and the Radiationless Processes in Nitrogen Heterocyclics. J. Chem. Phys. 1963, 38, 2834–2838.
(54) El-Sayed, M. A. Triplet State. Its Radiative and Nonradiative Properties. Acc. Chem. Res. 1968, 1, 8–16.
