
Mapping the Plasticity of the E. coli Genetic Code with Orthogonal Pair Directed Sense Codon Reassignment Margaret A Schmitt, Wil Biddle, and John Domenic Fisk Biochemistry, Just Accepted Manuscript • DOI: 10.1021/acs.biochem.8b00177 • Publication Date (Web): 18 Apr 2018 Downloaded from http://pubs.acs.org on April 18, 2018


Mapping the Plasticity of the E. coli Genetic Code with Orthogonal Pair Directed Sense Codon Reassignment

Margaret A. Schmitt1, Wil Biddle2 and John D. Fisk1,2,3,*

1 Department of Chemical and Biological Engineering, Colorado State University, Fort Collins, Colorado, 80523, United States

2 Department of Chemistry, Colorado State University, Fort Collins, Colorado, 80523, United States

3 School of Biomedical Engineering, Colorado State University, Fort Collins, Colorado, 80523, United States

* To whom correspondence should be addressed. Tel: 1-970-491-4115 (office); Fax: 1-970-4917369; Email: [email protected]


ABSTRACT. The relative quantitative importance of the factors that determine the fidelity of translation is largely unknown, which makes predicting the extent to which the degeneracy of the genetic code can be broken challenging. Our strategy of using orthogonal tRNA/aminoacyl tRNA synthetase pairs to precisely direct the incorporation of a single amino acid in response to individual sense and nonsense codons provides a suite of related data with which to examine the plasticity of the code. Each directed sense codon reassignment measurement is an in vivo competition experiment between the introduced orthogonal translation machinery and the natural machinery in E. coli. This report discusses 20 new, related genetic codes, in which a targeted E. coli wobble codon is reassigned to tyrosine utilizing the orthogonal tyrosine tRNA/aminoacyl tRNA synthetase pair from Methanocaldococcus jannaschii. One at a time, reassignment of each targeted sense codon to tyrosine is quantified in cells by measuring the fluorescence of GFP variants in which the essential tyrosine residue is encoded by a non-tyrosine codon. Significantly, every wobble codon analyzed may be partially reassigned with efficiencies ranging from 0.8% to 41%. The accumulation of the suite of data enables a qualitative dissection of the relative importance of the factors affecting the fidelity of translation. While some correlation was observed between sense codon reassignment and either competing endogenous tRNA abundance or changes in aminoacylation efficiency of the altered orthogonal system, no single factor appears to predominately drive translational fidelity. Evaluation of relative cellular fitness in each of the 20 quantitatively-characterized proteome-wide tyrosine substitution systems suggests that at a systems level, E. coli is robust to missense mutations.


The expansion of the genetic code at sense codons is hampered by an incomplete understanding of the relative quantitative importance of the factors that affect the fidelity of translation.1, 2 We describe a new application of orthogonal translation machinery to dissect the functioning of the Escherichia coli (E. coli) translational apparatus with an eye towards future applications in genetic code expansion. We have repurposed the Methanocaldococcus jannaschii (M. jannaschii) tyrosine tRNA/aminoacyl tRNA synthetase (aaRS) pair, variants of which are widely employed to insert non-canonical amino acids into proteins in response to stop codons, to insert tyrosine in response to 20 of 21 sense codons read via wobble interactions in E. coli.3, 4 This report focuses on achieving a deeper understanding of the factors influencing the fidelity of translation using an orthogonal pair directed sense codon reassignment approach. This approach facilitates experimental exploration of the cellular fitness penalties of precise alterations of the E. coli genetic code and enables dissection of the contributions of tRNA aminoacylation, tRNA concentrations, and codon-anticodon interaction energies to the fidelity of translation. The extent of sense codon reassignment is quantified by measuring the restoration of fluorescence in green fluorescent protein (GFP) reporter variants in which the essential tyrosine residue is encoded by a non-tyrosine codon (Figure 1).5 The screen integrates the effects of orthogonal M. jannaschii tyrosyl aaRS recognition and aminoacylation of the M. jannaschii tRNA species with an alternative anticodon and the effects of competition between the altered orthogonal tRNA and E. coli tRNA species to decode the codon specifying the essential tyrosine position of GFP. Genetic code expansion is constrained by the fact that all 64 codons have an assigned function. However, the genetic code is degenerate in that 61 sense codons are decoded in E. 
coli by tRNAs with 40 anticodon sequences and used to specify just 20 canonical amino acids. The interactions between tRNA and aaRS molecules that drive the fidelity of the genetic code have only partially
been mapped, and the space between these interactions, the extent to which additional orthogonal tRNA/aaRS pairs can be added, is largely unknown. While the amino acid specified by each triplet of nucleotides has not changed during evolution, genome compositions, codon usage frequencies, and the complements of adapter tRNA molecules used to translate the code have all diverged considerably.6, 7 The fact that codon usage varies widely across organisms and that different species employ different sets of tRNAs to decode their genomes implies the existence of a fair degree of plasticity in the machinery that specifies the genetic code.8-10

Figure 1. Principle of the fluorescence-based screen for sense codon reassignment. The screen monitors the ability of an introduced orthogonal tRNA to incorporate tyrosine in response to a sense codon typically assigned another meaning in the genetic code. Residues 65-67 of superfolder GFP specify the Thr-Tyr-Gly sequence that autocatalytically folds into the tripeptide fluorophore. Incorporation of tyrosine in response to a non-tyrosine sense codon included at position 66 of a superfolder GFP protein variant leads to restoration of the fluorophore. Replacement of Tyr at position 66 with any other natural amino acid effectively abolishes the fluorescence of the protein.

Reassigning the meaning of sense codons by breaking the degeneracy of the genetic code has the potential to expand the genetic code to 22 (or more) amino acids, greatly increasing the encodable properties of proteins. A handful of reports discuss incorporation of selected noncanonical amino acids in response to individual sense codons. The original demonstration of breaking the degeneracy of the genetic code reassigned the Phe UUU codon, leaving the UUC
codon available for incorporation of phenylalanine.11 The UUU and UUC codons are decoded by a single tRNA species with a GAA anticodon sequence in E. coli. Codons read via wobble interactions, e.g. Phe UUU, are targeted for reassignment based on the idea that differences in codon-anticodon binding energies between introduced orthogonal tRNAs capable of Watson-Crick base pairing and endogenous tRNAs that utilize wobble base pairing can be harnessed to bias the incorporation of amino acids.12 Efficient incorporation of noncanonical amino acids (ncAAs) in response to the rarely used Ser AGU, Leu UUG, Ile AUA, Arg AGA, and Arg AGG codons has been described.5, 13-19 Rare codons are often read by low abundance tRNA species that are expected to be easier to outcompete.20, 21 Related experiments have investigated low level incorporation of reactive ncAAs to selectively target newly synthesized proteins for visualization or isolation and characterization.18, 22-24

A major barrier to broadly expanding the genetic code is understanding and appropriately manipulating the systems level interactions between the introduced orthogonal species and the endogenous translation machinery that control the fidelity of the genetic code, e.g. relative tRNA concentrations, aminoacylation efficiency, codon-anticodon interaction energies, and tRNA modifications. The 20 codons discussed in this evaluation are related in that the anticodons of the introduced orthogonal tRNAs are expected to have improved energetic interactions with the targeted codons compared to the competing endogenous tRNAs. The anticodons of the orthogonal tRNAs are engineered to Watson-Crick base pair with the targeted codons; competing tRNAs decode the targeted codon through less energetically favorable wobble interactions. Eighteen of the 20 targeted codons are decoded by E. coli tRNAs via either a G34/U3 or U34/G3 wobble; the CGC and CGA Arg codons are decoded by the E. coli Arg2 tRNA with an inosine at position 34 (I34/C3 or I34/A3). N34 refers to the identity of the nucleotide in the first
anticodon position of the tRNA (for tRNA numbering, refer to Figure S1). N3 refers to the identity of the nucleotide in the third position of the mRNA codon targeted for reassignment, also called the wobble position. The efficiency of aminoacylation of the orthogonal tRNA as a function of changing the anticodon is expected to vary across the 20 codons evaluated. Likewise, the concentration of endogenous tRNAs competing to decode the targeted codon varies. The reassignment systems targeting each sense codon differ only in the anticodon of the orthogonal tRNA and the codon specifying the fluorophore tyrosine position in the reporter. These differences are expected to have a minimal effect on the amount of orthogonal tRNA produced and on the amount of GFP mRNA present. The set of orthogonal pair directed, partial genetic code reassignments of individual sense codons to tyrosine reported here offer a precise tool to investigate the plasticity of the genetic code. We examine the efficiency of reassignment of specific codons in relation to experimentally-based predictions of aminoacylation efficiency of the orthogonal tRNAs with altered anticodons and the concentrations of competing tRNAs. We find that the E. coli translational apparatus is highly balanced, with no single factor predicting the efficiency of codon reassignment. We evaluate the ability of the introduced orthogonal tRNAs to decode only the targeted codon, leaving closely-related codons available to encode canonical amino acids. The observed levels of discrimination largely follow the predictions of the wobble rules. Some anomalous decoding properties of orthogonal tRNA species suggest modification of the anticodon by endogenous E. coli enzymes. The extent to which the M. jannaschii tyrosyl orthogonal pair can be used to infiltrate the genetic code of E. coli at wobble codons has additional practical implications. 
Measurements of the global effects of translation system modifications where the reading of specific sense codons
is altered, including the extent to which a given sense codon is naturally subject to errors, contribute to understanding the molecular mechanisms of antibiotic function, translation-related diseases, and certain cancers.1, 25, 26 Measured cellular growth effects complement less specific directed missense incorporation studies which utilize aaRSs with defective editing domains27-34 or stringent and relaxed ribosome modifications.35 The measured reassignment efficiencies of the various codon-specific, orthogonal pair directed sense codon reassignment systems extrapolated to the proteome suggest on the order of hundreds of thousands to millions of missense incorporations, an approximate 100% to 1000% increase over the number of missense errors that occur as a result of normal cellular background missense incorporation over the course of a single cell generation. The cells appear to be largely tolerant of these missense errors; only a weak correlation between the estimated number of proteome-wide replacements and overall system fitness is observed.

MATERIALS AND METHODS

The fluorescence-based screen for sense codon reassignment has been described.5 Detailed experimental protocols, including methods for vector construction and mutagenesis, measurement of system growth and fluorescence, calculation of sense codon reassignment efficiencies, and calculation of system fitness are included in the Supporting Information. The Supporting Information also includes oligonucleotide primer sequences, detailed cell strain information, and general reagents and materials.

RESULTS

The efficiency of sense codon reassignment was evaluated for 20 of 21 codons read via wobble interactions by E. coli tRNAs using a fluorescence-based screen. The E. coli wobble codons may be grouped into three classes based on codon-anticodon interactions. The majority of E. coli
tRNAs that read codons through wobble base pairing, 15 of 21, employ a guanosine or the modified G, queuosine, in position 34 of the anticodon. The tRNA species with G/Q34 read both C- and U-ending codons in 15 of 16 codon boxes. The single exception for decoding C- and U-ending codons in E. coli is the arginine codon box in which a tRNA with a modified A34 reads the CGU, CGC, and CGA codons. To reassign the meaning of 14 of 15 U-ending wobble codons, an orthogonal tRNA containing an appropriate Watson-Crick base pairing A34 anticodon is supplied to compete with the endogenous G/Q34 E. coli tRNA. Estimates of the energetic difference between G/U wobble and A/U Watson-Crick pairs in RNA suggest that A/U pairs are energetically preferred to G/U pairs by approximately 1.5-2.0 kcal/mol.12 Reassignment of the tyrosine UAU codon (decoded by an E. coli tRNA with a QUA anticodon) cannot be evaluated with this screen.

The second set of E. coli wobble-reading tRNAs contain a modified uracil at position 34 and read both A- and G-ending codons. This group includes tRNAs that decode valine GUA and GUG, alanine GCA and GCG, lysine AAA and AAG, and glutamic acid GAA and GAG codons. The modified uridine at position 34 is 5-methylaminomethyl-2-thiouridine for Lys and Glu tRNAs and uridine-5-oxyacetic acid for Val and Ala tRNAs. The 5-methylaminomethyl-2-thiouridine at position 34 restricts decoding to the A- and G-ending codons; tRNAs with uridine-5-oxyacetic acid at position 34 are also able to decode the U-ending codon. For this reason, uridine-5-oxyacetic acid is found at position 34 only in tRNAs in which the entire box encodes the same amino acid. In order to reassign G-ending wobble codons, the introduced orthogonal tRNAs contain C34 anticodons to provide an energetically favorable C34/G3 pairing. Unmodified Watson-Crick G/C pairs are 2.0 kcal/mol more stable than G/U wobble pairs.12 Kinetic data evaluating tRNA selection suggest that there is an approximate twofold difference
in the rate of selection of G/C Watson-Crick vs G/U wobble pairs, indicating that energetics may play a role in fine tuning tRNA selection.36

The final two codons read via wobble interactions in E. coli are two of the three arginine 4-box codons read by the tRNA with a modified A at position 34, CGC and CGA. tRNAArg2 is the only instance of an A34 tRNA in the E. coli genetic code, and adenosine 34 is modified to inosine to decode the CGU, CGC, and CGA codons. The I/U and I/C pairing interactions are expected to be of approximately equal energy and of comparable energy to A/U pairs. The I/C pair is expected to be slightly more energetically favorable as inosine-containing RNA is reverse transcribed as C rather than U.37, 38 Reading of I/A pairs has been shown to be less effective than other wobble interactions.39 An orthogonal tRNA with either a G34 or U34 was introduced to reassign the CGC and CGA codons, respectively.

Principle of the Fluorescence-Based Screen

The screen evaluates the extent to which an introduced orthogonal pair can reassign the meaning of a test codon encoding the essential fluorophore tyrosine position in GFP (Figure 1). The fluorescence-based screen takes advantage of the absolute requirement of tyrosine for GFP fluorophore formation. Unlike stop codon suppression in which missed incorporations lead to truncated proteins, sense codon reassignment systems produce a heterogeneous mixture of full length proteins. The observed fluorescence represents a direct measurement of the ability of the orthogonal tRNA with an altered anticodon to compete against the endogenous E. coli tRNAs to read the sense codon specifying the fluorophore tyrosine position in the GFP reporter. The measured sense codon reassignment efficiencies are expected to be a complex function of the levels of orthogonal tRNA and aaRS present, the efficiency of the interaction between the
orthogonal aaRS and anticodon-altered tRNAs, the amount of competing endogenous tRNA present, and the differences in codon-anticodon interaction energies between competing tRNAs. The relative efficiency of the reassignment of different codons is expected to be a map of the relative fit of a particular orthogonal system (aaRS and tRNA with an altered anticodon) for the reassignment of a given codon in the particular organism.

The fluorescence-based screen provides a quantitative measure of the extent of reassignment at each test codon by bracketing the observed GFP signal for a codon reassignment measurement between a “100% fluorescence” reference value produced by expressing superfolder GFP with a tyrosine UAC codon specifying the fluorophore and a “0% fluorescence” reference value established through expressing superfolder GFP with a non-tyrosine codon specifying the fluorophore. Both the 100% and 0% fluorescence reference systems include a plasmid expressing the M. jannaschii aaRS and amber-suppressing tRNA (CUA anticodon) to maintain a metabolic burden on the cells equivalent to that of the systems under evaluation for sense codon reassignment. Variation across reassignment systems was further reduced by utilizing a suite of GFP constructs in which the DNA sequence had been optimized to remove the E. coli wobble codons. The codon specifying the fluorophore tyrosine in each GFP reporter was the only occurrence of the wobble codon under evaluation for sense codon reassignment in the GFP gene. The reassignment systems targeting each sense codon differ only in the anticodon of the orthogonal tRNA and the codon specifying the fluorophore tyrosine position in the reporter. These differences are expected to have a minimal effect on the amount of orthogonal tRNA produced and on the amount of GFP mRNA present. The sequences of the GFP reporter gene, the primers
used to engineer each position 66 codon variant, and the primers used to change the sequence of the tRNA anticodon are given in the Supporting Information.

Significance of Measured Sense Codon Reassignment Efficiencies

The data analysis process is described in detail in the Supporting Information. Briefly, each reported reassignment efficiency represents the average of the reassignment efficiency of at least 6 individual clones. The number of biological replicates analyzed for each targeted codon is provided in Table 2. Sense codon reassignment measurements are consistent across multiple experiments on different days. The composite standard deviation in reassignment percentage for more than 50 individual “0% fluorescent” clones across 9 separate experiments is 0.05%, making the detection limit of the in cell fluorescence assay for sense codon reassignment efficiency 0.15%. The two E. coli wobble codons least efficiently reassigned by the M. jannaschii tyrosine tRNA/aaRS pair, Val GUU and Gly GGU, are measured at 0.8 ± 0.1% and 1.1 ± 0.1% efficiency, above the limit of quantification of the screen.
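The bracketing calculation and the detection limit described above can be illustrated with a short sketch. This is not the authors' analysis procedure, which is given in the Supporting Information; it assumes a simple linear interpolation between the 0% and 100% reference signals, and it assumes that the 0.15% detection limit corresponds to three times the 0.05% composite standard deviation (an inference from the two reported numbers). The function names and the fluorescence readings are hypothetical.

```python
from statistics import mean


def reassignment_efficiency(sample, ref_0, ref_100):
    """Percent reassignment by linear bracketing (assumed form).

    sample, ref_0 and ref_100 are fluorescence signals for the test system,
    the "0% fluorescence" reference and the "100% fluorescence" reference.
    """
    if ref_100 <= ref_0:
        raise ValueError("100% reference must exceed the 0% reference")
    return 100.0 * (sample - ref_0) / (ref_100 - ref_0)


def detection_limit(sd_zero_percent, k=3.0):
    """Detection limit as k standard deviations of the 0% clones (k=3 assumed)."""
    return k * sd_zero_percent


# Hypothetical signals for six clones of one reassignment system.
clones = [1040.0, 1100.0, 1070.0, 1090.0, 1060.0, 1080.0]
ref_0, ref_100 = 1000.0, 9000.0

efficiencies = [reassignment_efficiency(c, ref_0, ref_100) for c in clones]
average = mean(efficiencies)   # average over the biological replicates
limit = detection_limit(0.05)  # 0.05% composite SD reported in the text

print(f"mean efficiency: {average:.2f}%, detection limit: {limit:.2f}%")
print("above detection limit" if average > limit else "below detection (B.D.)")
```

With the reported composite standard deviation of 0.05%, the assumed three-sigma rule reproduces the 0.15% detection limit stated in the text.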

Figure 2. Sense codon reassignment efficiency of 20 E. coli wobble codons evaluated using an in cell fluorescence-based screen. In each system, a reporter vector containing the targeted codon specifying the tyrosine residue required for GFP fluorophore formation is combined with
a vector containing the orthogonal M. jannaschii tyrosyl aminoacyl tRNA synthetase and tRNA with the anticodon shown in a shaded box as “Orthog anticodon”. Efficiencies are calculated relative to a “100% fluorescence control”, a GFP reporter with a tyrosine codon specifying the fluorophore. Calculated efficiencies represent the average of at least 6 biological replicates for each sense codon reassigning system. The number of biological replicates that contributes to the calculated efficiency for each targeted codon is provided in Table 2.

Each Evaluated E. coli Wobble Codon is Partially Reassignable

Significantly, every E. coli wobble codon analyzed may be partially reassigned by supplying an orthogonal tRNA with an anticodon that interacts with the targeted codon through Watson-Crick base pairing interactions (Figure 2). Reassignment efficiencies range from 0.8 ± 0.1% (8 out of every 1000 incorporations at the targeted codon) to nearly 41% (40.8 ± 3.1%, 408 out of every 1000 incorporation events at the targeted codon).

In addition to measurements of sense codon reassignment efficiencies at targeted codons, the discrimination between targeted and non-targeted codons by the introduced orthogonal tRNAs was evaluated (Table 1). The wobble rules suggest that codon reassignment by an introduced A34 tRNA should be strongly biased toward the targeted U-ending codon, reading codons ending in U > C >> G > A.40-42 The complement of E. coli tRNAs does not include any tRNAs with non-modified A34, and non-modified A34 tRNAs are largely absent from all kingdoms of life. The codon reading properties of the few systems containing unmodified A34 tRNAs are not well studied.43, 44 As a result, very little is known about the in vivo codon preferences of A34 tRNAs. The observed levels of discrimination between A34/U3 and A34/C3 pairs largely follow the predictions of the wobble rules and fall into three categories: orthogonal tRNAs with good discrimination, orthogonal tRNAs with strong evidence of a modification that explains the expanded codon reading, and orthogonal tRNAs with poor codon discrimination.
Table 1. Discrimination of closely-related codons by introduced orthogonal M. jannaschii tRNAs.

tRNA Anticodon | Amino acid targeted | Reassignment observed at YZU (a) | Reassignment observed at YZC | Reassignment observed at YZA | Reassignment observed at YZG
tRNAOptAAA | Phe | 3.2 ± 0.3%* | B.D. (b) | ----- (c) | -----
tRNAOptAAG | Leu | 4.2 ± 0.3%* | 1.3 ± 0.03% | B.D. | -----
tRNAOptAAU | Ile | 4.3 ± 0.5%* | 0.4 ± 0.1% | B.D. | 0.3 ± 0.03%
tRNAOptAAC | Val | 0.8 ± 0.1%* | 0.6 ± 0.03% | B.D. | 0.3 ± 0.05%
tRNAOptAGA | Ser | 19.6 ± 1.2%* | 1.3 ± 0.07% | ----- | -----
tRNAOptAGG | Pro | 12.2 ± 0.5%* | 10.2 ± 0.5% | 0.9 ± 0.2% | -----
tRNAOptAGU | Thr | 10.5 ± 0.4%* | 1.3 ± 0.1% | ----- | -----
tRNAOptAGC | Ala | 1.3 ± 0.1%* | 1.1 ± 0.1% | B.D. | B.D.
tRNAOptAUG | His | 6.1 ± 0.4%* | 2.9 ± 0.1% | B.D. | -----
tRNAOptAUU | Asn | 7.5 ± 0.2%* | B.D. | ----- | -----
tRNAOptAUC | Asp | 3.7 ± 0.5%* | B.D. | B.D. | B.D.
tRNAOptACA | Cys | 3.5 ± 0.1%* | B.D. | ----- | -----
tRNAOptACU | Ser | 9.1 ± 2.6%* | B.D. | ----- | -----
tRNAOptACC | Gly | 1.1 ± 0.1%* | 0.5 ± 0.07% | ----- | -----
tRNAOptCAC | Val | 0.3 ± 0.06% | B.D. | 0.4 ± 0.04% | 2.1 ± 0.1%*
tRNAOptCGC | Ala | 0.3 ± 0.1% | 0.8 ± 0.2% | 1.1 ± 0.2% | 9.2 ± 0.6%*
tRNAOptCUU | Lys | ----- | ----- | B.D. | 7.7 ± 0.7%*
tRNAOptCUC | Glu | B.D. | B.D. | 0.8 ± 0.08% | 19.8 ± 0.7%*
tRNAOptGCG | Arg | 7.5 ± 0.7% | 7.6 ± 0.7%* | B.D. | B.D.
tRNAOptUCG | Arg | 0.5 ± 0.05% | 0.4 ± 0.05% | 40.8 ± 3.1%* | 33.0 ± 1.7%

Anticodons with A or G at position 34 were typically evaluated against codons ending in U and C; anticodons with U or C at position 34 were typically evaluated against codons ending in A and G. Reassignment efficiencies marked with an asterisk (emboldened and italicized in the original table) are those for the codon targeted for reassignment, which fully base pairs to the orthogonal tRNA via Watson-Crick interactions. The number of biological replicates that comprise the reassignment efficiency for each targeted codon is given in Table 2. The reassignment efficiencies for each non-targeted codon are the average of between 6 and 12 biological replicates, with the exception of reassignment of the Arg CGU codon by tRNAOptGCG, for which 18 biological replicates were evaluated.

(a) For continuity, “Y” and “Z” are used to represent the first two positions of the codon in the column headers. The sense codons evaluated with a single tRNA in each row are determined by substituting the complement of the nucleobases in the second and third anticodon positions for Y and Z. The third codon position is specified in the column headers. For example, row 1 shows sense codon reassignment by the M. jannaschii tRNAOptAAA. Data in the YZU column are for reassignment at the UUU codon. Data in the YZC column are for reassignment at the UUC codon.

(b) B.D. indicates that the codon was evaluated with the specified tRNA, and the measurement was below the detection limit of the in cell assay (0.15%).

(c) “-----” indicates that the codon was not evaluated for reassignment by the specified tRNA.
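The Y/Z convention in the footnote to Table 1 can be made concrete with a small sketch. It assumes anticodons are written 5' to 3' (positions 34, 35, 36), so that codon position 1 pairs with anticodon position 36 and codon position 2 with position 35; the function names are hypothetical and are introduced only for illustration.

```python
# Watson-Crick complements for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def codon_prefix(anticodon):
    """Return the first two codon bases (Y and Z) for a 5'->3' anticodon."""
    n34, n35, n36 = anticodon
    # Codon position 1 pairs with anticodon position 36, position 2 with 35.
    return COMPLEMENT[n36] + COMPLEMENT[n35]


def codons_in_row(anticodon):
    """Map each Table 1 column (third codon base) to the codon it refers to."""
    yz = codon_prefix(anticodon)
    return {third: yz + third for third in "UCAG"}


def targeted_codon(anticodon):
    """Codon that pairs with all three anticodon bases via Watson-Crick pairs."""
    return codon_prefix(anticodon) + COMPLEMENT[anticodon[0]]


for anticodon in ("AAA", "AGG", "CUC", "GCG", "UCG"):
    print(anticodon, codons_in_row(anticodon), "targeted:", targeted_codon(anticodon))
```

For tRNAOptAAA this gives UUU in the YZU column and UUC in the YZC column, matching the example in the footnote.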

DISCUSSION

The extent to which the degeneracy of the genetic code can be broken is not obvious a priori, and the relative quantitative importance of the various factors that determine the fidelity of translation is not known. Utilizing an orthogonal tRNA/aaRS pair to quantify the extent to which 20 E. coli wobble codons are reassignable provides a unique data set that will contribute
to further unraveling the in vivo importance of tRNA abundance, aminoacylation level, elongation factor binding efficiency, tRNA modifications, and codon-anticodon interaction energy in determining translational fidelity. The following sections describe (1) an analysis of measured sense codon reassignment efficiencies with respect to expected reductions in orthogonal tRNA aminoacylation as a result of altering the sequence of the anticodon; (2) an analysis of measured sense codon reassignment efficiencies with respect to competing E. coli tRNA abundance; (3) a discussion of the ability of introduced orthogonal tRNAs to discriminate between targeted and non-targeted codons; and (4) an analysis of the effect of sense codon reassignment on cell health as measured by growth rate and carrying capacity reductions.

Presumably, sense codons which are reassigned to tyrosine with higher efficiency will be more productive targets for reassignment to non-canonical amino acids (ncAAs). Any sense codon for which tyrosine incorporation is detected represents a starting point for improvement through directed evolution. Several sense codons, including Arg CGA, Glu GAG, Ser UCU, and Pro CCU, represent promising targets for incorporation of ncAAs using M. jannaschii tRNA/aaRS pairs already evolved for incorporation of ncAAs in response to amber stop codons. Existing systems may be immediately repurposed or further optimized to incorporate ncAAs in response to sense codons. In addition to improving the interactions between the M. jannaschii tRNA with an altered anticodon and its aaRS, adjustments to orthogonal tRNA/aaRS expression levels and/or expression of competing endogenous tRNAs are also expected to contribute to improving the efficiency of sense codon reassignment.

The precise thermodynamic differences in codon-anticodon interaction energies and the relative importance of codon-anticodon interaction energy on tRNA selection are not well understood.
One of the guiding ideas behind genetic code expansion via breaking the degeneracy
of the genetic code is that differences in codon-anticodon binding energies between introduced orthogonal tRNAs capable of Watson-Crick base pairing and endogenous tRNAs that utilize wobble base pairing can be exploited to bias the incorporation of amino acids.2, 11, 13

Sense Codon Reassignment Efficiencies Weakly Correlate with Predicted Anticodon-Dependent Reduction in Aminoacylation Efficiency

The extent to which an amino acid is introduced into a growing protein chain depends on two sets of reactions: the charging of the amino acid to the appropriate tRNA by an aaRS and the utilization of the aminoacylated tRNA by the ribosome in translation. A great deal of effort has been devoted to understanding the particular idiosyncrasies of the recognition of tRNA molecules by their cognate aaRSs.45-47 The precise elements of tRNA structure and sequence that are recognized by an aminoacyl tRNA synthetase constitute the tRNA identity.48 For most tRNAs, important identity elements are found in the acceptor stem and anticodon stem loop. Each tRNA species contains both positive identity elements that stabilize the correct tRNA/aaRS pair and negative identity elements that destabilize incorrect tRNA/aaRS pairs. Sense codon reassignment requires that the changes made to the orthogonal tRNA to allow Watson-Crick base pairing interactions with the targeted codon do not abrogate the interactions that direct the aaRS to charge an amino acid onto the tRNA. Poor charging of an appropriate amino acid to its tRNA functionally lowers the effective concentration of aminoacylated tRNA that can compete for decoding the targeted codon. The effect of changing the orthogonal tRNA anticodon sequence on aminoacylation efficiency was estimated based on an assumption of energetic additivity in the effects of measured single base changes from a detailed experimental evaluation of the identity elements of the M.


jannaschii orthogonal pair.49 In the single instance where multiple simultaneous changes to the anticodon were evaluated (GUA to CAU), the observed decrease in activation of 880-fold closely approximates the predicted decrease of 1117-fold that results from combining the effects of individual single nucleotide changes in the anticodon. The base changes required to convert the M. jannaschii tRNA from a GUA anticodon to an amber suppressing CUA anticodon result in a 97-fold decrease in the recognition of the tRNA species by the aaRS. These changes are well tolerated, as the M. jannaschii tRNA/aaRS pair is a very effective nonsense suppressor. The M. jannaschii tRNA with a CUA anticodon incorporates tyrosine in response to an amber stop codon in the fluorophore position of a GFP reporter with 86% efficiency. The efficiencies of introduction of non-canonical amino acids in response to the amber stop codon using M. jannaschii tRNA/aaRS pair variants are typically in the 10-30% range.50 The variant of the M. jannaschii tRNA used for evaluation of sense codon reassignment, tRNAOpt, was previously optimized for amber codon suppression.4, 51 The changes introduced in tRNAOpt are not expected to affect recognition by the aaRS. The expected aminoacylation efficiency of orthogonal tRNAs with altered anticodons is presented in relation to the aminoacylation efficiency of the M. jannaschii tRNA with a CUA anticodon for amber suppression (Table 2). The predicted effect of converting the M. jannaschii tRNA CUA anticodon to an anticodon that Watson-Crick base pairs with each E. coli wobble codon ranges from an estimated 3-fold improvement for tRNAOptGCG to target Arg CGC to an over 30-fold reduction for tRNAOptAAG reassigning Leu CUU and tRNAOptCAC reassigning Val GUG. Predicted absolute changes to aminoacylation efficiency as a result of changing the anticodon are presented in the Supporting Information (Table S4).
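The additivity assumption used for these predictions can be illustrated with a short sketch. If each single-nucleotide change contributes an independent free-energy penalty, the fold reductions multiply, and a prediction can be re-expressed relative to the 97-fold reduction of the amber suppressor. The function names, the per-position fold values in `example_folds`, and the temperature of 310.15 K are illustrative assumptions; they are not the measured values from the identity-element study or from Table S4.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)

def combined_fold_reduction(single_base_folds):
    """Combine single-nucleotide fold reductions, assuming energetic additivity.

    Additive free-energy penalties correspond to multiplicative fold changes.
    """
    return math.prod(single_base_folds)

def fold_to_ddg(fold, temp_k=310.15):
    """Apparent free-energy penalty (kcal/mol) for a given fold reduction.

    The temperature (37 C) is an assumption made for illustration.
    """
    return R_KCAL * temp_k * math.log(fold)

def relative_to_amber(absolute_fold, amber_fold=97.0):
    """Express a fold reduction relative to the GUA -> CUA amber suppressor."""
    return absolute_fold / amber_fold

# Hypothetical per-position fold reductions (NOT measured values).
example_folds = [12.0, 5.0, 2.0]
total = combined_fold_reduction(example_folds)

print(f"combined fold reduction: {total:.0f}")
print(f"relative to amber suppressor: {relative_to_amber(total):.2f}")
# Gap between the predicted (1117-fold) and observed (880-fold) GUA -> CAU
# reductions quoted in the text, expressed as an apparent free energy.
print(f"prediction gap: {fold_to_ddg(1117 / 880):.2f} kcal/mol")
```

Under these assumptions, the difference between the predicted and observed reductions for the GUA to CAU change corresponds to a small fraction of a kcal/mol, which is one way to read the statement that the two values closely agree.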


Table 2. Competing E. coli tRNA and predicted reduction in aminoacylation efficiency of the orthogonal tRNA for E. coli wobble codons evaluated for sense codon reassignment.

| tRNA anticodon | Sense codon targeted | Reassignment of targeted codon (a) | E. coli codon usage (b) | Competing E. coli tRNAs (c) | Predicted fold reduction in aminoacylation efficiency (d) |
|---|---|---|---|---|---|
| tRNAOptAAA | Phe UUU | 3.2 ± 0.3% (6) | 0.57 | 4440 | 8.9 |
| tRNAOptAAG | Leu CUU | 4.2 ± 0.3% (15) | 0.10 | 7400 | 35.6 |
| tRNAOptAAU | Ile AUU | 4.3 ± 0.5% (15) | 0.51 | 17390 | 10.7 |
| tRNAOptAAC | Val GUU | 0.8 ± 0.1% (12) | 0.26 | 23680 | 31.2 |
| tRNAOptAGA | Ser UCU | 19.6 ± 1.2% (15) | 0.15 | 9620 | 6.6 |
| tRNAOptAGG | Pro CCU | 12.2 ± 0.5% (21) | 0.16 | 5920 | 26.4 |
| tRNAOptAGU | Thr ACU | 10.5 ± 0.4% (9) | 0.17 | 9250 | 7.9 |
| tRNAOptAGC | Ala GCU | 1.3 ± 0.1% (18) | 0.16 | 19240 | 23.1 |
| tRNAOptAUG | His CAU | 6.1 ± 0.4% (12) | 0.57 | 2960 | 3.7 |
| tRNAOptAUU | Asn AAU | 7.5 ± 0.2% (6) | 0.45 | 5550 | 1.1 |
| tRNAOptAUC | Asp GAU | 3.7 ± 0.5% (12) | 0.63 | 11100 | 3.2 |
| tRNAOptACA | Cys UGU | 3.5 ± 0.1% (9) | 0.45 | 6660 | 7.7 |
| tRNAOptACU | Ser AGU | 9.1 ± 2.6% (12) | 0.15 | 5180 | 9.2 |
| tRNAOptACC | Gly GGU | 1.1 ± 0.1% (12) | 0.34 | 18500 | 27.0 |
| tRNAOptCAC | Val GUG | 2.1 ± 0.1% (12) | 0.37 | 17760 | 33.6 |
| tRNAOptCGC | Ala GCG | 9.2 ± 0.6% (18) | 0.36 | 16280 | 24.8 |
| tRNAOptCUU | Lys AAG | 7.7 ± 0.7% (6) | 0.23 | 8140 | 1.2 |
| tRNAOptCUC | Glu GAG | 19.8 ± 0.7% (12) | 0.31 | 22570 | 3.5 |
| tRNAOptGCG | Arg CGC | 7.6 ± 0.7% (12) | 0.40 | 22200 | 0.3 |
| tRNAOptUCG | Arg CGA | 40.8 ± 3.1% (18) | 0.06 | 22200 | 3.8 |

(a) The number of biological replicates included in the average reassignment efficiency for a given sense codon reassignment system is given in parentheses. The error on each reassignment efficiency represents the standard deviation across the indicated number of biological replicates. Across more than 90 observations of 100% fluorescent reference clones, the standard deviation is 3.7%.
(b) As a fraction of the total number of codons encoding a given canonical amino acid. Codon usage is based on http://openwetware.org/wiki/Escherichia_coli/Codon_usage.
(c) Number of competing tRNAs calculated utilizing reported tRNA-to-ribosome ratios61 for cells with a 37 minute doubling time and an estimated 37,000 ribosomes.
(d) Relative to the change from GUA (tyrosine) to CUA for amber stop codon suppression. The predicted reduction in aminoacylation efficiency of the M. jannaschii tyrosyl tRNA with a given anticodon is calculated based on an assumption of additivity of the effect of measured, single nucleotide mutations.49
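The values in Table 2 are sufficient to repeat the two linear regressions summarized in Figure 3. The sketch below fits reassignment efficiency against (A) the predicted fold reduction in aminoacylation and (B) the number of competing tRNAs by ordinary least squares and reports R². The choice of an unweighted fit on untransformed values is an assumption; the text does not state how the published regressions were performed, so the results should only be expected to land close to the reported R² values.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept; returns R^2 as well."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared

# (codon, reassignment efficiency %, competing tRNAs, predicted fold reduction)
# Values transcribed from Table 2.
TABLE2 = [
    ("Phe UUU", 3.2, 4440, 8.9), ("Leu CUU", 4.2, 7400, 35.6),
    ("Ile AUU", 4.3, 17390, 10.7), ("Val GUU", 0.8, 23680, 31.2),
    ("Ser UCU", 19.6, 9620, 6.6), ("Pro CCU", 12.2, 5920, 26.4),
    ("Thr ACU", 10.5, 9250, 7.9), ("Ala GCU", 1.3, 19240, 23.1),
    ("His CAU", 6.1, 2960, 3.7), ("Asn AAU", 7.5, 5550, 1.1),
    ("Asp GAU", 3.7, 11100, 3.2), ("Cys UGU", 3.5, 6660, 7.7),
    ("Ser AGU", 9.1, 5180, 9.2), ("Gly GGU", 1.1, 18500, 27.0),
    ("Val GUG", 2.1, 17760, 33.6), ("Ala GCG", 9.2, 16280, 24.8),
    ("Lys AAG", 7.7, 8140, 1.2), ("Glu GAG", 19.8, 22570, 3.5),
    ("Arg CGC", 7.6, 22200, 0.3), ("Arg CGA", 40.8, 22200, 3.8),
]

efficiency = [row[1] for row in TABLE2]
competing = [row[2] for row in TABLE2]
fold_reduction = [row[3] for row in TABLE2]

slope_a, _, r2_a = linear_fit(fold_reduction, efficiency)
slope_b, _, r2_b = linear_fit(competing, efficiency)

print(f"A) efficiency vs fold reduction: slope = {slope_a:.3f}, R^2 = {r2_a:.3f}")
print(f"B) efficiency vs competing tRNAs: slope = {slope_b:.2e}, R^2 = {r2_b:.3f}")
```

Because the Arg CGA point (40.8%) lies far from the others, it has a large influence on both fits; rerunning the sketch with that row removed is a quick way to gauge how much of each weak correlation depends on it.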

Measured sense codon reassignment efficiencies do not strongly correlate with predicted changes to aminoacylation efficiency as a result of changing the tRNA anticodon (Figure 3a). The single anticodon for which aaRS recognition is predicted to be better than that for the amber suppressor, GCG, reassigns the Arg CGC codon at 7.6 ± 0.7%, the ninth highest reassignment efficiency of the 20 wobble codons evaluated. tRNA variants with either an AUU anticodon to read Asn AAU codons or CUU anticodon to read Lys AAG codons are predicted to be
aminoacylated at an efficiency similar to that of the amber suppressing tRNA with a CUA anticodon. Sense codon reassignment efficiencies for tRNAOptAUU and tRNAOptCUU are effectively the same as reassignment of Arg CGC, 7.5 ± 0.2% and 7.7 ± 0.7%, respectively. These three orthogonal tRNAs are predicted to be the most effectively aminoacylated of the anticodons tested; however, none of the three tRNAs reassigns its targeted codon in the top 35% of observed reassignment efficiencies.

Figure 3. Analysis of sense codon reassignment efficiency in relation to predicted reduction in aminoacylation or number of competing endogenous tRNAs. A) Regression analysis of sense codon reassignment efficiencies and predicted reduction in aminoacylation of the M. jannaschii tRNA as a function of changes to the nucleotides in the anticodon. Values for reduction in aminoacylation are shown relative to the measured reduction in aminoacylation as a result of the change from a GUA anticodon (wild type) to CUA (for amber stop codon suppression).49 Green diamonds are the observed sense codon reassignment efficiencies; pink squares indicate the predicted values from a linear regression analysis. R² = 0.146. B) Regression analysis of sense codon reassignment efficiencies and estimated number of E. coli tRNAs competing to decode the targeted codon. Green diamonds are the observed sense codon reassignment efficiencies; pink squares indicate the predicted values from a linear regression analysis. R² = 0.04. The standard deviation of measured sense codon reassignment efficiencies for some systems is smaller than the size of the green diamond marker.

Seven E. coli wobble codons are reassigned with efficiencies greater than 9%. In each of these cases, aminoacylation efficiency is predicted to be worse than aminoacylation of tRNAOptCUA for


amber suppression. Five of the seven tRNAs are predicted to be aminoacylated with an efficiency 3.5-fold to 9.2-fold worse than tRNAOptCUA, within an order of magnitude of the effect of changing the GUA anticodon to a CUA anticodon. Aminoacylation of tRNAOptAGG for Pro CCU is predicted to be 26.4-fold worse than aminoacylation of tRNAOptCUA, yet reassignment of Pro CCU occurs at 12.2 ± 0.5%, the fourth highest reassignment efficiency observed. Similarly, aminoacylation of tRNAOptCGC for Ala GCG is predicted to be 24.8-fold worse than aminoacylation of tRNAOptCUA, and reassignment of Ala GCG occurs at 9.2 ± 0.6%. Although the trend suggests that sense codon reassignment efficiency decreases as aminoacylation efficiency decreases, the linear correlation is not strong. Aminoacylation efficiency is likely a contributing, but not dominant, factor in sense codon reassignment.

Sense Codon Reassignment Efficiencies Weakly Correlate with Competing E. coli tRNA Abundance

The second set of reactions in sense codon reassignment involves competition between the charged orthogonal tRNA and the organism’s endogenous tRNAs for function in the multiple ribosome-catalyzed steps of peptide bond formation. The incorporation rate of reprogrammed vs natural amino acid is related to the rates of acceptance of the competing tRNAs that are affected by the codon-anticodon interaction,36, 52 modulated by RNA modifications and sequences controlling the precise geometry and flexibility of the anticodon loop,6, 53 and interactions of the tRNA with the ribosome outside of the codon-anticodon helix.54 Experimental information on the relative contributions of each of the above factors to amino acid incorporation efficiency is sparse.36, 55


The ribosome and the components of the translational apparatus have evolved to be permissive of diversity in the selection of tRNAs; this permissiveness is the result of evolutionary pressure for both approximately equal binding of aminoacylated tRNAs by EF-Tu and an approximately equivalent rate of selection of cognate codon-anticodon pairings.36, 56 The approximate equivalence of acceptance rate across chemically different amino acids and between the various codon-anticodon pairs is the result of compensating selections at multiple steps in the process. The attached amino acid and tRNA body both contribute to EF-Tu binding, and evolution has resulted in stronger interacting amino acids being attached to weaker interacting tRNA bodies.56, 57

The result of these competing interactions is that all natural aminoacylated tRNAs bind to EF-Tu with similar affinities. Likewise, the ribosome interacts with tRNAs beyond their anticodon loop to normalize the interactions between A/U rich and G/C rich codon-anticodon pairs.58, 59 The competition between multiple tRNA species that are cognate for a given codon in translation is driven by the stochastic sampling of tRNAs by the ribosome. If all of the contributing factors except concentration were equal, the expectation would be that the incorporation efficiency would be equal to the fractions of competing tRNAs. E. coli tRNA abundances vary across two orders of magnitude from hundreds to tens of thousands of a given tRNA species in a cell. Competing E. coli tRNA abundances were calculated from the experimentally measured tRNA abundances and tRNA/ribosome ratios reported by Dong, Nilsson, and Kurland.60 tRNA/ribosome ratios for E. coli growing at 1.6 doublings per hour, the average doubling rate of the reassignment systems, were utilized for analysis (Table 2). The number of ribosomes per cell at 1.6 doublings per hour was calculated from the cell composition data of Dennis and Bremer.61 Expression levels of the orthogonal tRNAs with each altered anticodon are expected to be similar across sense codon reassignment systems.
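The concentration-only expectation described above can be written down directly. The sketch below converts a tRNA/ribosome ratio into a per-cell count using the 37,000 ribosomes quoted in Table 2, computes the fraction of decoding events expected for an orthogonal tRNA if abundance were the only factor, and inverts that relation to ask how many orthogonal tRNAs would be needed to explain a measured efficiency. The function names are illustrative, the ratio of 0.6 is back-calculated from the Table 2 entry of 22,200 competing tRNAs, and the orthogonal tRNA copy number is not reported in this section, so any value supplied for it is hypothetical.

```python
RIBOSOMES_PER_CELL = 37_000  # estimate quoted in the Table 2 footnote

def competing_trna_count(trna_per_ribosome, ribosomes=RIBOSOMES_PER_CELL):
    """Convert a tRNA/ribosome ratio into an estimated number of tRNAs per cell."""
    return trna_per_ribosome * ribosomes

def expected_fraction(n_orthogonal, n_competing):
    """Fraction of decoding events by the orthogonal tRNA if only abundance mattered."""
    return n_orthogonal / (n_orthogonal + n_competing)

def implied_orthogonal_count(efficiency, n_competing):
    """Orthogonal tRNA copies needed to explain `efficiency` by abundance alone.

    `efficiency` is a fraction between 0 and 1 (exclusive).
    """
    return efficiency / (1.0 - efficiency) * n_competing

# Arg CGA: 40.8% reassignment against ~22,200 competing tRNAs (Table 2).
n_competing = competing_trna_count(0.6)  # ratio back-calculated from Table 2
implied = implied_orthogonal_count(0.408, n_competing)
print(f"competing tRNAs: {n_competing:.0f}")
print(f"orthogonal tRNAs implied by abundance alone: {implied:.0f}")
```

Since orthogonal tRNA expression is expected to be similar across systems, a single copy number would predict efficiencies that track competing tRNA abundance; the spread of measured efficiencies at comparable abundances (for example Arg CGC versus Arg CGA) is what indicates that factors other than concentration contribute.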


The suite of sense codon reassignment measurements enables evaluation of the relationship between the abundance of competing E. coli tRNAs and sense codon reassignment efficiency. The aggregate data examining reassignment of 20 E. coli wobble codons suggest a weak correlation between reassignment efficiency and competing tRNA abundance (Figure 3b). In general, orthogonal tRNAs that compete with more abundant tRNAs have lower reassignment efficiencies, but some sense codons for which E. coli has few tRNAs are reassigned poorly (e.g. Phe UUU and Leu CUU). Conversely, tRNAOptCUC competes against one of the most abundant E. coli tRNAs (with a UUC anticodon) yet reassigns the Glu GAG codon with nearly 20% efficiency, the second highest reassignment efficiency observed for wobble codons. Reassignment of the arginine CGA codon is a particularly striking example of the effectiveness of expanding the genetic code by targeting sense codons for which the codon-anticodon interactions may be markedly improved relative to the competing endogenous tRNA. tRNAOptUCG reassigns the CGA codon with 40.8 ± 3.1% efficiency, the highest measured reassignment efficiency for an E. coli wobble codon by an M. jannaschii tRNA with an altered anticodon. In this case, the orthogonal tRNA competes against one of the most abundant E. coli tRNAs, tRNAArg2. However, the Watson-Crick base pairing facilitated by the orthogonal tRNA (U34/A3) is expected to be extremely favorable compared to the energetically weak I34/A3 pairing of the E. coli tRNA.39 The high efficiency of reassignment despite high levels of competing tRNA supports the conjecture that the I34/A3 pair is read less efficiently than other wobble interactions. While I34/A3 pairs have been shown to be poorly translated, the quantitative effect of this interaction and the implications for genetic code engineering were not known.
The efficiency with which an orthogonal tRNA can compete against the natural I34/A3 interaction suggests that other tRNAs
specifically designed to improve codon-anticodon interactions are potentially fruitful facilitators of sense codon reassignment.

Discrimination between Targeted and Non-targeted Codons by Orthogonal tRNAs

The extent to which a particular targeted codon, as opposed to closely related codons, is decoded by the introduced orthogonal machinery is an important consideration for genetic code expansion. The potential advantage of orthogonal pair-directed sense codon reassignment over methods that enable incorporation of non-canonical amino acids through either editing-defective aaRSs or residue-specific reassignment strategies is that specific codons as opposed to amino acids may be targeted for reassignment. Orthogonal tRNAs targeting the Phe, Asn, Asp, Cys, and Ser 2-box wobble codons strongly discriminate between C- and U-ending codons. The introduced tRNAs reassign the targeted U-ending codon with efficiencies between 3.2 ± 0.3% and 9.1 ± 2.6%. Reassignment of the synonymous non-targeted C-ending codon is at or below the detection limit of the in cell assay, 0.15%. A more sensitive evaluation of purified proteins showed that M. jannaschii tRNAOptAAA reassigned the targeted Phe UUU and non-targeted Phe UUC codons at a ratio of 94:6, the same minimum discrimination as is determined for 3.2 ± 0.3% reassignment of UUU and reassignment of UUC at or below the detection limit of the in cell assay.5 Based on the in cell fluorescence-based screen, discrimination between targeted and non-targeted Asp, Asn, Cys, and Ser 2-box E. coli wobble codons by the orthogonal tRNAs is at least 95:5. tRNAOptACU for reassigning Ser AGU is a particularly promising candidate for sense codon reassignment to ncAAs, as this non-optimized, anticodon-altered tRNA exhibits both a high base level reassignment efficiency of 9.1 ± 2.6% and excellent discrimination of at least 98:2.
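The "at least" discrimination ratios quoted above follow from treating the 0.15% detection limit as an upper bound on reassignment of the non-targeted codon. A minimal sketch of that arithmetic is given below; the function name is illustrative, and using the mean efficiency without its standard deviation is a simplifying assumption.

```python
DETECTION_LIMIT_PCT = 0.15  # detection limit of the in cell assay, in percent

def minimum_discrimination(targeted_pct, nontargeted_pct=DETECTION_LIMIT_PCT):
    """Return (targeted, non-targeted) shares of orthogonal tRNA decoding, in percent.

    When the non-targeted codon is at or below the detection limit, the
    returned targeted share is a lower bound on the true discrimination.
    """
    total = targeted_pct + nontargeted_pct
    return 100.0 * targeted_pct / total, 100.0 * nontargeted_pct / total

for codon, efficiency in [("Phe UUU", 3.2), ("Ser AGU", 9.1)]:
    targeted, nontargeted = minimum_discrimination(efficiency)
    print(f"{codon}: at least {targeted:.1f}:{nontargeted:.1f}")
```

The bound tightens as the targeted efficiency rises, which is why the more efficiently reassigned Ser AGU codon supports a stronger statement than Phe UUU under the same detection limit.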


The orthogonal tRNA variants with ANG anticodons targeting Leu CUU, Pro CCU, and His CAU codons show particularly poor discrimination ratios between the targeted U-ending and non-targeted C-ending codons. Reassignment of the non-targeted C-ending codon accounts for at least 24% of the amino acid incorporations facilitated by the orthogonal tRNAs (Table 1). Partial inosine modification of A34 of the orthogonal tRNAs akin to the modification of tRNAArg2ACG, the only E. coli tRNA transcribed as A34, would explain the poor codon discrimination observed for tRNAOptAAG, tRNAOptAGG, and tRNAOptAUG, as I34/U3 and I34/C3 pairings are energetically favorable and comparable to A34/U3 pairings. The loop sequences surrounding the anticodons of M. jannaschii tRNAOpt and E. coli tRNAArg2 are identical. In the case of the three orthogonal tRNA variants with ANG anticodons, the anticodon loop differs from E. coli tRNAArg2ICG at only position 35, the central anticodon position. Although the recognition elements for E. coli TadA, the enzyme responsible for modifying A34 of tRNAArg2 to inosine, have not yet been fully mapped, the anticodon loop has been shown to be critical.37 The lack of codon discrimination by M. jannaschii tRNAs with ANG anticodons suggests that the position 35 variants remain substrates for E. coli TadA. We previously reported that the anomalous lack of discrimination between synonymous histidine CAU and CAC codons by M. jannaschii tRNAOptAUG was the result of unexpected modification of A34 to inosine.38 Screening a focused anticodon loop library allowed identification of M. jannaschii tRNAOptAUG variants that are not TadA substrates and which display greatly improved discrimination between the two histidine codons.38 A similar strategy could presumably be employed to improve codon discrimination by tRNAOptAAG and tRNAOptAGG. The unexpected decoding of C-ending codons by some A34 M. jannaschii tRNAs suggests that the in vivo activity of TadA is broader than has been previously described. Eukaryotes have
a single enzyme responsible for inosine modification of their eight A34 tRNA species. Eukaryotes utilize I34 and C34 tRNAs in codon 4-boxes, with the exception of the glycine 4-box which includes both U34 and G34 tRNAs, but neither A/I34 nor C34 tRNAs. The eukaryotic complement of tRNAs stands in stark contrast to that of E. coli, which utilizes G34, U34, and usually C34 tRNAs to decode 4-boxes. The complement of prokaryotic tRNAs includes only a single tRNA with an A34 anticodon. E. coli TadA may not need to be specific, and residual activity that allows it to function on A34 tRNAs similar to those found in eukaryotes may be present. For all codon boxes which parallel eukaryotic utilization of an I34 tRNA and the glycine 4-box, the engineered M. jannaschii tRNA with an A34 anticodon shows detectable decoding of the non-targeted, C-ending codon. The measurements that are made using the fluorescence-based screen are as sensitive as or more sensitive than those of other in vitro techniques used to identify inosine modification.37 The limit of identification of inosine modification varies as a result of the incorporation efficiency of the tRNA at the targeted codon. For the least efficient reassigning tRNAs (e.g. 1.0% efficiency), greater than 20% inosine modification would be required to raise the reassignment of the C-ending codon above the limit of detection of the in cell fluorescence-based screen. For more efficient reassigning tRNAs (e.g. 20.0% efficiency), an approximate 1% inosine modification would be detectable with the in cell fluorescence-based screen. E. coli does not have tRNAs that form Watson-Crick base pairs with four G-ending codons. Orthogonal tRNAs with C34 anticodons were introduced in order to target these codons for reassignment. The wobble rules suggest that C34 tRNAs should only decode G-ending codons. Decoding of the Lys AAA codon by tRNAOptCUU is below detection both in cells and using a more stringent purified protein assay.5 Decoding of the Glu GAA codon by tRNAOptCUC is
detectable, but given that the introduced tRNA reads the targeted Glu GAG codon at nearly 20%, the discrimination between G- and A-ending codons is still 96:4. The discrimination between G- and A-ending Val and Ala codons is less effective. In both cases, the orthogonal C34 tRNA unexpectedly reads multiple codons, including the U-ending codons, at levels above the detection limit of the in cell assay. As with other cases in which anomalous decoding has been observed, tRNA modification is a possibility. It is not clear what tRNA modification would lead to the observed decoding by the introduced Val GUG and Ala GCG targeting tRNAs. The C34 modification that allows reading of A-ending codons, lysidine, only occurs in the tRNA that reads the Ile AUA codon. The lysidine modification restricts decoding to the A-ending codon and eliminates the reading of the G-ending codon to prevent incorporation of isoleucine at methionine AUG codons. Based on the reported recognition elements of the modification enzyme, partial lysidine modification is not anticipated.62, 63 Another possible explanation of the anomalous discrimination is the “two out of three” decoding hypothesis, which suggests that codons that contain strong G/C pairs in the first two codon positions can be decoded by tRNAs with apparent “mismatches” at the wobble position.64, 65 This decoding mechanism has been hypothesized and observed in vitro, but since the four codon boxes in which this mechanism is most likely active are 4-boxes (Pro CCN, Ala GCN, Arg CGN, Gly GGN), confirmation of the possible promiscuity of the tRNAs has not been observable in vivo. The “two out of three” decoding hypothesis could explain the lower than expected discrimination observed by orthogonal tRNAs targeting codons in these boxes.


The final set of codons read via wobble interactions in the E. coli genetic code are the arginine CGC and CGA codons. The CGC codon is the only C-ending codon for which a tRNA with a G (or modified G) in the first anticodon position does not exist. Similarly, CGA is the only A-ending codon for which a tRNA with a U (or modified U) in the first anticodon position does not exist. The E. coli tRNA which decodes these two codons has inosine at position 34. The M. jannaschii tRNAOptUCG decodes the targeted CGA codon with nearly 41% efficiency, the highest efficiency of the E. coli wobble codons evaluated. Given the strength of G/U wobble interactions, the observation that M. jannaschii tRNAOptUCG decodes both the Arg CGA and Arg CGG codons is not surprising. The U34/G3 wobble allows tRNAOptUCG to reassign the CGG codon with 33.0 ± 1.7% efficiency. tRNAOptUCG also decodes the CGU and CGC codons at levels that, while low, are above background. This is surprising given that the competing inosine-modified E. coli Arg2 tRNA is among the most abundant tRNAs in the cell; I34/U3 and I34/C3 base pairing is expected to be much stronger than either U34/U3 or U34/C3 pairing. M. jannaschii tRNAOptGCG decodes both the targeted CGC and non-targeted CGU codons with nearly identical efficiency, approximately 7.5%, as is expected for G34/C3 Watson-Crick and G34/U3 wobble decoding. This tRNA has no detectable reassignment of either the CGA or CGG codons.

E. coli are Robust to Proteome-Wide Amino Acid Substitutions

The orthogonal pair directed, partial genetic code reassignments described here enable codon-specific amino acid substitutions across all cellular proteins and offer a precise tool to investigate the plasticity of the genetic code and mutational robustness of proteins. The extent to which cells tolerate proteome-wide reassignments has not been widely investigated. Missense errors are


thought to be generally destabilizing, but translation errors introduced through defective aaRS editing or antibiotic treatment are broadly tolerated by cells. Estimates of mutational robustness from bioinformatics, computational model systems, and experiments on model proteins suggest that mutational robustness is related to maintenance of protein folds and that proteins are generally tolerant to single mutations.66-68 Cells show growth defects, but remain able to grow and divide with 10-18% substitution of valine by the ncAA aminobutyric acid.30 Cells harboring several mutant aaRSs enabling 2-20% proteome-wide reassignments of Pro by Cys, Gln by Glu, Asn by Asp, and Thr by Ser show growth defects only when the protein quality control machinery is additionally attenuated.34 We examined the extent to which proteome-wide, codon-specific tyrosine substitutions were tolerated by E. coli by calculating relative system fitnesses using the measured instantaneous doubling times of each reassignment system (Table 3, Figure 4, Figure S6). The non-reassigning control for growth comparisons included the wild type GFP reporter vector and a translation machinery vector with the orthogonal M. jannaschii aaRS but no tRNA. The baseline system for cell growth differed slightly from the system used to calculate reassignment efficiencies; the 100% fluorescent control for reassignment efficiency calculations included the wild type GFP reporter vector and a translation machinery vector with the orthogonal M. jannaschii aaRS and an amber suppressing tRNA. The difference between these two reference systems was small: approximately 4% in total fluorescence per cell and 10% in growth rate. A comparison of additional systems to evaluate the contribution of antibiotic burden and protein expression to growth rate is included in the Supporting Information (Figure S7). The average doubling time of the non-reassigning control cells calculated from 18 growth curves across several experiments is 29.2 ± 1.8 minutes, indicated by the solid line and blue shading; the pale shading represents one
additional standard deviation in Figure 4. All of the sense codon reassignment systems evaluated exhibit instantaneous doubling times that exceed that of the non-reassigning reference cells by more than 2 standard deviations. The ratio of doubling times between each reassignment system and the non-reassigning control was used as a measure of the cellular fitness reductions induced by sense codon reassignment (Table 3).69 Describing the growth rate reductions as relative fitness enables comparison of the effects of induced missense errors to other types of cellular function perturbation, such as antibiotic challenge and gene deletion. The reductions in E. coli fitness that result from partial genetic code reassignments were compared to the E. coli fitness reductions that result from single and multiple transposon-mediated gene disruptions.70, 71 Random transposon gene disruptions were found to be universally detrimental; each of the 226 transposon insertions analyzed imposed a fitness penalty.71 The effect of combinations of gene deletions appeared, on average, to be additive, but this effect was the result of approximately equal amounts of positive and negative epistasis.70 The fitness effects that are observed as a result of sense codon reassignment approximate the effect of between 2 and 17 non-essential gene disruptions; the average fitness reduction is equivalent to the disruption of approximately 7.5 non-essential genes. For another point of comparison, the average fitness reduction for sense codon reassigning systems is 2.5 times the fitness reduction imposed by using two antibiotics to maintain two different protein expressing plasmids.

Table 3. Cell health and growth profiles for sense codon reassigning systems

tRNA Anticodon   Sense Codon Targeted   Instantaneous Doubling Time (min)   Relative System Fitness   Estimated Number of Proteome-Wide Substitutions per Cell Generation
tRNAOptAAA       Phe UUU                36.9 ± 1.7                          0.79 ± 0.06               6.51E+05
tRNAOptAAG       Leu CUU                36.1 ± 1.7                          0.81 ± 0.06               4.28E+05
tRNAOptAAU       Ile AUU                36.9 ± 2.6                          0.79 ± 0.07               1.42E+06
tRNAOptAAC       Val GUU                37.9 ± 3.1                          0.77 ± 0.08               4.95E+05
tRNAOptAGA       Ser UCU                37.3 ± 2.8                          0.78 ± 0.08               4.58E+06
tRNAOptAGG       Pro CCU                38.1 ± 2.6                          0.77 ± 0.07               1.09E+06
tRNAOptAGU       Thr ACU                39.8 ± 3.6                          0.73 ± 0.08               3.12E+06
tRNAOptAGC       Ala GCU                41.2 ± 3.9                          0.71 ± 0.08               7.89E+05
tRNAOptAUG       His CAU                35.3 ± 4.3                          0.83 ± 0.11               8.16E+05
tRNAOptAUU       Asn AAU                34.3 ± 2.0                          0.85 ± 0.07               1.28E+06
tRNAOptAUC       Asp GAU                36.2 ± 3.8                          0.81 ± 0.10               1.41E+06
tRNAOptACA       Cys UGU                35.8 ± 2.2                          0.82 ± 0.07               1.57E+05
tRNAOptACU       Ser AGU                39.2 ± 1.9                          0.75 ± 0.06               6.04E+05
tRNAOptACC       Gly GGU                38.1 ± 2.6                          0.77 ± 0.07               7.21E+05
tRNAOptCAC       Val GUG                43.5 ± 3.9                          0.67 ± 0.07               6.02E+05
tRNAOptCGC       Ala GCG                58.3 ± 6.8                          0.50 ± 0.07               3.66E+06
tRNAOptCUU       Lys AAG                33.5 ± 3.9                          0.87 ± 0.11               2.75E+06
tRNAOptCUC       Glu GAG                39.9 ± 3.2                          0.73 ± 0.07               5.55E+06
tRNAOptGCG       Arg CGC                36.8 ± 1.8                          0.79 ± 0.06               3.16E+06
tRNAOptUCG       Arg CGA                36.6 ± 2.8                          0.80 ± 0.08               1.05E+06
No tRNA          wild type GFP          29.2 ± 1.8                          1.00 ± 0.06

Analysis of at least 6 biological replicates contributed to determination of the instantaneous doubling times for each sense codon reassigning system. The number of replicates used for each system is listed in Table 2. 18 replicates were measured for the non-reassigning control (reporter vector: wild type GFP; translation machinery vector: M. jannaschii Tyr aaRS and no tRNA gene).
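To illustrate how the relative fitness values in Table 3 follow from the doubling times, the sketch below divides the control doubling time by each system's doubling time. The uncertainty is propagated with the standard first-order formula for a ratio of independent quantities. That choice, and the name `relative_fitness`, are assumptions made here for illustration; the authors' own procedure is described in the Supporting Information and may differ.

```python
import math

# Non-reassigning control doubling time in minutes (mean, standard deviation), from Table 3.
CONTROL_DOUBLING = (29.2, 1.8)

def relative_fitness(doubling, control=CONTROL_DOUBLING):
    """Return (fitness, uncertainty) for a (mean, sd) doubling time in minutes."""
    t, sd_t = doubling
    t0, sd_t0 = control
    fitness = t0 / t
    # Assumed: first-order error propagation for a ratio of independent quantities.
    rel_err = math.sqrt((sd_t / t) ** 2 + (sd_t0 / t0) ** 2)
    return fitness, fitness * rel_err

# Three rows taken from Table 3.
systems = {
    "Phe UUU": (36.9, 1.7),
    "Ala GCG": (58.3, 6.8),
    "Lys AAG": (33.5, 3.9),
}
for codon, doubling in systems.items():
    f, err = relative_fitness(doubling)
    print(f"{codon}: {f:.2f} ± {err:.2f}")
```

For these three rows the sketch gives 0.79 ± 0.06, 0.50 ± 0.07, and 0.87 ± 0.11, which agree with the tabulated relative system fitness values.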

Figure 4. Average instantaneous doubling time of each sense codon reassigning system plotted against the sense codon reassignment efficiency for that system. The line at 29.2 minutes represents the instantaneous doubling time for the non-reassigning, wild type GFP culture (GFP with a Tyr codon at position 66, translation machinery vector without a tRNA gene). The blue shading above and below the unbroken line at 29.2 minutes represents the standard deviation (± 1.8 minutes) of the instantaneous doubling time for the non-reassigning, wild type GFP control. The pale shading represents the second standard deviation for the non-reassigning, wild type GFP control. The inset expands those systems which exhibit sense codon reassignment efficiencies up to 9% with instantaneous doubling times of less than 40 minutes.

Estimates for the numbers of proteome-wide changes occurring in response to directed sense codon reassignments in the course of a single cell generation (Table 3) combined data for the numbers of proteins per cell (5.5 × 10^6), the average size of an E. coli protein (300 amino acids), the relative abundance of “highly expressed” and “less expressed” genes, the codon usage frequency in each set of genes, and the measured sense codon reassignment efficiencies.61, 72-75 The top 100 proteins are arbitrarily termed the highly expressed fraction, and the remaining 3700 proteins constitute the less expressed fraction. The fraction of amino acids in highly expressed vs less expressed proteins is derived from evaluating the percentage of proteins in each group from ribosome profiling experiments, which suggest that the 100 most frequently translated E. coli proteins account for 72% of the total number of cellular proteins.75 Codon usage sets were taken from the GenBank codon usage database for E. coli W3110; the codon usage was broken up between highly expressed genes, which have high codon adaptation scores, and other genes.74 The codon table used to calculate the number of codons in highly expressed genes was based on the codon usage in ribosomal proteins, which represent 55 of the top 100 expressed proteins.75 The codon usage frequencies of the less expressed proteins use the codon table derived from including all E. coli W3110 open reading frames. Combining the codon usage frequency and proportions of the highly and less expressed protein fractions enables a calculation of the number of times each sense codon is translated in a cellular generation. This number, multiplied by the sense codon reassignment efficiency, estimates the number of induced missense substitutions occurring at a given codon in each cellular generation.
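The estimate described above can be sketched as a short calculation. This is a minimal illustration, not the authors' implementation: it assumes that the 72% figure can be applied directly as the share of translated codons belonging to highly expressed proteins, and the function name and the example codon frequencies and efficiency are hypothetical placeholders rather than values from the paper.

```python
PROTEINS_PER_CELL = 5.5e6     # proteins per cell, as stated in the text
AVG_PROTEIN_LENGTH = 300      # amino acids in an average E. coli protein
HIGH_EXPRESSED_SHARE = 0.72   # share of cellular protein from the top 100 proteins

def substitutions_per_generation(freq_high, freq_other, efficiency):
    """Estimate induced missense substitutions at one codon per cell generation.

    freq_high  -- fractional usage of the codon in highly expressed genes
    freq_other -- fractional usage of the codon across all open reading frames
    efficiency -- measured reassignment efficiency as a fraction (0 to 1)
    """
    total_codons = PROTEINS_PER_CELL * AVG_PROTEIN_LENGTH
    # Assumed weighting: blend the two codon tables by protein abundance share.
    weighted_freq = (HIGH_EXPRESSED_SHARE * freq_high
                     + (1 - HIGH_EXPRESSED_SHARE) * freq_other)
    codon_reads = total_codons * weighted_freq
    return codon_reads * efficiency

# Hypothetical inputs for illustration only; not values from the paper.
estimate = substitutions_per_generation(freq_high=0.010, freq_other=0.020,
                                        efficiency=0.05)
print(f"{estimate:.2e}")
```

With real inputs, `freq_high` would come from the ribosomal protein codon table, `freq_other` from the table built on all E. coli W3110 open reading frames, and `efficiency` from the measured reassignment efficiencies.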


Codon-specific, directed sense codon reassignment produces on the order of hundreds of thousands to millions of missense incorporations beyond the number that results from normal cellular background missense incorporation. The normal cellular missense incorporation rate is generally estimated as 1 missense error in 1000 to 10,000 amino acid incorporations.21, 76 On average, induced missense systems lead to a 100% to 1000% increase in the number of missense errors occurring in the course of a single cell generation. The fitness effects of hundreds of thousands to millions of tyrosine substitutions across the proteome appear to be equivalent to an average of 7.5 non-essential gene deletions, suggesting that cells are broadly tolerant of missense errors. The observed fitness reductions in sense codon reassignment systems should not strongly impact the biotechnological applications of cells with partial genetic code reassignments. E. coli codon usage does not appear to limit the choice of targets for sense codon reassignment, as there is only a weak correlation between the estimated number of proteome-wide replacements and system fitness.

CONCLUSIONS

The extent to which the degeneracy of the genetic code can be broken has not been systematically investigated, and the 20 quantitative measurements described here greatly increase the number of specific genetic code modifications reported. The extreme evolutionary conservation of the genetic code is in sharp contrast to the apparent ease with which it can be experimentally manipulated.77, 78 Each E. coli wobble codon evaluated can be reassigned to tyrosine to some extent, with efficiencies between 0.8% and 41%. The apparent ease with which E. coli accommodates very large numbers of proteome-wide reassignments is at odds with individual protein level experiments that suggest that substitutions are largely destabilizing.


Further quantitative evaluations of proteome-wide tyrosine substitutions and concomitant cellular fitness responses are currently underway. In addition to revealing promising, previously unconsidered, sense codon targets for future non-canonical amino acid incorporation using the M. jannaschii orthogonal pair, the evaluation of sense codon reassignment provides a unique data set that contributes to further unraveling the in vivo importance of tRNA abundance, aminoacylation level, elongation factor binding efficiency, tRNA modifications, and codon-anticodon interaction energy in determining translational fidelity. The translation process is highly balanced, and multiple factors influence the extent to which specific codons can be reassigned. Orthogonal pair directed sense codon reassignment provides a platform with which to quantify particular interactions that have been previously difficult to evaluate. Highly efficient reassignment of the Arg CGA codon despite competition with one of the most abundant E. coli tRNAs suggests that codon-anticodon interactions can play an important role in determining which aminoacylated tRNA is utilized to decode a particular mRNA codon. I34/A3 pairs, like that required to decode the CGA codon in E. coli, have been shown to be poorly translated,39 but the quantitative effect of this interaction and the implications for genetic code engineering were not known. The strategy of employing an orthogonal pair to reassign sense codons coupled with a screen that allows quantitative evaluation of a specific amino acid substitution should be generalizable to other orthogonal pairs and other screens. The chemical properties of the side chain of the amino acid incorporated throughout the proteome would be expected to influence the overall fitness costs of substituting one amino acid for another, and this evaluation of tyrosine substitutions represents just one possible orthogonal pair directed study of protein translation. 
Orthogonal pairs derived from multiple natural amino acid aaRSs have been described, but not widely utilized.79-85 Multiple screens that sensitively detect specific amino acid substitutions in order to evaluate background levels of missense incorporation have been described.21, 34 Combining the orthogonal pairs and specific screens should enable the generation of a large set of alternative genetic codes to explore the fitness costs of specific amino acid substitutions at the codon level, providing an experimental perspective on the costs of specific mutations that have largely been inferred from bioinformatics approaches. As the biotechnological applications of systems with expanded genetic codes continue to expand in scope and complexity, a quantitative, holistic understanding of the biological interactions that govern protein translation will foster the rational design of orthogonal translation components that behave in predictable ways. In addition to enabling the incorporation of non-canonical amino acids for genetic code expansion and biotechnological applications, orthogonal pair directed sense codon reassignment is a broadly applicable tool to evaluate the factors affecting translational efficiency in the context of wholly functional translation systems in living cells. The 20 measurements described here represent a beginning, and we expect that the broad application of using orthogonal pairs to incorporate natural amino acids in response to atypical codons will generate an increasingly large data set of programed reassignments to enable increasingly focused quantitative dissection of the factors affecting translational fidelity.


ASSOCIATED CONTENT

Supporting Information. The following file is available free of charge. Methods for vector construction and mutagenesis, measurement of system growth and fluorescence, calculation of sense codon reassignment efficiencies, and calculation of system fitness are provided (PDF).

AUTHOR INFORMATION

Corresponding Author
* To whom correspondence should be addressed. Tel: 1-970-491-4115 (office); Fax: 1-970-491-7369; Email: [email protected]

ORCID
John D. (Nick) Fisk: 0000-0001-9809-6140

Funding Sources
This work was supported by the National Science Foundation [NSF 1057055 to J.D.F.].

Notes
The authors declare no competing financial interests.


REFERENCES

1. Drummond, D. A., and Wilke, C. O. (2008) Mistranslation-induced protein misfolding as a dominant constraint on coding-sequence evolution, Cell 134, 341-352.
2. Lajoie, M. J., Söll, D., and Church, G. M. (2016) Overcoming Challenges in Engineering the Genetic Code, J. Mol. Biol. 428, 1004-1021.
3. Liu, C. C., and Schultz, P. G. (2010) Adding New Chemistries to the Genetic Code, Annu. Rev. Biochem. 79, 413-444.
4. Wang, L., Brock, A., Herberich, B., and Schultz, P. G. (2001) Expanding the genetic code of Escherichia coli, Science 292, 498-500.
5. Biddle, W., Schmitt, M. A., and Fisk, J. D. (2015) Evaluating Sense Codon Reassignment with a Simple Fluorescence Screen, Biochemistry 54, 7355-7364.
6. Agris, P. F. (2004) Decoding the genome: a modified view, Nucleic Acids Res. 32, 223-238.
7. Koonin, E. V., and Novozhilov, A. S. (2009) Origin and Evolution of the Genetic Code: The Universal Enigma, IUBMB Life 61, 99-111.
8. Plotkin, J. B., and Kudla, G. (2011) Synonymous but not the same: the causes and consequences of codon bias, Nat. Rev. Genet. 12, 32-42.
9. Knight, R. D., Freeland, S. J., and Landweber, L. F. (2001) Rewiring the keyboard: evolvability of the genetic code, Nat. Rev. Genet. 2, 49-58.
10. Moura, A., Savageau, M. A., and Alves, R. (2013) Relative Amino Acid Composition Signatures of Organisms and Environments, PLoS One 8, e77319.
11. Kwon, I., Kirshenbaum, K., and Tirrell, D. A. (2003) Breaking the degeneracy of the genetic code, J. Am. Chem. Soc. 125, 7512-7513.
12. Meroueh, M., and Chow, C. S. (1999) Thermodynamics of RNA hairpins containing single internal mismatches, Nucleic Acids Res. 27, 1118-1125.
13. Bohlke, N., and Budisa, N. (2014) Sense codon emancipation for proteome-wide incorporation of noncanonical amino acids: rare isoleucine codon AUA as a target for genetic code expansion, FEMS Microbiol. Lett. 351, 133-144.
14. Ho, J. M., Reynolds, N. M., Rivera, K., Connolly, M., Guo, L.-T., Ling, J., Pappin, D. J., Church, G. M., and Söll, D. (2016) Efficient Reassignment of a Frequent Serine Codon in Wild-Type Escherichia coli, ACS Synth. Biol. 5, 163-171.
15. Kwon, I., and Choi, E. S. (2016) Forced Ambiguity of the Leucine Codons for Multiple-Site-Specific Incorporation of a Noncanonical Amino Acid, PLoS One 11, e0152826.
16. Lee, B. S., Shin, S., Jeon, J. Y., Jang, K.-S., Lee, B. Y., Choi, S., and Yoo, T. H. (2015) Incorporation of Unnatural Amino Acids in Response to the AGG Codon, ACS Chem. Biol. 10, 1648-1653.
17. Mukai, T., Yamaguchi, A., Ohtake, K., Takahashi, M., Hayashi, A., Iraha, F., Kira, S., Yanagisawa, T., Yokoyama, S., Hoshi, H., Kobayashi, T., and Sakamoto, K. (2015) Reassignment of a rare sense codon to a non-canonical amino acid in Escherichia coli, Nucleic Acids Res. 43, 8111-8122.
18. Elliott, T. S., Townsley, F. M., Bianco, A., Ernst, R. J., Sachdeva, A., Elsasser, S. J., Davis, L., Lang, K., Pisa, R., Greiss, S., Lilley, K. S., and Chin, J. W. (2014) Proteome labeling and protein identification in specific tissues and at specific developmental stages in an animal, Nat. Biotechnol. 32, 465-472.
19. Zeng, Y., Wang, W., and Liu, W. S. R. (2014) Towards Reassigning the Rare AGG Codon in Escherichia coli, Chembiochem 15, 1750-1754.
20. Fluitt, A., Pienaar, E., and Viljoen, H. (2007) Ribosome kinetics and aa-tRNA competition determine rate and fidelity of peptide synthesis, Comput. Biol. Chem. 31, 335-346.
21. Kramer, E. B., and Farabaugh, P. J. (2007) The frequency of translational misreading errors in E. coli is largely determined by tRNA competition, RNA 13, 87-96.
22. Dieterich, D. C., Link, A. J., Graumann, J., Tirrell, D. A., and Schuman, E. M. (2007) Selective identification of newly synthesized proteins in mammalian cells using bioorthogonal noncanonical amino acid tagging (BONCAT), Proc. Natl. Acad. Sci. U. S. A. 103, 9482-9487.
23. Beatty, K. E., Fisk, J. D., Smart, B. P., Lu, Y. Y., Szychowski, J., Hangauer, M. J., Baskin, J. M., Bertozzi, C. R., and Tirrell, D. A. (2010) Live-Cell Imaging of Cellular Proteins by a Strain-Promoted Azide-Alkyne Cycloaddition, Chembiochem 11, 2092-2095.
24. Elliott, T. S., Bianco, A., Townsley, F. M., Fried, S. D., and Chin, J. W. (2016) Tagging and Enriching Proteins Enables Cell-Specific Proteomics, Cell Chem. Biol. 23, 805-815.
25. Schimmel, P., and Guo, M. (2009) A tipping point for mistranslation and disease, Nat. Struct. Mol. Biol. 16, 348-349.
26. Kirchner, S., and Ignatova, Z. (2015) Emerging roles of tRNA in adaptive translation, signalling dynamics and disease, Nat. Rev. Genet. 16, 98-112.
27. Bacher, J. M., Waas, W. F., Metzgar, D., de Crécy-Lagard, V., and Schimmel, P. (2007) Genetic code ambiguity confers a selective advantage on Acinetobacter baylyi, J. Bacteriol. 189, 6494-6496.
28. Bacher, J. M., Bull, J. J., and Ellington, A. D. (2003) Evolution of phage with chemically ambiguous proteomes, BMC Evol. Biol. 3, 24.
29. Bacher, J. M., and Ellington, A. D. (2001) Selection and characterization of Escherichia coli variants capable of growth on an otherwise toxic tryptophan analogue, J. Bacteriol. 183, 5414-5425.
30. Doring, V., Mootz, H. D., Nangle, L. A., Hendrickson, T. L., de Crécy-Lagard, V., Schimmel, P., and Marliere, P. (2001) Enlarging the amino acid set of Escherichia coli by infiltration of the valine coding pathway, Science 292, 501-504.
31. Pezo, V., Metzgar, D., Hendrickson, T. L., Waas, W. F., Hazebrouck, S., Doring, V., Marliere, P., Schimmel, P., and de Crécy-Lagard, V. (2004) Artificially ambiguous genetic code confers growth yield advantage, Proc. Natl. Acad. Sci. U. S. A. 101, 8593-8597.
32. Nangle, L. A., de Crécy-Lagard, V., Doring, V., and Schimmel, P. (2002) Genetic code ambiguity - Cell viability related to severity of editing defects in mutant tRNA synthetases, J. Biol. Chem. 277, 45729-45733.
33. Nangle, L. A., Motta, C. M., and Schimmel, P. (2006) Global effects of mistranslation from an editing defect in mammalian cells, Chem. Biol. 13, 1091-1100.
34. Ruan, B. F., Palioura, S., Sabina, J., Marvin-Guy, L., Kochhar, S., LaRossa, R. A., and Söll, D. (2008) Quality control despite mistranslation caused by an ambiguous genetic code, Proc. Natl. Acad. Sci. U. S. A. 105, 16502-16507.
35. Andersson, D. I., Vanverseveld, H. W., Stouthamer, A. H., and Kurland, C. G. (1986) Suboptimal Growth with Hyper-accurate Ribosomes, Arch. Microbiol. 144, 96-101.
36. Gromadski, K. B., Daviter, T., and Rodnina, M. V. (2006) A uniform response to mismatches in codon-anticodon complexes ensures ribosomal fidelity, Mol. Cell 21, 369-377.
37. Wolf, J., Gerber, A. P., and Keller, W. (2002) tadA, an essential tRNA-specific adenosine deaminase from Escherichia coli, EMBO J. 21, 3841-3851.
38. Biddle, W., Schmitt, M. A., and Fisk, J. D. (2016) Modification of orthogonal tRNAs: unexpected consequences for sense codon reassignment, Nucleic Acids Res. 44, 10042-10050.
39. Curran, J. F. (1995) Decoding with the A-I Wobble Pair is Inefficient, Nucleic Acids Res. 23, 683-688.
40. Crick, F. H. C. (1966) Codon-Anticodon Pairing – Wobble Hypothesis, J. Mol. Biol. 19, 548-555.
41. Lim, V. I. (1995) Analysis of Action of the Wobble Adenine on Codon Reading within the Ribosome, J. Mol. Biol. 252, 277-282.
42. Murphy, F. V., and Ramakrishnan, V. (2004) Structure of a purine-purine wobble base pair in the decoding center of the ribosome, Nat. Struct. Mol. Biol. 11, 1251-1252.
43. Aldinger, C. A., Leisinger, A. K., Gaston, K. W., Limbach, P. A., and Igloi, G. L. (2012) The absence of A-to-I editing in the anticodon of plant cytoplasmic tRNA(ACG)(Arg) demands a relaxation of the wobble decoding rules, RNA Biol. 9, 1239-1246.
44. Yokobori, S., Kitamura, A., Grosjean, H., and Bessho, Y. (2013) Life without tRNA(Arg)-adenosine deaminase TadA: evolutionary consequences of decoding the four CGN codons as arginine in Mycoplasmas and other Mollicutes, Nucleic Acids Res. 41, 6531-6543.
45. Beuning, P. J., and Musier-Forsyth, K. (1999) Transfer RNA recognition by aminoacyl-tRNA synthetases, Biopolymers 52, 1-28.
46. Giegé, R., Sissler, M., and Florentz, C. (1998) Universal rules and idiosyncratic features in tRNA identity, Nucleic Acids Res. 26, 5017-5035.
47. Normanly, J., and Abelson, J. (1989) Transfer RNA Identity, Annu. Rev. Biochem. 58, 1029-1049.
48. McClain, W. H., Chen, Y. M., Foss, K., and Schneider, J. (1988) Association Of Transfer-RNA Acceptor Identity with a Helical Irregularity, Science 242, 1681-1684.
49. Fechter, P., Rudinger-Thirion, J., Tukalo, M., and Giegé, R. (2001) Major tyrosine identity determinants in Methanococcus jannaschii and Saccharomyces cerevisiae tRNA(Tyr) conserved but expressed differently, Eur. J. Biochem. 268, 761-767.
50. Young, T. S., Ahmad, I., Yin, J. A., and Schultz, P. G. (2010) An Enhanced System for Unnatural Amino Acid Mutagenesis in E. coli, J. Mol. Biol. 395, 361-374.
51. Wang, L., and Schultz, P. G. (2001) A general approach for the generation of orthogonal tRNAs, Chem. Biol. 8, 883-890.
52. Rezgui, V. A. N., Tyagi, K., Ranjan, N., Konevega, A. L., Mittelstaet, J., Rodnina, M. V., Peter, M., and Pedrioli, P. G. A. (2013) tRNA tK(UUU), tQ(UUG), and tE(UUC) wobble position modifications fine-tune protein translation by promoting ribosome A-site binding, Proc. Natl. Acad. Sci. U. S. A. 110, 12289-12294.
53. Agris, P. F. (2008) Bringing order to translation: the contributions of transfer RNA anticodon-domain modifications, EMBO Rep. 9, 629-635.
54. Yarus, M. (1982) Translational Efficiency of Transfer RNAs: Uses of an Extended Anticodon, Science 218, 646-652.
55. Gromadski, K. B., and Rodnina, M. V. (2004) Kinetic determinants of high-fidelity tRNA discrimination on the ribosome, Mol. Cell 13, 191-200.
56. LaRiviere, F. J., Wolfson, A. D., and Uhlenbeck, O. C. (2001) Uniform binding of aminoacyl-tRNAs to elongation factor Tu by thermodynamic compensation, Science 294, 165-168.
57. Dale, T., Sanderson, L. E., and Uhlenbeck, O. C. (2004) The affinity of elongation factor Tu for an aminoacyl-tRNA is modulated by the esterified amino acid, Biochemistry 43, 6159-6166.
58. Phelps, S. S., Jerinic, O., and Joseph, S. (2002) Universally conserved interactions between the ribosome and the anticodon stem-loop of a site tRNA important for translocation, Mol. Cell 10, 799-807.
59. Olejniczak, M., Dale, T., Fahlman, R. P., and Uhlenbeck, O. C. (2005) Idiosyncratic tuning of tRNAs to achieve uniform ribosome binding, Nat. Struct. Mol. Biol. 12, 788-793.
60. Dong, H. J., Nilsson, L., and Kurland, C. G. (1996) Co-variation of tRNA abundance and codon usage in Escherichia coli at different growth rates, J. Mol. Biol. 260, 649-663.
61. Bremer, H., and Dennis, P. P. (1999) Modulation of Chemical Composition and Other Parameters of the Cell by Growth Rate, In Escherichia coli and Salmonella cellular and molecular biology (Neidhardt, F. C., and Curtiss, R., Eds.), ASM Press, Washington, D.C.
62. Suzuki, T., and Miyauchi, K. (2010) Discovery and characterization of tRNA(Ile) lysidine synthetase (TilS), FEBS Lett. 584, 272-277.
63. Soma, A., Ikeuchi, Y., Kanemasa, S., Kobayashi, K., Ogasawara, N., Ote, T., Kato, J., Watanabe, K., Sekine, Y., and Suzuki, T. (2003) An RNA-modifying enzyme that governs both the codon and amino acid specificities of isoleucine tRNA, Mol. Cell 12, 689-698.
64. Lagerkvist, U. (1978) “2 out of 3” – Alternative Method for Codon Reading, Proc. Natl. Acad. Sci. U. S. A. 75, 1759-1762.
65. Lehmann, J., and Libchaber, A. (2008) Degeneracy of the genetic code and stability of the base pair at the second position of the anticodon, RNA 14, 1264-1269.
66. Bloom, J. D., Silberg, J. J., Wilke, C. O., Drummond, D. A., Adami, C., and Arnold, F. H. (2005) Thermodynamic prediction of protein neutrality, Proc. Natl. Acad. Sci. U. S. A. 102, 606-611.
67. Drummond, D. A., Bloom, J. D., Adami, C., Wilke, C. O., and Arnold, F. H. (2005) Why highly expressed proteins evolve slowly, Proc. Natl. Acad. Sci. U. S. A. 102, 14338-14343.
68. Guo, H. H., Choe, J., and Loeb, L. A. (2004) Protein tolerance to random amino acid change, Proc. Natl. Acad. Sci. U. S. A. 101, 9205-9210.
69. Wiser, M. J., and Lenski, R. E. (2015) A Comparison of Methods to Measure Fitness in Escherichia coli, PLoS One 10, e0126210.
70. Elena, S. F., and Lenski, R. E. (1997) Test of synergistic interactions among deleterious mutations in bacteria, Nature 390, 395-398.
71. Elena, S. F., Ekunwe, L., Hajela, N., Oden, S. A., and Lenski, R. E. (1998) Distribution of fitness effects caused by random insertion mutations in Escherichia coli, Genetica 102-103, 349-358.
72. Milo, R. (2013) What is the total number of protein molecules per cell volume? A call to rethink some published values, Bioessays 35, 1050-1055.
73. Brocchieri, L., and Karlin, S. (2005) Protein length in eukaryotic and prokaryotic proteomes, Nucleic Acids Res. 33, 3390-3400.
74. Nakamura, Y., Gojobori, T., and Ikemura, T. (2000) Codon usage tabulated from international DNA sequence databases: status for the year 2000, Nucleic Acids Res. 28, 292-292.
75. Li, G. W., Burkhardt, D., Gross, C., and Weissman, J. S. (2014) Quantifying Absolute Protein Synthesis Rates Reveals Principles Underlying Allocation of Cellular Resources, Cell 157, 624-635.
76. Parker, J. (1989) Errors and Alternatives in Reading the Universal Genetic Code, Microbiol. Rev. 53, 273-298.
77. Bacher, J. M., Hughes, R. A., Wong, J. T. F., and Ellington, A. D. (2004) Evolving new genetic codes, Trends Ecol. Evolut. 19, 69-75.
78. Hughes, R. A., and Ellington, A. D. (2005) Mistakes in translation don't translate into termination, Proc. Natl. Acad. Sci. U. S. A. 102, 1273-1274.
79. Anderson, J. C., and Schultz, P. G. (2003) Adaptation of an orthogonal archaeal leucyl-tRNA and synthetase pair for four-base, amber, and opal suppression, Biochemistry 42, 9598-9608.
80. Chatterjee, A., Xiao, H., and Schultz, P. G. (2012) Evolution of multiple, mutually orthogonal prolyl-tRNA synthetase/tRNA pairs for unnatural amino acid mutagenesis in Escherichia coli, Proc. Natl. Acad. Sci. U. S. A. 109, 14841-14846.
81. Englert, M., Vargas-Rodriguez, O., Reynolds, N. M., Wang, Y.-S., Söll, D., and Umehara, T. (2017) A genomically modified Escherichia coli strain carrying an orthogonal E. coli histidyl-tRNA synthetase•tRNAHis pair, Biochim. Biophys. Acta, Gen. Subj. 1861, 3009-3015.
82. Furter, R. (1998) Expansion of the genetic code: Site-directed p-fluoro-phenylalanine incorporation in Escherichia coli, Protein Sci. 7, 419-426.
83. Hughes, R. A., and Ellington, A. D. (2010) Rational design of an orthogonal tryptophanyl nonsense suppressor tRNA, Nucleic Acids Res. 38, 6813-6830.
84. Liu, D. R., Magliery, T. J., Pasternak, M., and Schultz, P. G. (1997) Engineering a tRNA and aminoacyl-tRNA synthetase for the site-specific incorporation of unnatural amino acids into proteins in vivo, Proc. Natl. Acad. Sci. U. S. A. 94, 10092-10097.
85. Santoro, S. W., Anderson, J. C., Lakshman, V., and Schultz, P. G. (2003) An archaebacteria-derived glutamyl-tRNA synthetase and tRNA pair for unnatural amino acid mutagenesis of proteins in Escherichia coli, Nucleic Acids Res. 31, 6700-6709.
