
Amino Acid Misincorporation Propensities Revealed Through Systematic Amino Acid Starvation

H. Edward Wong, Chung-Jr Huang, and Zhongqi Zhang

Biochemistry, Just Accepted Manuscript • DOI: 10.1021/acs.biochem.8b00976 • Publication Date (Web): 12 Nov 2018


H. Edward Wong (a), Chung-Jr Huang (a,b), Zhongqi Zhang (a,*)

(a) Process Development, Amgen, Inc., 1 Amgen Center Drive, Thousand Oaks, California 91320

(b) Present address: Upstream Process Development and Engineering, Biologics Process Development & Clinical Manufacturing, Merck & Co., Inc., 2000 Galloping Hill Road, Kenilworth, NJ 07033, USA.

(*) To whom correspondence should be addressed: Zhongqi Zhang; Process Development, Amgen, Inc., Thousand Oaks, California 91320; [email protected]; Tel. (805) 447-7783.

Abbreviations

CHO, Chinese hamster ovary; tRNA, transfer ribonucleic acid; mRNA, messenger ribonucleic acid; aaRS, aminoacyl-tRNA synthetase; mAb, monoclonal antibody; IgG, immunoglobulin gamma; DTT, dithiothreitol; LC, liquid chromatography; UPLC, ultraperformance liquid chromatography; UV, ultraviolet; MS, mass spectrometry; MS/MS, tandem mass spectrometry.

ABSTRACT

Elevated amino acid misincorporation levels during protein translation can cause disease and adversely impact biopharmaceutical product quality. Our previous work, along with that of others, identified numerous low-level unintended sequence variants. However, because analytical detection efficiency is limited, we believed that these observations represented only a fraction of the biologically relevant outcomes. Since amino acid misincorporation can be exacerbated by amino acid starvation, we reasoned that a more comprehensive set of sequence variants could be derived through systematic starvation. Our goals for this study were therefore 1) to systematically characterize misincorporation patterns under amino acid starvation, and 2) to elucidate the major misincorporation mechanisms and propensities for cultured mammalian cells. To the best of our knowledge, this is the first study to use controlled systematic starvation to maximize the observation of unique sequence variants and thereby provide a more holistic perspective of amino acid misincorporation. Our findings bridge the two prevailing lines of research and propose that both base mismatches during codon recognition (especially G/U and wobble mismatches) and misacylation are common and major amino acid misincorporation mechanisms. This proposal is also supported by the observation of mechanistic additivity between the base mismatch and misacylation mechanisms. In addition, we observed significant overlap in misincorporation mechanisms and propensities among cell lines and organisms. Lastly, we explored factors that can lead to codon-associated misincorporation behavior.

Keywords: Amino acid misincorporation, sequence variants, codon-anticodon mispairing, wobble, misacylation, base mismatching, CHO, E. coli, starvation, mechanistic additivity, codon-associated misincorporation behavior

INTRODUCTION

Proteins are functional biomacromolecules that serve as structural, enzymatic, signaling, and transport units in cells. Many metabolic, biosynthetic, and cellular processes can influence protein product attributes during translation and post-translational modification (e.g., sequence variation, glycosylation, oxidation, and proteolysis). For both recombinant and native protein biosynthesis, the resulting protein product attributes represent a historical record of cellular events. Of particular interest are sequence variants caused by amino acid misincorporation, which have broad implications [1]. Believed to be universal, amino acid misincorporation is the inherent and unintended replacement of an amino acid during translation. In biological systems, high-level amino acid misincorporation can lead to the loss of proteostasis and to disease through the compounded accumulation of errors [2,3]. In a biopharmaceutical context, since product attributes define product function, elevated sequence variant levels can cause product heterogeneity, instability, potency change, structural perturbation, aggregation, and potentially immunogenicity [4,5]. As a result, the cell line selection process is critical to minimizing sequence variants as a measure to proactively prevent adverse patient outcomes.

Each of the amino acid misincorporation mechanisms, occurring after transcription, is driven by competition between the cognate and non-cognate substrates [6-12]. First, an aminoacyl-tRNA synthetase (aaRS) can misrecognize a non-cognate amino acid, which leads to misacylation [13,14]. Additionally, the failure of an aaRS to correctly recognize its cognate tRNA, based on nucleotide sequence and post-transcriptional modifications, results in aminoacylation of a non-cognate tRNA [15]. Second, non-cognate aminoacyl-tRNA can outcompete cognate aminoacyl-tRNA for ribosomal entry to yield faulty codon recognition through base mismatching [12].

While substrate competition provides a driving force for mistranslation, structure and shape complementarity within the aaRS active site and the ribosome decoding center often facilitate error discrimination [16]. The ribosome decoding center sterically prevents non-canonical base-pairing geometries at the first and second codon positions [16]. For aaRSs, fidelity is modulated through active-site specificity and, in some cases, pre- and/or post-transfer editing mechanisms [17,18]. It follows that when these error-discriminating mechanisms falter, mistranslation occurs. Normally, amino acid misincorporation occurs at a baseline level of 0.0001% to 0.1% (10⁻⁶ to 10⁻³) in mammalian cells [1,19-21]. However, amino acid starvation, fast cell growth, overproduction, and oxidative stress, which are physiological conditions frequently observed both in disease states and during biopharmaceutical production, can each exacerbate mistranslation by orders of magnitude [21-26]. While elevated mistranslation levels can lead to disease and product quality issues [2-5], recent research has suggested that amino acid misincorporation may not be entirely detrimental. Growing evidence indicates that amino acid misincorporation belongs to a set of conserved adaptive mechanisms that support cell survival under stress, and is therefore an essential product of protein translation [22,27-29].

To date, fairly extensive amino acid misincorporation patterns have been reported [19-21,30]. We hypothesized, however, that these outcomes represent only a fraction of all possible biologically relevant misincorporation products when compared to the number and diversity that can be derived under stress, specifically amino acid starvation. Therefore, in this effort, we employed controlled systematic starvation to maximize the observation of biologically relevant misincorporation products and sought to accomplish two primary objectives: 1) to systematically characterize amino acid misincorporation patterns and propensities, and 2) to elucidate the major misincorporation mechanisms for cultured mammalian cells.

MATERIALS AND METHODS

Materials. Three Chinese hamster ovary (CHO) cell lines that produce different IgG2 mAb molecules were chosen for this study and denoted X, Y, and Z. Each cell line is a high producer derived from serum-free adapted DXB-11 cells (CS9) [31]. Unless otherwise specified, cells were grown at 36 °C under 5% CO2 and shaken at 120 rpm in an automatic CO2 incubator (Thermo Fisher Scientific, Waltham, MA). Media A is a chemically defined, serum-free growth medium that was used as the control and from which media variants were derived. It contains each of the twenty canonical amino acids at various concentrations to support growth, glucose as the primary carbon source, vitamins, trace elements, and cofactors. During culturing, pH was not controlled.

Control cell culture. CHO cell stocks were thawed and seeded at 5 × 10⁵ cells/mL in proprietary seed train media, derived from soy hydrolysate, and grown for 3 days. Cells were subcultured every 3 days until a sufficient total viable cell count was reached. The cells were then centrifuged at 1000 × g for 5 min at 25 °C. The spent media was discarded and replaced with Media A, and the cells were grown for 2 days. Cells were then collected by centrifugation at 1000 × g for 5 min at 25 °C, and the supernatant was discarded. On day 0 of the study, cell pellets were resuspended to a target viable cell density of 1.0 × 10⁷ cells/mL (with > 90% viability) in 50 mL of Media A, the control media. Cultures were grown in 250 mL shake flasks. Product from day 2 was harvested and characterized.

Starvation cell culture. Cells destined for starvation were treated identically to the control cell culture throughout subculturing. On day 0 of the study, cells were pelleted (1000 × g for 5 min at 25 °C) and then resuspended to a target viable cell density of 1.0 × 10⁷ cells/mL (with > 90% viability) in a set of Media A variants. To systematically starve each amino acid, twenty variants of Media A were created, each designed to induce the starvation of a single amino acid. For each Media A variant, the starting concentration of the amino acid targeted for starvation was decreased (compared to control Media A), while the starting concentrations of all other amino acids were maintained at their control concentrations. For the corresponding starvation cultures, the starting concentrations were set as follows: arginine, 0.3 mM; asparagine, 0.5 mM; cysteine, 0.4 mM; glutamine, 0.7 mM; histidine, 0.4 mM; isoleucine, 0.7 mM; leucine, 1.3 mM; lysine, 1.1 mM; methionine, 0.4 mM; phenylalanine, 0.5 mM; proline, 0.8 mM; threonine, 0.9 mM; tryptophan, 0.2 mM; tyrosine, 0.6 mM; and valine, 0.9 mM. The starting concentrations of alanine, aspartic acid, glutamic acid, glycine, and serine were each set to 0 mM. These day 0 starting concentrations for the amino acid targeted for starvation were determined iteratively through a set of supporting studies (data not shown) so that starvation would begin on day 1, leading to a decrease in growth rate and titer (at day 1) and a subsequent increase in misincorporation level that would be observed in samples taken on day 2. A target concentration threshold of approximately < 0.1 mM at day 1 of culturing was used as a guideline to ensure that starvation would begin at approximately the same time, enabling comparison between misincorporation outcomes derived under different amino acid starvation conditions [25,26]. Sampling was performed on day 2 to allow for a one-day starvation period. At sampling, cell viability remained above 70%, an acceptable final viability for production.
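The media design described above can be sketched in code as a set of twenty single-amino-acid variants of one control recipe. The starvation starting concentrations below are those stated in the text; the control composition of Media A is not given, so the uniform control value and all names (`CONTROL_MEDIA_A`, `make_variant`) are hypothetical placeholders used only for illustration.

```python
# Day 0 starting concentrations (mM) of the amino acid targeted for starvation,
# as listed in the text.
STARVATION_START_MM = {
    "arginine": 0.3, "asparagine": 0.5, "cysteine": 0.4, "glutamine": 0.7,
    "histidine": 0.4, "isoleucine": 0.7, "leucine": 1.3, "lysine": 1.1,
    "methionine": 0.4, "phenylalanine": 0.5, "proline": 0.8, "threonine": 0.9,
    "tryptophan": 0.2, "tyrosine": 0.6, "valine": 0.9,
    "alanine": 0.0, "aspartic acid": 0.0, "glutamic acid": 0.0,
    "glycine": 0.0, "serine": 0.0,
}

# HYPOTHETICAL control recipe: the real Media A concentrations are not reported,
# so every amino acid is given the same placeholder value.
PLACEHOLDER_CONTROL_MM = 5.0
CONTROL_MEDIA_A = {aa: PLACEHOLDER_CONTROL_MM for aa in STARVATION_START_MM}


def make_variant(control, starved):
    """Return a copy of the control recipe with only `starved` lowered."""
    variant = dict(control)
    variant[starved] = STARVATION_START_MM[starved]
    return variant


# One variant per amino acid; all other amino acids stay at control levels.
variants = {aa: make_variant(CONTROL_MEDIA_A, aa) for aa in STARVATION_START_MM}
```

Keeping each variant as a full copy of the control recipe makes the single-variable nature of the design explicit: any variant differs from the control in exactly one entry.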

Antibody purification. Each mAb was purified using an immobilized Protein A affinity column (Applied Biosystems, Foster City, CA) according to the manufacturer's recommended procedure. The mAbs were eluted under acidic conditions and detected by UV absorption at 280 nm.

Proteolytic digestion. The quick proteolytic digestion method described previously was performed [19,32]. Purified mAb (at 0.5 to 1.0 mg/mL) was denatured and reduced in a Tris-buffered (pH 7.5) solution of 7.2 M guanidine hydrochloride and 6 mM dithiothreitol (DTT) for 30 min at 37 °C, followed by alkylation with 14 mM iodoacetamide for 20 min at 25 °C. The alkylation reaction was quenched by stoichiometric DTT addition. The sample was then buffer-exchanged into 0.1 M Tris buffer (pH 7.5) by ultrafiltration using a Vivaspin 500 with a 10 kDa molecular-weight cutoff membrane (Sartorius Stedim Biotech), following the manufacturer's recommended procedure. The protein retentate was digested by incubation with trypsin (Roche) for 1 h at 37 °C at a substrate:enzyme mass ratio of 20:1. Tryptic digestion was stopped by adjusting the pH to approximately 5 with acetic acid.

LC-MS/MS analysis. Tryptic digests of the IgG2 mAbs X, Y, and Z were analyzed on a Thermo Scientific (San Jose, CA) LTQ-Orbitrap Elite high-resolution mass spectrometer connected downstream of either an Agilent (Santa Clara, CA) 1290 Infinity LC system or an Agilent 1200 LC system. Peptides were separated using a Waters (Milford, MA) Acquity UPLC CSH™ C18 reversed-phase column (1.7 μm particle, 150 mm × 2.1 mm) at 65 °C, followed by electrospray ionization. Peptides were eluted with an acetonitrile gradient from 0.5% to 40% over 90 min at a flow rate of 0.2 mL/min, with 0.1% formic acid in the mobile phase. Following elution, the column was washed with a 40% to 99% acetonitrile gradient at a flow rate of 0.2 mL/min, with 0.1% formic acid. Approximately 20 μg of protein digest was injected for each analysis. The LTQ-Orbitrap Elite instrument was set up to collect one full-scan spectrum at a resolution of 120,000, followed by five data-dependent MS/MS scans of the most abundant ions in the linear trap. MS/MS spectra were collected with dynamic exclusion as well as automated precursor-ion exclusion, using collision-induced dissociation with 35% normalized collision energy. Precursor-ion exclusion was employed to maximize the acquisition of unique, high-quality MS/MS spectra [33].

Data analysis. LC-MS/MS data analysis was performed using MassAnalyzer (available in Biopharma Finder™ from Thermo Scientific) according to Zhang et al. [19,34]. Statistical analysis was performed using JMP.

Amino acid analysis. Amino acids were derivatized with 6-aminoquinolyl-N-hydroxysuccinimidyl carbamate. The derivatized amino acids were separated by reversed-phase ultra-performance liquid chromatography using a Waters Acquity UPLC BEH™ C18 column (1.7 μm, 2.1 mm × 100 mm) and detected by UV absorption at 260 nm. Amino acid reference standards were used for quantitation.

Distinguishing amino acid substitutions from post-translational modifications. The method described by Zhang et al., along with the published table of common modifications, was used to distinguish sequence variants from chemical modifications [1,19]. Additionally, in this study, most amino acid misincorporation products were readily differentiated from false positives based on the elevated levels resulting from targeted starvation. Only amino acid misincorporation products observed above 0.01% (10⁻⁴) were reported.
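The reporting rule stated above amounts to a simple filter on measured levels. The following is a minimal sketch of that rule only, not of the MassAnalyzer workflow; the function names and the example levels are hypothetical.

```python
# From the text: only products observed above 0.01% (a fraction of 1e-4) are reported.
REPORTING_THRESHOLD_PERCENT = 0.01


def percent_to_fraction(percent):
    """Convert a level in percent to a fraction (0.01% -> 1e-4)."""
    return percent / 100.0


def reportable(levels_percent, threshold=REPORTING_THRESHOLD_PERCENT):
    """Keep only the products whose level is strictly above the threshold."""
    return {name: level for name, level in levels_percent.items() if level > threshold}


# HYPOTHETICAL levels (in percent) for three unnamed candidate variants.
example_levels = {"variant_1": 0.004, "variant_2": 0.058, "variant_3": 0.010}
reported = reportable(example_levels)
```

In this sketch a candidate sitting exactly at the threshold is excluded, which is one reading of "above 0.01%".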

Amino acid misincorporation nomenclature and accounting. To discuss amino acid misincorporation holistically, this section describes the nomenclature and the means of accounting for the major misincorporation mechanisms associated with a given codon. Herein, tRNA recognition errors are discussed primarily in terms of the interaction between the codon and anticodon, in the form of the base mismatch type [19,35]. There are twelve possible general base mismatch types that can occur between the codon and anticodon (A/A, A/C, A/G, C/A, C/C, C/U, G/A, G/G, G/U, U/C, U/G, and U/U, represented as N_codon/N_anticodon) [36,37]. Permutations involving post-transcriptional modifications to tRNA and mRNA were not explicitly considered within the scope of this work. The base position in the codon sequence is used to describe the location of base mismatches. The notation Ser → Asn denotes the unintended substitution of serine by asparagine. Ser(AGC) → Asn describes asparagine misincorporation at serine AGC codons. The notation Ser → Asn (G/U) denotes the serine-to-asparagine substitution occurring by a G/U base mismatch type.
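The twelve mismatch types follow from taking all sixteen codon/anticodon base pairings and removing the four Watson-Crick pairs. The sketch below enumerates them and parses the notation described above, written with an arrow between cognate amino acid and product. The regular expression and the `parse_notation` helper are illustrative assumptions, not part of the published method.

```python
import re
from itertools import product

BASES = "ACGU"
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}

# Every codon-base/anticodon-base pairing that is not a Watson-Crick pair.
MISMATCH_TYPES = [
    f"{codon_base}/{anticodon_base}"
    for codon_base, anticodon_base in product(BASES, repeat=2)
    if (codon_base, anticodon_base) not in WATSON_CRICK
]

# Hypothetical pattern for strings such as "Ser(AGC) → Asn" or "Ser → Asn (G/U)".
NOTATION = re.compile(
    r"^(?P<cognate>[A-Z][a-z]{2})"               # three-letter cognate amino acid
    r"(?:\((?P<codon>[ACGU]{3})\))?"             # optional codon, e.g. (AGC)
    r"\s*→\s*"
    r"(?P<product>[A-Z][a-z]{2})"                # three-letter misincorporated product
    r"(?:\s*\((?P<mismatch>[ACGU]/[ACGU])\))?$"  # optional mismatch type, e.g. (G/U)
)


def parse_notation(text):
    """Split a misincorporation label into its named parts (None if absent)."""
    match = NOTATION.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized notation: {text!r}")
    return match.groupdict()
```

For example, `parse_notation("Ser(AGC) → Asn")` yields the cognate `Ser`, the codon `AGC`, and the product `Asn`, with no mismatch type recorded.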

Since the aim of this manuscript is to provide a general context for misincorporation mechanisms, the discussion focuses on misincorporation outcomes and not on misincorporations occurring at specific codons within a protein sequence. For example, Ser → Asn misincorporation can occur by a second-position G/U mismatch. There are two possible routes, or outcomes, involving a G/U mismatch for achieving the Ser → Asn misincorporation, namely Ser(AGC) → Asn and Ser(AGU) → Asn. Although multiple AGC and AGU codons in the protein sequence may be involved in the Ser → Asn misincorporation, the successful identification of Ser(AGC) → Asn and Ser(AGU) → Asn, for example, would each count as one outcome. At the codon level, there are a maximum of 16 possible misincorporation outcomes for each base mismatch type at each of the first two codon positions (4 possible bases for each of the remaining two codon positions). For a given base mismatch type, therefore, a total of 32 base mismatch outcomes are possible at the first two codon positions. As there are 61 amino acid-encoding codons, each of which can be replaced by 19 different amino acids, there are a total of 61 × 19 = 1159 possible misincorporation outcomes.
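The counts in this paragraph can be checked against the standard genetic code. The sketch below assumes the standard code (written in the conventional U, C, A, G ordering) and uses illustrative names; the upper bound of 16 per position includes stop codons, consistent with "a maximum of".

```python
BASES = "UCAG"
# Standard genetic code, ordered by first, second, then third base (U, C, A, G);
# "*" marks the three stop codons.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    first + second + third: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}

# 64 codons minus 3 stop codons leaves the amino acid-encoding codons.
sense_codons = [codon for codon, aa in CODON_TABLE.items() if aa != "*"]
# Each sense codon can be misread as any of the 19 other amino acids.
total_outcomes = len(sense_codons) * 19


def codons_with_base(base, position):
    """Codons carrying `base` at a 1-based codon position (stop codons included)."""
    return [codon for codon in CODON_TABLE if codon[position - 1] == base]


# A G/U mismatch needs a codon G at the mismatched position; the other two
# positions are free, giving 4 x 4 = 16 codons per position.
per_position = len(codons_with_base("G", 2))
first_two_positions = len(codons_with_base("G", 1)) + len(codons_with_base("G", 2))
```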

Since multiple base mismatches within a codon-anticodon pair are considered highly unfavorable, misincorporations that require two or more base mismatches were assumed to arise by misacylation. Although misacylation occurs independently of the codon-anticodon pairing interaction, it can appear to arise from codon misrecognition in the form of a base mismatch. A literature survey was used to attribute misacylation to specific codons. For example, Tyr → Phe may appear to occur by a tRNA misrecognition error involving an A/A mismatch at the second position in the codon-anticodon pair. Tyr → Phe, however, has been shown to occur by misacylation [38]. For accounting purposes, Tyr → Phe misacylation is assignable to both tyrosine codons, UAU and UAC, as misacylation occurs upstream and independently of codon recognition. Where both tRNA misrecognition (at the first or second codon position) and misacylation were suspected to produce the same misincorporation outcome, only the major base mismatch types (A/C, G/U, U/C, U/G, and U/U) were attributed to the associated codons along with misacylation.
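The first accounting rule (two or more required base mismatches implies assumed misacylation) can be illustrated with a first-pass screen over the standard genetic code. This is a sketch under stated assumptions, not the authors' assignment procedure: it treats the anticodon base as the plain complement of the product codon base (tRNA modifications ignored, as in the text), labels any third-position difference simply as "wobble", and does not apply the literature survey or the major-mismatch filter. All function names are hypothetical.

```python
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    first + second + third: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def single_mismatch_routes(codon, product_aa):
    """List (product codon, codon position, mismatch label) for every codon of
    `product_aa` (single-letter code) that differs from `codon` at one base."""
    routes = []
    for target, aa in CODON_TABLE.items():
        if aa != product_aa:
            continue
        differing = [i for i in range(3) if codon[i] != target[i]]
        if len(differing) != 1:
            continue
        pos = differing[0]
        if pos == 2:
            label = "wobble"  # third-position differences are not resolved further
        else:
            # Codon base paired against the anticodon base that reads `target`.
            label = f"{codon[pos]}/{COMPLEMENT[target[pos]]}"
        routes.append((target, pos + 1, label))
    return routes


def first_pass_mechanism(codon, product_aa):
    """Return single-mismatch routes, or flag the outcome as assumed misacylation."""
    routes = single_mismatch_routes(codon, product_aa)
    return routes if routes else "assumed misacylation"
```

Under these assumptions, Ser(AGC) → Asn resolves to a second-position G/U route, and Pro(CCA) → Asp has no single-mismatch route and is flagged as assumed misacylation. Tyr(UAU) → Phe is returned as an apparent second-position A/A route, which is the case the text reassigns to misacylation on the basis of the literature.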


RESULTS

To date, Zhang et al. [19] and others have characterized a number of amino acid misincorporation products occurring at baseline levels in recombinant and native proteins expressed in various organisms [1]. Two hallmark mammalian cell culture studies demonstrated that amino acid starvation exacerbates amino acid misincorporation [25,26] by shifting stoichiometry in favor of sequence variant formation. Therefore, for this study, controlled systematic starvation was used to promote the formation of unique sequence variants for characterization purposes.

Amino acid misincorporation patterns derived from controlled systematic starvation. Each of the three cell lines used in this study exhibited sensitivity to the starvation of the majority of amino acids (arginine, cysteine, histidine, isoleucine, leucine, lysine, methionine, phenylalanine, proline, threonine, tryptophan, tyrosine, and valine). This sensitivity manifested as amino acid misincorporation levels significantly elevated above their typical baseline range (10⁻⁶ to 10⁻³, or 0.0001% to 0.1%). In each case, elevated misincorporation corresponded with the depletion of the aforementioned amino acids from the media to concentrations below approximately 0.2 mM (see Supporting Information Figure S1 for the valine concentration profile for the cell line X culture). In total, 45 (amino acid-to-amino acid) misincorporation products were observed; they are summarized in Table 1 (and in cell line-specific detail in Supporting Information Table S1). In each case, elevated amino acid substitution was specific to the starved amino acid. Furthermore, for the majority of amino acids, targeted starvation resulted in multiple substitution products. Supporting Information Figure S2 shows representative peptide MS/MS spectra for the identification of these sequence variants.

Average elevated misincorporation levels ranged from 0.01% up to 1.3% for a given codon, with a median of approximately 0.10% (Table 1). Twelve of the 45 major misincorporation products that were observed under starvation had also been identified previously at baseline levels (Arg → Lys, Asn → Lys, Asn → Ser, Leu → Phe, Met → Ile, Met → Thr, Phe → Leu, Thr → Ser, Tyr → Phe, Tyr → His, Val → Ala, and Val → Ile) [1,19]. Since many of the commonly observed baseline misincorporation products can be exacerbated by starvation to yield major misincorporation products, the baseline pattern appears to reflect the natural mistranslation propensity of the corresponding encoded amino acids. This observation also suggests that baseline misincorporation is predictive of the major misincorporation products that would be observed under an appropriate level of starvation, when the encoded amino acid is deficient.

Table 1. Amino acid misincorporation pattern derived from controlled systematic amino acid starvation. Only amino acid misincorporation products observed above 0.01% were reported.

Starved amino acid | AAM* | Codon (Outcomes) | Proposed mechanism | Weighted Average (%)†
Arg | R → H | CGC | G/U mismatch | 0.578
 | R → K | AGA; AGG | G/U mismatch | 0.114; 0.759
 | R → M | AGG | Misacylation | 0.053
 | R → Q | CGA; CGG | G/U mismatch | 0.120; 0.656
 | R → S | AGA | Wobble mismatch | 0.178
Asn | N → K | AAC; AAU | Wobble mismatch | 0.102; 0.088
 | N → S | AAC; AAU | A/C mismatch | 0.151; 0.125
Cys | C → S | UGC | U/U mismatch, possible misacylation | 0.110
 | C → W | UGU | Wobble mismatch | 0.111
 | C → Y | UGC; UGU | G/U mismatch | 1.183; 0.181
Gln | Q → H | CAA; CAG | Wobble mismatch | 0.040; 0.113
His | H → Q | CAC; CAU | Wobble mismatch | 0.394; 0.197
Ile | I → M | AUA; AUC; AUU | Wobble mismatch, Misacylation | 0.327; 0.264; 0.070
 | I → N | AUC | U/G mismatch | 0.027
 | I → T | AUA; AUC; AUU | U/G mismatch, Misacylation | 0.022; 0.054; 0.031
 | I → V | AUA; AUC; AUU | Misacylation, A/C mismatch | 0.832; 0.554; 0.644
Leu | L → F | CUC | C/A mismatch | 0.094
 | L → H | CUC | U/U mismatch | 0.037
 | L → M | CUC; CUG | Misacylation | 0.049; 0.014
 | L → T | CUA; CUC; CUG; UUA | Misacylation | 0.058; 0.039; 0.022; 0.035
 | L → V | CUA; CUC; CUG; CUU; UUA | Misacylation | 0.087; 0.091; 0.062; 0.069; 0.035
Lys | K → H | AAA; AAG | Misacylation | 0.170; 0.077
 | K → M | AAA; AAG | Misacylation | 0.135; 0.060
 | K → N | AAA; AAG | Wobble mismatch | 0.045; 0.097
Met | M → I‡ | AUG | Wobble mismatch | 0.425
 | M → T | AUG | U/G mismatch | 0.063
Phe | F → H | UUC; UUU | Misacylation | 0.037; 0.034
 | F → L | UUC; UUU | Misacylation, Wobble mismatch, U/G mismatch | 0.435; 0.629
 | F → M | UUC | Misacylation | 0.035
 | F → S | UUC; UUU | U/G mismatch | 0.058; 0.051
 | F → Y | UUC | Misacylation, U/U mismatch | 0.203
Pro | P → A | CCA; CCC; CCG; CCU | Misacylation | 0.350; 0.227; 0.388; 0.130
 | P → D | CCA; CCC; CCG; CCU | Misacylation | 0.415; 0.115; 0.083; 0.560
Thr | T → S | ACA; ACC; ACG; ACU | Misacylation | 0.533; 0.932; 0.517; 0.743
Trp | W → F | UGG | Misacylation | 0.040
 | W → H | UGG | Misacylation | 0.085
 | W → R | UGG | U/G mismatch (possible U/U mismatch) | 0.053
 | W → Y | UGG | Misacylation | 0.021
Tyr | Y → F | UAC; UAU | Misacylation | 0.092; 0.107
 | Y → H | UAC; UAU | Misacylation, U/G mismatch | 0.038; 0.010
Val | V → A | GUC; GUG; GUU | Misacylation, U/G mismatch | 0.160; 0.051; 0.064
 | V → F | GUC | G/A mismatch | 0.026
 | V → I‡ | GUA; GUC; GUG; GUU | G/U mismatch, Misacylation | 0.665; 1.235; 0.589; 1.168
 | V → M | GUG | Misacylation, G/U mismatch | 0.042
 | V → T | GUC; GUG; GUU | Misacylation | 0.054; 0.069; 0.045

* AAM denotes amino acid misincorporation. Single-letter annotation is used to represent amino acids, in the form cognate amino acid → misincorporation product.
† The Weighted Average is the average misincorporation abundance weighted by the number of observations in each cell line.
‡ Although both I and L are possible products, the most likely product is reported.
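The weighted average defined in the table footnote can be written out explicitly. The sketch below is one plausible reading of that footnote (per-cell-line levels weighted by per-cell-line observation counts); the function name and the example numbers are hypothetical and are not taken from Table 1 or Table S1.

```python
def weighted_average(levels_percent, n_observations):
    """Average of per-cell-line levels, weighted by observations per cell line."""
    if len(levels_percent) != len(n_observations):
        raise ValueError("levels and observation counts must have the same length")
    total = sum(n_observations)
    if total == 0:
        raise ValueError("at least one observation is required")
    weighted_sum = sum(level * n for level, n in zip(levels_percent, n_observations))
    return weighted_sum / total


# HYPOTHETICAL values for cell lines X, Y, and Z at one codon:
# levels in percent, and the number of observations behind each level.
example_levels = [0.10, 0.30, 0.20]
example_counts = [1, 3, 0]
example_result = weighted_average(example_levels, example_counts)
```

A cell line with zero observations contributes nothing to the result, so a product seen in only one cell line simply takes that cell line's level.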


Starvation-mediated misincorporation is dependent on the intracellular amino acid supply and cellular metabolism. For the three cell lines, several amino acids, including alanine, aspartic acid, glutamic acid, and glycine, have net positive intracellular production under the control condition (Media A). This was also the case at low, nonzero starting concentrations (data not shown). Therefore, to mimic their starvation, these amino acids were individually removed from the media (0 mM). Although these amino acids were initially absent, elevated levels of unintended substitution of alanine, aspartic acid, glutamic acid, and glycine residues were not observed, indicating the absence of gross, quantifiable starvation. Examination of extracellular amino acid concentrations revealed that by day 1, the concentrations of these amino acids had increased through cellular production (see Supporting Information Figure S1 for the glycine concentration profile for the cell line X culture). Since these amino acids were initially absent from the media, this indicates that cellular amino acid production was responsible for preventing starvation and for maintaining faithful translation by producing sufficient concentrations of each of the initially absent amino acids. This observation demonstrates that translation fidelity is primarily dependent on the capability of the cell to produce or maintain a sufficient intracellular amino acid supply to meet biosynthetic and metabolic requirements. A sufficient intracellular pool of functionalized amino acids, in the form of aminoacyl-tRNAs, is also expected to be crucial for translational fidelity [1].

Further supporting the importance of the intracellular amino acid supply, although serine was readily consumed, the cell lines were insensitive to serine depletion. Elevated serine substitution was not observed in any of the three cell lines, even at a 0 mM starting concentration. The hallmark Ser(AGC) → Asn misincorporation [39] was observed at baseline in all three mAbs (Supporting Information Figure S3) at levels (10⁻⁴) similar to those in our previous work [19], which confirms that the serine AGC codon is naturally prone to misincorporation by Asn. Since serine is a non-essential amino acid for CHO cells, this would explain the capability of the three cell lines to maintain normal translation fidelity even at low measured extracellular serine concentrations [40].

The three CHO cell lines from this study also exhibited cell line-dependent responses with respect to glutamine and asparagine depletion. These two amino acids are metabolized by high-producing CHO cell lines for energy, and as carbon and nitrogen sources.41 As expected, when individually limited, glutamine and asparagine concentrations decreased to below approximately 0.2 mM by day 1 during the production of each mAb (as an example, see Supporting Information Figure S1 for the glutamine concentration profile for the cell line X culture). Elevated misincorporation at glutamine residues was only observed in cell line Z, which exhibited marginally elevated Gln → His misincorporation at 0.040% and 0.113% for the CAA and CAG codons, respectively (Supporting Information Table S1). The marginal Gln → His misincorporation is consistent with high glutamine regeneration in CHO, even in the absence of measurable extracellular accumulation.42 Similarly, upon extracellular asparagine depletion, elevated mistranslation was only observed in mAbs produced by cell lines X and Y, in the form of Asn → Lys and Asn → Ser substitution. It is possible that the heterogeneous cell line-dependent responses arose from significant variation in the asparagine biosynthesis pathway, or from differences in the fidelity of the translation machinery between the cell lines. A similar baseline asparagine misincorporation (Asn → Lys) level, however, was observed in each of the three mAb products, suggesting that the asparagine translation machinery of the three cell lines has similar error propensities (Supporting Information Figure S3). This reiterates that cellular metabolism and the intracellular amino acid supply are the two key factors that determine when a cell experiences amino acid starvation. Since elevated misincorporation consistently corresponded with starvation, this also suggests that elevated misincorporation is an indicator of amino acid starvation.

DISCUSSION

Common misincorporation mechanisms and propensities exist between cell lines and organisms. Of the 45 (amino acid-to-amino acid) elevated misincorporation products that were observed in this study, approximately 70 % were common to all three cell lines, and more than 90 % were observed in at least two of the three cell lines (Supporting Information Table S1). This commonality is expected as the three CHO cell lines used in this study are derived from the same parent cell line. However, since amino acid misincorporation is widely considered to be universal,1,19,21 we hypothesized that commonalities may also exist between organisms. As a basis for comparison, we performed an extensive literature survey to compile a list of previously reported amino acid misincorporations observed in recombinant proteins expressed in E. coli. To the best of our knowledge, at the time of this work, a total of 23 (amino acid-to-amino acid) misincorporation products had been characterized in recombinant proteins expressed in E. coli (Supporting Information Table S2). Upon comparison, 20 of the 23 misincorporation products found in E. coli-expressed protein were also observed in either recombinant protein expressed in CHO or native human protein, a substantial overlap that is unlikely to have occurred by chance. Furthermore, among these 20 misincorporation products, eleven (Arg → Gln, Arg → Lys, Asn → Lys, Gln → His, His → Gln, Leu → Val, Met → Ile, Met → Thr, Phe → Leu, Val → Ile, Val → Met) were among the major elevated CHO misincorporation products derived from starvation in this study (Table S2). Although E. coli and mammalian tRNA synthetases and ribosomes are known to be structurally different, the significant pattern overlap suggests that amino acid misincorporation mechanisms and propensities are similar between different organisms (Table S2).19

Amino acid misincorporation occurs with distinct propensities. To further investigate whether mistranslation follows specific propensities and to explore the mechanistic context for each misincorporation product, we began by examining codon-level differences between the observed misincorporation products and their respective cognate amino acids. Similar to the observations made by Zhang and colleagues,19 the codon of a misincorporation product often differed from that of the cognate amino acid by only one base. In other words, while not necessarily the cause, most observed misincorporation products can be arrived at by a potential single base mismatch between the codon-anticodon pair (Figure 1A). Each base mismatch type (horizontal axis in Figure 1A) has a maximum of 32 outcomes across the first two codon positions. For example, for the G/U mismatch type at the first codon position, since there are 4 possible bases at each of the second and third positions, there are a total of 16 possible outcomes. Similarly, there are 16 possible outcomes for a G/U mismatch at the second codon position. If base mismatching were to occur stochastically, the probability of observing each base mismatch outcome would be expected to be nearly equivalent, and the number of outcomes observed for each base mismatch type should therefore exhibit a nearly even distribution. However, the observed distribution differed significantly from the even distribution expected for a stochastic process, supporting the idea that amino acid misincorporation is not a stochastic process, but rather follows a set of propensities. This is consistent with the results obtained by Manickam showing that most potential base mismatches are discriminated against by the ribosome.35
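The single-base relationship described above can be checked mechanically. The following is a minimal sketch, assuming the standard genetic code and one-letter amino acid symbols; the helper name `min_base_differences` is hypothetical and is not part of the original analysis. It reports the smallest number of base differences between any codon of the cognate amino acid and any codon of the substituting amino acid.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code listed in UCAG order; '*' marks stop codons.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def min_base_differences(cognate, substitute):
    """Smallest number of differing bases between any codon of the cognate
    amino acid and any codon of the substituting amino acid."""
    cognate_codons = [c for c, aa in CODON_TABLE.items() if aa == cognate]
    substitute_codons = [c for c, aa in CODON_TABLE.items() if aa == substitute]
    return min(
        sum(a != b for a, b in zip(x, y))
        for x in cognate_codons
        for y in substitute_codons
    )


# Asn -> Lys and Gln -> His are reachable by a single base change,
# whereas Trp -> Phe and Trp -> His are not.
for cognate, substitute in [("N", "K"), ("Q", "H"), ("W", "F"), ("W", "H")]:
    print(cognate, "->", substitute, min_base_differences(cognate, substitute))
```

A result of 1 only indicates that a single mismatch is a possible route; as noted in the text, it does not establish that the mismatch is the cause.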


Figure 1. A) The number of potential base mismatch outcomes. B) The number of potential base mismatch outcomes that are explicable by previously reported misacylation. The base mismatch represents the most likely codon-anticodon base mismatch that would be required to produce the misincorporation product from the codon of the cognate amino acid. The vertical axis represents the number of potential base mismatch outcomes, out of a maximum of 32 possible outcomes, for each base mismatch type (e.g., A/A, G/U). A) Black columns represent base mismatches occurring at codon position 1. White columns represent base mismatches occurring at codon position 2. The gray column represents base mismatches at codon position 3, the wobble position. B) Black columns represent the total number of potential base mismatch outcomes for a base mismatch type. The white column represents the number of base mismatch outcomes that are explicable by misacylation.


G/U and U/G mismatch propensity. Amongst the base mismatches occurring at the first two codon positions, potential G/U and U/G mismatches exhibited the highest overall frequencies of 11 (of 30 possible base mismatch outcomes, not counting the 2 outcomes associated with stop codons) and 12 (of 27 possible base mismatch outcomes, not counting the 3 outcomes associated with stop codons and 2 outcomes associated with the degeneracy of leucine codons), respectively (Figure 1A). Together, potential G/U and U/G mismatch outcomes occurred at a disproportionately high rate, representing 44 % of the total number of observed outcomes (Figure 1A).16,43 This corroborates the findings of Zhang et al., showing that unintended amino acid substitutions exhibit a strong association with potential G/U mismatches.19 Consistent with these results, both G/U and U/G mismatches are known to be biologically relevant and play essential roles in chemistry, structure, and ligand-binding.44 When compared to other base mismatch types, G/U mismatches are thermodynamically stable and relatively undisruptive to RNA structure.37,44 The prevalence of G/U and U/G mismatches is further increased by their capability to adopt a canonical G-C base-pair geometry by undergoing keto-enol tautomerization.45,46 When these mismatches are located at the first or second position in the codon sequence, the ribosome can sometimes sterically facilitate G/U and U/G transformation to a canonical geometry.16,43
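The denominators quoted above (30 and 27 out of a maximum of 32) can be reproduced by enumeration. The sketch below assumes the standard genetic code and interprets the codon/anticodon notation as follows: a G/U mismatch means a codon G is paired with an anticodon U and is therefore read as A, and a U/G mismatch means a codon U is paired with an anticodon G and is therefore read as C. This interpretation and the function name `mismatch_outcomes` are assumptions made for illustration.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code listed in UCAG order; '*' marks stop codons.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def mismatch_outcomes(codon_base, read_as, positions=(0, 1)):
    """Return (maximum outcomes, outcomes that change one amino acid into
    another) for a mismatch type at the given 0-based codon positions."""
    maximum, informative = 0, 0
    for pos in positions:
        for codon, cognate in CODON_TABLE.items():
            if codon[pos] != codon_base:
                continue
            maximum += 1
            misread = codon[:pos] + read_as + codon[pos + 1:]
            product_aa = CODON_TABLE[misread]
            # Skip outcomes involving stop codons or synonymous codons.
            if "*" in (cognate, product_aa) or cognate == product_aa:
                continue
            informative += 1
    return maximum, informative


print("G/U:", mismatch_outcomes("G", "A"))  # codon G read as A
print("U/G:", mismatch_outcomes("U", "C"))  # codon U read as C
```

Under these assumptions the enumeration excludes two stop-codon outcomes for G/U, and three stop-codon outcomes plus two synonymous leucine outcomes for U/G, in line with the exclusions described in the text.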

Third position base mismatch propensity. As shown in Figure 1A, third codon position wobble mismatches occur at the highest frequency amongst all misincorporation outcomes with a total of 16 outcomes observed (out of 27 possible outcomes, including all codons that may code a different amino acid after a change of the third base). The third codon position is known to be susceptible to base mismatching.12 The codons for seven pairs of amino acids (Asn/Lys, Asp/Glu, Cys/Trp, His/Gln, Ile/Met, Phe/Leu, and Ser/Arg), sharing the same first two bases in their codon sequence, can undergo third position base mismatches (Table 2). From these seven pairs, with 14 possible amino acid-to-amino acid misincorporations, our work identified nine elevated misincorporations that can potentially occur by a third position mismatch (Arg → Ser, Cys → Trp, His → Gln, Gln → His, Ile → Met, Met → Ile, Lys → Asn, Asn → Lys, Phe → Leu) (Table 1). Since the cell lines were insensitive to Asp, Glu, and Ser depletion, the Asp → Glu, Glu → Asp, and Ser(AGU, AGC) → Arg misincorporations were not observed at elevated levels. These three misincorporations, however, have been previously observed at baseline levels in E. coli as well as CHO-expressed recombinant protein and native human protein.19,20,47 Leu(UUA, UUG) → Phe misincorporation was not detected since the two leucine codons, UUA and UUG, which are required for the substitution, have low codon usage (0 – 6 %) in the three mAbs. Higher UUA or UUG codon usage would likely have facilitated observation of Leu(UUA, UUG) → Phe misincorporation. Lastly, the Trp → Cys substitution was not detected. To date, out of the 14 possible amino acid-to-amino acid misincorporations that can arise from a third position mismatch, only Leu(UUA, UUG) → Phe and Trp → Cys have not been observed. As the majority of misincorporations involving a potential third-position base mismatch have been observed, these results are consistent with the third position being prone to base mismatches that lead to amino acid misincorporation.19,48
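The seven amino acid pairs and the 27 possible third-position outcomes can be derived directly from the genetic code. The sketch below assumes the standard genetic code and reads "possible outcome" as a sense codon for which some change of the third base yields a different amino acid (changes to stop codons are ignored). That reading, and the names used, are assumptions for illustration.

```python
from collections import defaultdict
from itertools import product

BASES = "UCAG"
# Standard genetic code listed in UCAG order; '*' marks stop codons.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Group amino acids by the first two codon bases, ignoring stop codons.
boxes = defaultdict(set)
for codon, aa in CODON_TABLE.items():
    if aa != "*":
        boxes[codon[:2]].add(aa)

# Codon families that encode two different amino acids.
split_pairs = sorted("/".join(sorted(aas)) for aas in boxes.values() if len(aas) == 2)


def third_position_susceptible():
    """Sense codons that encode a different amino acid after a third-base change."""
    codons = []
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue
        alternatives = {CODON_TABLE[codon[:2] + b] for b in BASES if b != codon[2]}
        if alternatives - {aa, "*"}:
            codons.append(codon)
    return codons


print(split_pairs)                        # seven pairs, one-letter symbols
print(len(split_pairs) * 2)               # directional misincorporations
print(len(third_position_susceptible()))  # codons open to a third-base change
```

The Tyr codons do not appear because a third-base change converts them only to stop codons, which is why the UA family contributes no pair.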

Codon recognition at the third position in the codon-anticodon pair is often accomplished through wobble base pairing.49,50 A wobble interaction is necessitated because many organisms, including CHO and humans, use a single isoacceptor tRNA (tRNAGNN, where N denotes a nucleoside) to decode two codons (NNU and NNC) for asparagine, aspartic acid, cysteine, histidine, phenylalanine, and serine.51 Potential base mismatches at the third position for CHO are summarized in Table 2. In the direction of the Lys → Asn, Glu → Asp, Gln → His, Leu → Phe, and Arg → Ser substitutions, the cognate NNA and NNG codons are incorrectly decoded by tRNAGNN. Therefore, only A/G and G/G wobble position mismatches are feasible in this misincorporation direction, barring misacylation. Since the Gln → His, Lys → Asn, and Arg → Ser sequence variants were observed, and have not been reported to occur by misacylation, this suggests that the Gln → His, Lys → Asn, and Arg → Ser substitutions proceed by way of A/G and/or G/G mismatches. Since many of the misincorporations occurring in the same direction were observed, the A/G and/or G/G mismatches would likely explain their origins as well. In the opposite direction (Asn → Lys, Asp → Glu, His → Gln, Phe → Leu, and Ser → Arg substitutions), two tRNAs (tRNAUNN and tRNACNN) are available to decode the codon for the substituting amino acid. Following the same logic, since the Cys → Trp, Asn → Lys, and His → Gln sequence variants were observed, U/C, U/U, C/U, and C/C mismatches are feasible at the third position and likely explain the observation of each of the misincorporation outcomes occurring in this direction (Table 2).52 These results are in accordance with base mismatch patterns that have been observed previously (A/G, C/C, C/U, U/C, U/U),35,50,53 and are consistent with the third position accommodating a diversity of wobble geometries and base mismatches. It is, however, important to note that the tRNA base that pairs with the third codon position (the wobble position, N34) is often modified to facilitate the accurate translation of synonymous codons by a single tRNA. Given that modifications modulate base pairing accuracy, and that one of the most prominent modifications, inosine, can base pair with multiple bases, tRNA modifications are likely to play an integral role in translation error propensities associated with the third-position base mismatches.54


Table 2. Possible third position base interactions and mismatches.

Amino acid | Codon (mRNA)† | Substituting anticodon (tRNA)†,‡,§ | Potential base mismatch†,§ | Codon / anticodon pairing | Resulting misincorporation*
Asn, Asp, Cys, His, Phe, Ser | NNU | CNN, UNN | U/C, U/U | 5’-NNU-3’ / 3’-NNC-5’; 5’-NNU-3’ / 3’-NNU-5’ | C → W, D → E, F → L, H → Q, N → K, S → R
Asn, Asp, Cys, His, Phe, Ser | NNC | CNN, UNN | C/C, C/U | 5’-NNC-3’ / 3’-NNC-5’; 5’-NNC-3’ / 3’-NNU-5’ | C → W, D → E, F → L, H → Q, N → K, S → R
Arg, Gln, Glu, Leu, Lys, Trp | NNA | GNN | A/G | 5’-NNA-3’ / 3’-NNG-5’ | E → D, K → N, L → F, R → S, Q → H, W → C
Arg, Gln, Glu, Leu, Lys, Trp | NNG | GNN | G/G | 5’-NNG-3’ / 3’-NNG-5’ | E → D, K → N, L → F, R → S, Q → H, W → C
Ile | AUU | CAU | U/C | 5’-AUU-3’ / 3’-UAC-5’ | I → M
Ile | AUC | CAU | C/C | 5’-AUC-3’ / 3’-UAC-5’ | I → M
Ile | AUA | CAU | A/C | 5’-AUA-3’ / 3’-UAC-5’ | I → M
Met | AUG | AAU, UAU | G/A, G/U | 5’-AUG-3’ / 3’-UAA-5’; 5’-AUG-3’ / 3’-UAU-5’ | M → I

* Single-letter annotation is used to represent amino acids.
† Bases involved in third position pairing are underscored and in bold.
‡ Substituting anticodon denotes the anticodon of the substituting amino acid. Anticodon assignment is based on wobble base pair rules. Assignments do not consider wobble position (N34) post-transcriptional modifications.
§ tRNA anticodon sequences were obtained from the GtRNAdb: Genomic tRNA Database for Cricetulus griseus cell line CHO-K1 (Chinese hamster ovary CriGri 1.0 Aug 2011).

Misacylation propensity. Multiple base mismatches during codon-anticodon pairing are thermodynamically unfavorable. For example, molecular dynamics simulation of codon-anticodon binding revealed that an additional mismatch at the third codon position increases the binding free energy by 1.6 – 2.0 kcal/mol over a single U/G base mismatch at the first two codon positions,55 corresponding to a 14- to 26-fold reduction in binding affinity at 36 °C (cell culture temperature). The contribution of a mismatch at the first two positions is expected to be even larger due to the less stringent nature of the third base during codon-anticodon pairing. Therefore, misincorporation products that required at least two base mismatches to occur were assumed to be caused by misacylation. An example is the mistranslation of Trp as Phe and His, which requires two- and three-base mismatches in the codon-anticodon pair, respectively. Based on this criterion, twelve misacylation products were identified (Table 1 and Supporting Information Table S1). Among these, eight new misacylation products were identified through this work (Leu → Thr, Lys → His, Phe → His, Phe → Met, Pro → Asp, Trp → His, Trp → Phe, Trp → Tyr) (Table 1). The remaining four are Leu → Met, Lys(AAA) → Met, Val(GUG) → Ile, and Val → Thr.
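The conversion from a binding free energy penalty to a fold reduction in affinity follows from the Boltzmann relation, fold = exp(ΔΔG / RT). The short sketch below reproduces the quoted range under that assumption; the function name `fold_reduction` is hypothetical, and the gas constant is expressed in kcal/(mol·K).

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def fold_reduction(ddg_kcal_per_mol, temp_celsius):
    """Fold reduction in binding affinity for a free energy penalty,
    assuming fold = exp(ddG / (R * T))."""
    temp_kelvin = temp_celsius + 273.15
    return math.exp(ddg_kcal_per_mol / (R_KCAL * temp_kelvin))


# Penalty range quoted for an additional third-position mismatch, at 36 degrees C.
for ddg in (1.6, 2.0):
    print(f"{ddg} kcal/mol -> {fold_reduction(ddg, 36.0):.1f}-fold")
```

At 36 °C, RT is about 0.61 kcal/mol, so each additional 1.4 kcal/mol of penalty corresponds to roughly a tenfold loss of affinity.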


Since similarity between amino acid substrates is a major driver of misacylation,14 the encoded and substituting amino acids, as would be expected, share varying degrees of structural and chemical similarity in numerous cases. Along with our work here, a number of studies have shown that aromatic amino acids exhibit a tendency to replace one another.9,13,25 The exception is that while tryptophan can be readily replaced by each of the other three aromatic amino acids (phenylalanine, tyrosine, and histidine), it does not readily replace the other aromatic amino acids. Work by Kotik-Kogan et al. has shown that tryptophan is simply too large compared to the other aromatic amino acids and is precluded by active site specificity.13 Although the Lys → His and Phe → Met substitutions lack obvious structural similarity, each of these misincorporation products shares the same charge state at neutral pH with the cognate amino acid; lysine and histidine are positively charged, and both phenylalanine and methionine are uncharged and hydrophobic. Aside from amino acid misrecognition, misacylation may be generated from tRNA misrecognition.56,57 The Lys → Met misacylation may proceed by tRNA recognition error. Although MetRS may successfully recognize its cognate amino acid, methionine, it can mismethionylate non-cognate tRNAArg and tRNALys.15,58 In a few cases, where the structural and/or chemical similarity between the misacylation product and cognate amino acid is difficult to identify (Leu → Thr and Pro → Asp), further investigation is required to elucidate the mode of misacylation.

Accounting of amino acid misincorporation mechanisms. Since, to date, a large set of misacylation products have been characterized, it was suspected that a number of the misincorporation products that were identified in this study may be attributable to misacylation. When compared to a literature survey (Supporting Information Table S3), approximately half of the observed misincorporations (amino acid-to-amino acid) were explicable by misacylation (Table 1). Given the substantial overlap, we suspected that this may, in part, also explain the non-random base mismatch pattern represented by the black columns in Figure 1B. In order to evaluate whether misacylation products could explain the base mismatch pattern, the misacylations from Table S3 were converted to potential base mismatches between the codon of the cognate amino acid and the most likely anticodon of the substituting amino acid, represented by the white columns in Figure 1B. After applying this approach, most of the base mismatches (79 %), with the exception of G/U, U/G, and wobble position mismatches, were attributable to misacylation. In contrast, only 36 %, 67 %, and 13 % of G/U, U/G, and wobble mismatches, respectively, were attributable to misacylation (Figure 1B). The base mismatches, excluding G/U, U/G, and wobble, that were not attributable to misacylation were Asn → Ser (A/C), Cys → Ser (U/U), Leu → Phe (C/A), and Val → Phe (G/A). Although Asn → Ser and Cys → Ser have yet to be experimentally verified to occur by misacylation, we hypothesize that they are both attributable to misacylation due to the close amino acid structural similarity.59

G/U and U/G base mismatches are involved in the formation of many misincorporation products but have distinctly different behavior. Given the various levels of involvement of base mismatch types in misincorporation, we next sought to explain any behavioral differences between the base mismatches. When plotted (Figure 2), misincorporation products derived from G/U mismatches exhibited the highest abundance (0.52 ± 0.15 %). Most strikingly, while G/U- and U/G-derived mistranslation products might be expected to have very similar, if not identical, abundance, G/U-derived misincorporation products exhibited a significantly higher abundance (0.52 ± 0.15 %) than those derived from U/G mismatches (0.06 ± 0.01 %) (two-tailed Student’s t-test, p = 0.002) and the collection of other base mismatch types (two-tailed Student’s t-test, p = 0.0007). At the same time, the abundance of mistranslation products associated with U/G mismatches was statistically indistinguishable from that of other base mismatches (0.08 ± 0.02 %) (two-tailed Student’s t-test, p = 0.9). This divergence in abundance contrasts with the similarly high frequency of misincorporation outcomes produced by G/U and U/G base mismatches (Figure 1B). This suggests that while both G/U and U/G mismatches are involved in the formation of many unique misincorporation products, G/U mismatches are significantly more favorable than other base mismatch types and/or experience more opportunities for forming.46 This observation is also consistent with reported baseline amino acid misincorporation levels and patterns.19 The results presented in Figure 2 are also in full accordance with observations that G/U base mismatches are the most readily accommodated within the ribosome decoding center and that most base mismatches are discriminated against by the ribosome.35,46,60

Figure 2. The average abundance of misincorporation outcomes resulting from the designated base mismatch type that are not attributable to misacylation (n ≥ 3). The average abundance represents the abundance of the sequence variant detected by LC-MS/MS relative to the native form. The difference between G/U base mismatch and each of the other base mismatch types is statistically significant (*, two-tailed Student’s t-test, p < 0.005). [Plot axes: vertical, average abundance (0.00% to 0.80%); horizontal, base mismatch (codon/anticodon): G/U, U/G, Other.]

Mechanistic categorization of misincorporation outcomes. Since misacylation and base mismatching are independent processes, in principle, the two mechanisms can occur concomitantly in an additive manner that would be reflected in their total abundance. The Phe → Leu sequence variant is an example in which multiple misincorporation mechanisms can act on a single amino acid to lead to the same product. This sequence variant can originate from misacylation,61 a third-position mismatch,12,19 and a U/G mismatch at the first codon position.60 In order to consider this conceptually, we categorized the elevated misincorporation outcomes from this study based on their proposed mechanisms, and included categories for products that potentially arise from multiple mechanisms (Figure 3A). With the exception of G/U mismatches and third-position wobble mismatches, all other base mismatches were categorized together, given that they exhibited similar average abundance (Figure 2). Since both A/A46 and C/C35,46 mismatches are unfavorable in the first and second codon positions, and every instance was explicable by misacylation (Figure 1B), they were not categorized as base mismatches. The categorization for individual misincorporation outcomes is listed under the proposed mechanisms column in Table S1. The average misincorporation abundance resulting from each mechanism or set of mechanisms is presented in Figure 3A. Products derived from other base mismatches (excluding G/U and third-position wobble mismatches) exhibited the lowest abundance (0.07 ± 0.01 %), while products resulting from G/U mismatches, wobble mismatches, and misacylation exhibited abundances of 0.52 ± 0.15 %, 0.16 ± 0.04 %, and 0.19 ± 0.03 %, respectively. Most notably, the abundance of misincorporation products derived from G/U mismatches was significantly higher than that of any other independently occurring mechanism (two-tailed Student’s t-test, p ranging from 0.0004 to 0.005). A similar pattern was observed in the baseline misincorporation levels (Figure 3B) and supports the idea that baseline misincorporation reflects the disposition for elevated mistranslation under starvation conditions.

Figure 3. A) The average abundance for each misincorporation mechanism derived from systematic starvation. B) The average abundance for each misincorporation mechanism occurring at baseline in the absence of starvation. The average abundance of each misincorporation mechanism category is the average abundance of misincorporation outcomes assigned to the mechanism category. A total of 90 misincorporation outcomes were uniquely assigned to a given mechanism category. Error bars denote the standard error (n ≥ 3 for each mechanism category). Other mismatches, in the last two columns, represent all base mismatch types with the exception of G/U mismatches. All potential A/A and C/C base mismatches were attributable to misacylation and were not counted towards base mismatches. A) G/U base mismatches were determined to be statistically different from any other single mechanism category (*, two-tailed Student’s t-test, p < 0.005). G/U + misacylation was determined to be statistically different from all other categories (**, two-tailed Student’s t-test, p < 0.02), except for the G/U mismatch category. B) Misincorporation outcomes were assigned in the same manner as in A).

Both misacylation and base mismatching are common and major amino acid misincorporation mechanisms. In total, 90 unique misincorporation outcomes were identified on a codon basis (Table 1). The total number of outcomes is greater than the total number of codons, 64, since a single codon can result in multiple differentiable misincorporation outcomes. For example, the Trp UGG codon resulted in four differentiable misincorporation outcomes (Trp → Phe, Trp → His, Trp → Arg, Trp → Tyr). When the frequency of each mechanism was counted on a codon basis, it was found that 52 (out of 90) misincorporation outcomes can be attributed to base mismatches, in part or entirely. A total of 61 (out of 90) misincorporation outcomes can be attributed to misacylation, in part or entirely (Figure 4). The sum of these values is greater than the 90 misincorporation outcomes since many outcomes can be attributed to both base mismatching and misacylation. As both mechanisms are associated with a similar number of outcomes, this substantiates the idea that both codon-anticodon mispairing and misacylation are major amino acid misincorporation mechanisms. Furthermore, these results indicate that both mechanisms are involved in a similar number of substitutions on a codon basis (Figure 4), but the abundance arising from each mechanism varies (Figure 3A). For example, misincorporations generated from G/U base mismatches were found to be significantly more abundant than those derived from misacylation.


Figure 4. The number of misincorporation outcomes assigned to each mechanism category. A total of 90 misincorporation outcomes were assigned. Misincorporation outcome assignment was performed in the same manner as in Figure 3A and 3B.


Although codon-anticodon mispairing and misacylation are involved in a similar number of misincorporation outcomes, the two mechanisms do not have similar propensities. As described earlier, there are a total of 30 possible outcomes that can be generated from first- or second-position G/U mismatches, and 27 possible outcomes that can be generated from third-position wobble mismatches. There are, however, a total of 61 × 19 = 1159 possible outcomes that can be generated by misacylation (61 amino acid-coding codons, each mistranslated as one of 19 different amino acids). That more misacylation outcomes were observed than G/U or wobble mismatch outcomes is likely due to the much larger number of possible misacylation outcomes. Taking these possibilities into consideration, G/U and wobble mismatches during codon recognition occur at a much higher probability, and therefore with a greater propensity, than misacylation. This explains why misincorporations generated from misacylation generally have low abundance, except for a few amino acid pairs that share very high structural similarity (T → S, I → V, V → I, and Y → F, for example). This conclusion also supports the notion that if a misincorporation can be generated from a G/U or wobble mismatch, that mismatch is most likely the main cause.19
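The outcome counts quoted above can be reproduced by direct enumeration. The sketch below is an illustration, not the authors' analysis. It assumes that a first- or second-position G/U mismatch means a codon G being read as A (mRNA G paired with anticodon U), that an outcome is a (codon, substituted amino acid) pair, and that readings involving stop codons are excluded. The count of 27 wobble outcomes is taken from the text and is not recomputed. All function and variable names are hypothetical.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code with codons ordered U, C, A, G at each position; '*' = stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
SENSE_CODONS = [c for c, aa in CODON_TABLE.items() if aa != "*"]


def gu_mismatch_outcomes(positions=(0, 1)):
    """List (codon, encoded aa, misread aa) for a codon G read as A (assumed G/U model)."""
    outcomes = []
    for codon in SENSE_CODONS:
        for pos in positions:
            if codon[pos] != "G":
                continue
            misread = codon[:pos] + "A" + codon[pos + 1:]
            new_aa = CODON_TABLE[misread]
            # Skip readings that give a stop codon or the same amino acid.
            if new_aa != "*" and new_aa != CODON_TABLE[codon]:
                outcomes.append((codon, CODON_TABLE[codon], new_aa))
    return outcomes


n_gu = len(gu_mismatch_outcomes())        # first- or second-position G/U mismatches
n_wobble = 27                             # third-position count, taken from the text
n_misacylation = len(SENSE_CODONS) * 19   # each sense codon read as one of 19 other amino acids

print(n_gu, n_misacylation)               # 30 1159
print(round(n_misacylation / (n_gu + n_wobble), 1))  # 20.3
```

Under these assumptions the misacylation outcome space is roughly 20 times larger than the combined G/U and wobble space, so a similar number of observed outcomes in each class implies a correspondingly lower per-outcome propensity for misacylation.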

It is worth noting that the approach of employing a literature survey to attribute misincorporation mechanism(s) to each codon has potential shortcomings. The robustness of the attribution depends on the availability of mechanistically characterized misincorporations. While an extensive number of misacylations have been confirmed to date through in vitro assays62 or by other means, we suspect that Table S3 contains only a fraction of the possible biologically relevant misacylation outcomes. For example, although it has not been experimentally verified and reported in the literature, we hypothesize that Cys → Ser misincorporation is attributable to misacylation due to the close structural and chemical similarity between the two amino acids. Similarly, Asn → Ser (ref 59) and Ser → Asn (ref 63) may both, in part, be attributed to misacylation due to chemical and structural similarity. The Ser → Asn misacylation has been proposed to occur by a seryl-tRNA synthetase-mediated error based on in silico docking analysis.63 Since experimental verification has not been conducted, however, these cases were not included in Table S3.

Evidence of mechanistic additivity. Our analysis indicated that both tRNA misrecognition and misacylation are major amino acid misincorporation mechanisms. This is further supported by the observation of an additive effect between the two mechanisms. Since the two mechanisms are expected to occur independently, one would expect to see additive behavior when a misincorporation can happen through both mechanisms. On the other hand, if a misincorporation happens through only one of the two mechanisms, one would not expect the additive effect. Evidence of mechanistic additivity was indeed observed when the misincorporation products were categorized based on their proposed mechanistic origin(s) (Figure 3A). Substitution products suspected to arise from both a G/U mismatch and misacylation exhibited an average abundance of 0.78 ± 0.28 %, which is approximately the sum of the average abundances for G/U mismatches (0.52 ± 0.15 %) and misacylation (0.19 ± 0.03 %) occurring individually (Figure 3A). Similarly, products suspected to arise from both a third-position (wobble) mismatch and misacylation exhibited an average abundance of 0.34 ± 0.9 %, which is approximately the sum of the average abundances of third-position mismatches (0.16 ± 0.04 %) and misacylation (0.19 ± 0.03 %) occurring individually (Figure 3A).
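The additivity argument can be checked numerically from the averages quoted above. This is a minimal sketch: the ± values of the single-mechanism averages are treated as uncertainties of independent quantities and combined in quadrature, which is an assumption made here for illustration and not a procedure stated in the text. The names are hypothetical.

```python
import math

# Average abundances (%) and their reported ± values, as quoted for Figure 3A.
GU = (0.52, 0.15)            # G/U mismatch alone
WOBBLE = (0.16, 0.04)        # third-position (wobble) mismatch alone
MISACYLATION = (0.19, 0.03)  # misacylation alone
OBSERVED = {"G/U + misacylation": 0.78, "wobble + misacylation": 0.34}


def predicted_sum(a, b):
    """Sum of two means; uncertainties combined in quadrature (independence assumed)."""
    return a[0] + b[0], math.hypot(a[1], b[1])


for label, single in (("G/U + misacylation", GU), ("wobble + misacylation", WOBBLE)):
    mean, err = predicted_sum(single, MISACYLATION)
    diff = OBSERVED[label] - mean
    print(f"{label}: predicted {mean:.2f} ± {err:.2f} %, "
          f"observed {OBSERVED[label]:.2f} %, difference {diff:+.2f}")
```

Under these assumptions the predicted sums (0.71 % and 0.35 %) differ from the observed averages by less than the propagated uncertainty, which is the sense in which the two mechanisms appear additive.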

Val → Ile mistranslation also conveniently provides evidence of mechanistic additivity. Valine is encoded by four codons (GUA, GUC, GUG, GUU), all of which underwent misincorporation by Ile. The Val(GUG) → Ile substitution is assumed to arise exclusively by misacylation, as it would otherwise require two base mismatches to occur. As misacylation occurs upstream of tRNA recognition, Val → Ile misacylation would also be expected to occur at each of the other valine codons. In addition to misacylation, Val(GUA, GUC, GUU) → Ile may also occur by a G/U base mismatch, and would therefore be expected to have higher abundance than Val(GUG) → Ile misacylation due to mechanistic additivity. Val(GUG) → Ile was found to have an average abundance of 0.59 ± 0.15 % (Figure 5A). The Val(GUA) → Ile substitution abundance was 0.66 ± 0.14 %. The Val(GUC) → Ile and Val(GUU) → Ile substitutions exhibited significantly higher abundances than the Val(GUG) → Ile misacylation level, with values of 1.24 ± 0.09 % and 1.17 ± 0.23 %, respectively (two-tailed Student’s t-test, p = 0.0005 and p = 0.04, respectively) (Supporting Information Table S1). The differences in average abundance are consistent with both G/U mismatching and misacylation acting concomitantly for Val(GUA, GUC, GUU) → Ile misincorporation. Another piece of information can also be derived from this set of outcomes. Because all three outcomes likely occur by both misacylation and a G/U mismatch, the significant differences between Val(GUA) → Ile and both Val(GUC) → Ile (two-tailed Student’s t-test, p = 0.0004) and Val(GUU) → Ile (two-tailed Student’s t-test, p = 0.044) suggest that other factors influence the susceptibility of a codon to misincorporation.
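The codon logic behind this comparison can be made explicit. The sketch below is illustrative and uses hypothetical names. For each valine codon it finds the closest isoleucine codon, counts the base changes needed to reach it, and checks whether a single change is a first-position G read as A (the G/U mismatch model assumed here).

```python
VAL_CODONS = ("GUA", "GUC", "GUG", "GUU")
ILE_CODONS = ("AUA", "AUC", "AUU")  # AUG encodes Met, not Ile


def differences(codon, target):
    """Return (position, codon base, target base) for every position that differs."""
    return [(i, a, b) for i, (a, b) in enumerate(zip(codon, target)) if a != b]


def nearest_ile(codon):
    """Closest Ile codon and the base changes needed to read `codon` as it."""
    return min(((ile, differences(codon, ile)) for ile in ILE_CODONS),
               key=lambda pair: len(pair[1]))


for codon in VAL_CODONS:
    ile, diffs = nearest_ile(codon)
    if diffs == [(0, "G", "A")]:
        note = "single first-position G/U mismatch"
    else:
        note = "requires more than one mismatch"
    print(f"Val({codon}) -> Ile({ile}): {len(diffs)} change(s), {note}")
```

GUA, GUC, and GUU each reach an Ile codon through one first-position change, whereas GUG needs two, which is why Val(GUG) → Ile is attributed to misacylation alone.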


Figure 5. A) The average abundance of Val → Ile misincorporation outcomes. B) The average abundance of misincorporation outcomes derived from a G/U base mismatch at codon position 2. On the horizontal axis, the single letter represents the base at the third codon position. A) Val → Ile misincorporation at the GUA, GUC, and GUU codons can each occur by a G/U mismatch at codon position 1 and by misacylation, whereas Val(GUG) → Ile misincorporation products are derived exclusively from misacylation. Higher misincorporation abundance occurred when C or U was located at the third codon position (*, two-tailed Student’s t-test, p < 0.05). B) The horizontal line above the columns denotes the amino acid-to-amino acid misincorporation. The difference in Arg → Lys misincorporation abundance resulting from a G versus A at codon position 3 was found to be statistically significant (*, two-tailed Student’s t-test, p < 0.05). The difference in Cys → Tyr misincorporation abundance resulting from a C versus U at codon position 3 was found to be statistically significant (**, two-tailed Student’s t-test, p < 0.002).

It is important to note that substitutions by Ile and Leu are indistinguishable by mass spectrometry. In this case, however, the Ile substitution is likely to be the major product for a number of reasons. First, while Val → Ile has been reported as a major misincorporation product, the Val → Leu misincorporation has not been previously reported. Second, although Val → Leu misincorporation could hypothetically occur by misacylation, or by a G/G or G/A base mismatch at the first position in the codon-anticodon pair, G/A and G/G mismatches would only be expected to exist at ~0.07 %, similar to other (non-G/U) mismatches (Figure 3A). Moreover, G/G and G/A are considered among the least favorable base mismatch interactions at the first two codon positions.35 Ultimately, the sum of the values from these potential contributions does not account for the entirety of the codon-specific abundance discrepancies between the various Val → Ile/Leu misincorporations. A G/U mismatch, which as noted above has an average abundance of 0.52 %, sufficiently accounts for the differences in abundance, suggesting that Ile substitution is the major misincorporation product.

Codon-associated misincorporation behavior. The Val → Ile misincorporation results suggested that codon-associated factors can cause differences in misincorporation abundance between synonymous codons. Exploring this further, for misincorporations involving a second-position G/U mismatch (Arg → Lys, Arg → Gln, and Cys → Tyr), abundance was significantly higher when base pairing at the third codon position was accomplished by a G (two-tailed Student’s t-test, p = 0.025) or a C (two-tailed Student’s t-test, p = 0.001), rather than an A or a U, respectively (Figure 5B). Similarly, the Arg(CGC) → His misincorporation was observed while Arg(CGU) → His was not (Table 1). The same pattern was also observed at baseline levels in our previous work (Arg → Lys, in the human sample).19 Higher misincorporation propensity at NNC codons was likewise observed in baseline Ser → Asn and Gly → Asp misincorporation in CHO cells.19,39 Next, we compared third codon position C and U behavior between Val(GUC, GUU) → Ile and Cys(UGC, UGU) → Tyr. Both sets of codons have the same base composition, but differ by the position of the G/U mismatch during mistranslation (positions 1 versus 2, respectively). For Val → Ile, the misincorporation abundances at NNU and NNC are essentially identical (Figure 5A), whereas for Cys → Tyr substitution the abundances at NNU and NNC are statistically different (Figure 5B) (two-tailed Student’s t-test, p = 0.001). One explanation for codon-associated misincorporation behavior is that the base pairs neighboring a mismatch can influence the stability of the codon-anticodon interaction. Both G/U and U/G mismatches have been shown to impart a steric effect on their nearest neighbors in the form of buckling across the planar interface of the bases.60 Therefore, it is logical that these neighbors would also apply a reciprocating influence, in a manner that would be reflected in misincorporation abundance. Manickam et al. demonstrated that tRNA modifications can modulate the codon-anticodon interaction in the presence of base mismatches.64 A second explanation is that the tRNA profile, including the degree and type of post-transcriptional modifications, plays a role in determining codon-associated misincorporation propensity.12
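The synonymous-codon comparisons in this paragraph follow from which codons share a second-position G/U mismatch outcome. As an illustration only (hypothetical names; a G/U mismatch is again assumed to mean a codon G read as A), the sketch below groups sense codons by the substitution they would produce, so that the third-position bases being compared can be read off directly.

```python
from collections import defaultdict
from itertools import product

BASES = "UCAG"
# Standard genetic code with codons ordered U, C, A, G at each position; '*' = stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def second_position_gu_groups():
    """Map (encoded aa, misread aa) -> codons whose second-position G is read as A."""
    groups = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if codon[1] != "G" or aa == "*":
            continue
        new_aa = CODON_TABLE[codon[0] + "A" + codon[2]]
        if new_aa != "*":  # ignore readings that would give a stop codon
            groups[(aa, new_aa)].append(codon)
    return dict(groups)


for (aa, new_aa), codons in second_position_gu_groups().items():
    print(f"{aa} -> {new_aa}: {', '.join(codons)}")
```

Each group pairs two synonymous codons that differ only at the third position (G versus A, or C versus U), for example AGA/AGG for Arg → Lys, CGU/CGC for Arg → His, and UGU/UGC for Cys → Tyr.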

CONCLUSION

We sought to employ controlled systematic starvation as a means to maximize the observation of misincorporation products. Our analysis suggested that amino acid misincorporation is not a stochastic phenomenon and follows similar propensities between different organisms. These propensities, and the ability to induce starvation-mediated misincorporation, depend on the capability of the cell to maintain a sufficient intracellular pool of amino acid substrates. Our analysis also suggested that both tRNA misrecognition and misacylation are major amino acid misincorporation mechanisms that are involved in a similar number of unintended amino acid substitutions on a codon basis. Misincorporations generated from G/U and wobble base mismatches, however, have on average significantly higher abundances than those derived from misacylation, except for misacylations involving amino acid pairs with very high structural similarity. These conclusions were further supported by the observation of an additive effect of the two independent mechanisms. Lastly, we have provided evidence that misincorporation abundance can be codon-associated, and explored factors that can influence misincorporation in a codon-associated manner.

Supporting information

Amino acid misincorporation patterns derived from controlled systematic amino acid starvation of three CHO cell lines; Amino acid misincorporation pattern commonality between organisms; Previously reported misacylation products; Representative peptide MS/MS spectra for identification of sequence variants; Representative glutamine, glycine, and valine extracellular concentration profiles of a cell line during culturing; Average baseline abundance of serine and asparagine misincorporation outcomes. This material is available free of charge via the Internet at http://pubs.acs.org.

ACKNOWLEDGMENTS

We thank the Rapid Analysis group at Amgen Inc., especially Mee Ko, Justin Blazek, and Skyler Smith for performing amino acid analysis and protein purification in support of this work. We also thank Jason Richardson and Bhavana Shah for assistance provided in collecting LC-MS/MS data. This work was funded by Amgen Inc.

Conflicts of Interest

The authors declare no competing financial interest.

References

(1) Wong, H. E., Huang, C.-J., and Zhang, Z. (2018) Amino acid misincorporation in recombinant proteins, Biotechnol. Adv. 36, 168-181.

(2) Lee, J. W., Beebe, K., Nangle, L. A., Jang, J., Longo-Guess, C. M., Cook, S. A., Davisson, M. T., Sundberg, J. P., Schimmel, P., and Ackerman, S. L. (2006) Editing-defective tRNA synthetase causes protein misfolding and neurodegeneration, Nature 443, 50-55.

(3) Vermulst, M., Denney, A. S., Lang, M. J., Hung, C. W., Moore, S., Moseley, M. A., Thompson, J. W., Madden, V., Gauer, J., Wolfe, K. J., Summers, D. W., Schleit, J., Sutphin, G. L., Haroon, S., Holczbauer, A., Caine, J., Jorgenson, J., Cyr, D., Kaeberlein, M., Strathern, J. N., Duncan, M. C., and Erie, D. A. (2015) Transcription errors induce proteotoxic stress and shorten cellular lifespan, Nat. Commun. 6, 8065.

(4) Drummond, D. A., and Wilke, C. O. (2008) Mistranslation-induced protein misfolding as a dominant constraint on coding-sequence evolution, Cell 134, 341-352.

(5) Singh, S. K. (2011) Impact of product-related factors on immunogenicity of biotherapeutics, J. Pharm. Sci. 100, 354-387.

(6) Randhawa, Z. I., Witkowska, H. E., Cone, J., Wilkins, J. A., Hughes, P., Yamanishi, K., Yasuda, S., Masui, Y., Arthur, P., Kletke, C., Bitsch, F., and Shackleton, C. H. L. (1994) Incorporation of norleucine at methionine positions in recombinant human macrophage colony stimulating factor (M-CSF, 4-153) expressed in Escherichia coli: structural analysis, Biochemistry 33, 4352-4362.

(7) Apostol, I., Levine, J., Lippincott, J., Leach, J., Hess, E., Glascock, C. B., Weickert, M. J., and Blackmore, R. (1997) Incorporation of norvaline at leucine positions in recombinant human hemoglobin expressed in Escherichia coli, J. Biol. Chem. 272, 28980-28988.

(8) Gurer-Orhan, H., Ercal, N., Mare, S., Pennathur, S., Orhan, H., and Heinecke, J. W. (2006) Misincorporation of free m-tyrosine into cellular proteins: a potential cytotoxic mechanism for oxidized amino acids, Biochem. J. 395, 277-284.

(9) Klipcan, L., Moor, N., Kessler, N., and Safro, M. G. (2009) Eukaryotic cytosolic and mitochondrial phenylalanyl-tRNA synthetases catalyze the charging of tRNA with the meta-tyrosine, Proc. Natl. Acad. Sci. U. S. A. 106, 11045-11048.

(10) Link, A. J., Mock, M. L., and Tirrell, D. A. (2003) Non-canonical amino acids in protein engineering, Curr. Opin. Biotechnol. 14, 603-609.

(11) Jakubowski, H., and Goldman, E. (1992) Editing of errors in selection of amino acids for protein synthesis, Microbiol. Rev. 56, 412-429.

(12) Kramer, E. B., and Farabaugh, P. J. (2007) The frequency of translational misreading errors in E. coli is largely determined by tRNA competition, RNA 13, 87-96.

(13) Kotik-Kogan, O., Moor, N., Tworowski, D., and Safro, M. (2005) Structural basis for discrimination of L-phenylalanine from L-tyrosine by phenylalanyl-tRNA synthetase, Structure 13, 1799-1807.

(14) Reynolds, N. M., Lazazzera, B. A., and Ibba, M. (2010) Cellular mechanisms that control mistranslation, Nat. Rev. Microbiol. 8, 849-856.

(15) Rozov, A., Demeshkina, N., Westhof, E., Yusupov, M., and Yusupova, G. (2016) New Structural Insights into Translational Miscoding, Trends Biochem. Sci. 41, 798-814.

(16) Ling, J., Reynolds, N., and Ibba, M. (2009) Aminoacyl-tRNA synthesis and translational quality control, Annu. Rev. Microbiol. 63, 61-78.

(17) Jones, T. E., Alexander, R. W., and Pan, T. (2011) Misacylation of specific nonmethionyl tRNAs by a bacterial methionyl-tRNA synthetase, Proc. Natl. Acad. Sci. U. S. A. 108, 6933-6938.

(18) Ling, J., Roy, H., and Ibba, M. (2007) Mechanism of tRNA-dependent editing in translational quality control, Proc. Natl. Acad. Sci. U. S. A. 104, 72-77.

(19) Zhang, Z., Shah, B., and Bondarenko, P. V. (2013) G/U and certain wobble position mismatches as possible main causes of amino acid misincorporations, Biochemistry 52, 8165-8176.

(20) Borisov, O. V., Alvarez, M., Carroll, J. A., and Brown, P. W. (2015) Sequence Variants and Sequence Variant Analysis in Biotherapeutic Proteins, In State-of-the-Art and Emerging Technologies for Therapeutic Monoclonal Antibody Characterization Volume 2. Biopharmaceutical Characterization: The NISTmAb Case Study, pp 63-117, American Chemical Society.

(21) Harris, R. P., and Kilby, P. M. (2014) Amino acid misincorporation in recombinant biopharmaceutical products, Curr. Opin. Biotechnol. 30, 45-50.

(22) Netzer, N., Goodenbour, J. M., David, A., Dittmar, K. A., Jones, R. B., Schneider, J. R., Boone, D., Eves, E. M., Rosner, M. R., Gibbs, J. S., Embry, A., Dolan, B., Das, S., Hickman, H. D., Berglund, P., Bennink, J. R., Yewdell, J. W., and Pan, T. (2009) Innate immune and chemically triggered oxidative stress modifies translational fidelity, Nature 462, 522-526.

(23) Kramer, E. B., Vallabhaneni, H., Mayer, L. M., and Farabaugh, P. J. (2010) A comprehensive analysis of translational missense errors in the yeast Saccharomyces cerevisiae, RNA 16, 1797-1808.

(24) Ling, J., and Soll, D. (2010) Severe oxidative stress induces protein mistranslation through impairment of an aminoacyl-tRNA synthetase editing site, Proc. Natl. Acad. Sci. U. S. A. 107, 4028-4033.

(25) Feeney, L., Carvalhal, V., Yu, X. C., Chan, B., Michels, D. A., Wang, Y. J., Shen, A., Ressl, J., Dusel, B., and Laird, M. W. (2013) Eliminating tyrosine sequence variants in CHO cell lines producing recombinant monoclonal antibodies, Biotechnol. Bioeng. 110, 1087-1097.

(26) Khetan, A., Huang, Y. M., Dolnikova, J., Pederson, N. E., Wen, D., Yusuf-Makagiansar, H., Chen, P., and Ryll, T. (2010) Control of misincorporation of serine for asparagine during antibody production using CHO cells, Biotechnol. Bioeng. 107, 116-123.

(27) Pan, T. (2013) Adaptive translation as a mechanism of stress response and adaptation, Annu. Rev. Genet. 47, 121-137.

(28) Ribas de Pouplana, L., Santos, M. A., Zhu, J. H., Farabaugh, P. J., and Javid, B. (2014) Protein mistranslation: friend or foe?, Trends Biochem. Sci. 39, 355-362.

(29) Bullwinkle, T. J., and Ibba, M. (2016) Translation quality control is critical for bacterial responses to amino acid stress, Proc. Natl. Acad. Sci. U. S. A. 113, 2252-2257.

(30) Hutterer, K. M., Zhang, Z., Michaels, M. L., Belouski, E., Hong, R. W., Shah, B., Berge, M., Barkhordarian, H., Le, E., Smith, S., Winters, D., Abroson, F., Hecht, R., and Liu, J. (2012) Targeted codon optimization improves translational fidelity for an Fc fusion protein, Biotechnol. Bioeng. 109, 2770-2777.

(31) Rasmussen, B., Davis, R., Thomas, J., and Reddy, P. (1998) Isolation, characterization and recombinant protein expression in Veggie-CHO: A serum-free CHO host cell line, Cytotechnology 28, 31-42.

(32) Ren, D., Pipes, G. D., Liu, D., Shih, L. Y., Nichols, A. C., Treuheit, M. J., Brems, D. N., and Bondarenko, P. V. (2009) An improved trypsin digestion method minimizes digestion-induced modifications on proteins, Anal. Biochem. 392, 12-21.

(33) Zhang, Z. (2012) Automated precursor ion exclusion during LC-MS/MS data acquisition for optimal ion identification, J. Am. Soc. Mass Spectrom. 23, 1400-1407.

(34) Zhang, Z. (2009) Large-scale identification and quantification of covalent modifications in therapeutic proteins, Anal. Chem. 81, 8354-8364.

(35) Manickam, N., Nag, N., Abbasi, A., Patel, K., and Farabaugh, P. J. (2014) Studies of translational misreading in vivo show that the ribosome very efficiently discriminates against most potential errors, RNA 20, 9-15.

(36) Caserta, E., Liu, L.-C., Grundy, F. J., and Henkin, T. M. (2015) Codon-anticodon recognition in the Bacillus subtilis glyQS T box riboswitch: RNA-dependent codon selection outside the ribosome, J. Biol. Chem. 290, 23336-23347.

(37) Pan, B., and Sundaralingam, M. (1999) Mismatched base pairing in RNA crystal structures, Int. J. Quantum Chem. 75, 275-287.

(38) Raina, M., Moghal, A., Kano, A., Jerums, M., Schnier, P. D., Luo, S., Deshpande, R., Bondarenko, P. V., Lin, H., and Ibba, M. (2014) Reduced amino acid specificity of mammalian tyrosyl-tRNA synthetase is associated with elevated mistranslation of Tyr codons, J. Biol. Chem. 289, 17780-17790.

(39) Yu, X. C., Borisov, O. V., Alvarez, M., Michels, D. A., Wang, Y. J., and Ling, V. (2009) Identification of codon-specific serine to asparagine mistranslation in recombinant monoclonal antibodies by high-resolution mass spectrometry, Anal. Chem. 81, 9282-9290.

(40) Carrillo-Cocom, L. M., Genel-Rey, T., Araiz-Hernandez, D., Lopez-Pacheco, F., Lopez-Meza, J., Rocha-Pizana, M. R., Ramirez-Medrano, A., and Alvarez, M. M. (2015) Amino acid consumption in naive and recombinant CHO cell cultures: producers of a monoclonal antibody, Cytotechnology 67, 809-820.

(41) Ahn, W. S., and Antoniewicz, M. R. (2013) Parallel labeling experiments with [1,2-13C]glucose and [U-13C]glutamine provide new insights into CHO cell metabolism, Metab. Eng. 15, 34-47.

(42) Altamirano, C., Paredes, C., Cairó, J. J., and Gòdia, F. (2000) Improvement of CHO Cell Culture Medium Formulation: Simultaneous Substitution of Glucose and Glutamine, Biotechnol. Progr. 16, 69-75.

(43) Rozov, A., Westhof, E., Yusupov, M., and Yusupova, G. (2016) The ribosome prohibits the G*U wobble geometry at the first position of the codon-anticodon helix, Nucleic Acids Res. 44, 6434-6441.

(44) Varani, G., and McClain, W. H. (2000) The G x U wobble base pair. A fundamental building block of RNA structure crucial to RNA function in diverse biological systems, EMBO Rep. 1, 18-23.

(45) Westhof, E. (2014) Isostericity and tautomerism of base pairs in nucleic acids, FEBS Lett. 588, 2464-2469.

(46) Rozov, A., Demeshkina, N., Westhof, E., Yusupov, M., and Yusupova, G. (2015) Structural insights into the translational infidelity mechanism, Nat. Commun. 6, 7251.

(47) Ren, D., Zhang, J., Pritchett, R., Liu, H., Kyauk, J., Luo, J., and Amanullah, A. (2011) Detection and identification of a serine to arginine sequence variant in a therapeutic monoclonal antibody, J. Chromatogr. B 879, 2877-2884.

(48) Svidritskiy, E., and Korostelev, A. A. (2015) Ribosome Structure Reveals Preservation of Active Sites in the Presence of a P-Site Wobble Mismatch, Structure 23, 2155-2161.

(49) Agris, P. F., Vendeix, F. A. P., and Graham, W. D. (2007) tRNA’s Wobble Decoding of the Genome: 40 Years of Modification, J. Mol. Biol. 366, 1-13.

(50) Agris, P. F. (2004) Decoding the genome: a modified view, Nucleic Acids Res. 32, 223-238.

(51) Kent, W. J., Sugnet, C. W., Furey, T. S., Roskin, K. M., Pringle, T. H., Zahler, A. M., and Haussler, D. (2002) The human genome browser at UCSC, Genome Res. 12, 996-1006.

(52) Leontis, N. B., Stombaugh, J., and Westhof, E. (2002) The non-Watson–Crick base pairs and their associated isostericity matrices, Nucleic Acids Res. 30, 3497-3531.

(53) SantaLucia, J., Jr., Kierzek, R., and Turner, D. H. (1991) Stabilities of consecutive A.C, C.C, G.G, U.C, and U.U mismatches in RNA internal loops: Evidence for stable hydrogen-bonded U.U and C.C+ pairs, Biochemistry 30, 8242-8251.

(54) Joshi, K., Bhatt, M. J., and Farabaugh, P. J. (2018) Codon-specific effects of tRNA anticodon loop modifications on translational misreading errors in the yeast Saccharomyces cerevisiae, Nucleic Acids Res. doi: 10.1093/nar/gky664.

(55) Allner, O., and Nilsson, L. (2011) Nucleotide modifications and tRNA anticodon-mRNA codon interactions on the ribosome, RNA 17, 2177-2188.

(56) Das, M., Vargas-Rodriguez, O., Goto, Y., Suga, H., and Musier-Forsyth, K. (2014) Distinct tRNA recognition strategies used by a homologous family of editing domains prevent mistranslation, Nucleic Acids Res. 42, 3943-3953.

(57) Senger, B., Auxilien, S., Englisch, U., Cramer, F., and Fasiolo, F. (1997) The Modified Wobble Base Inosine in Yeast tRNAIle Is a Positive Determinant for Aminoacylation by Isoleucyl-tRNA Synthetase, Biochemistry 36, 8269-8275.

(58) Wiltrout, E., Goodenbour, J. M., Frechin, M., and Pan, T. (2012) Misacylation of tRNA with methionine in Saccharomyces cerevisiae, Nucleic Acids Res. 40, 10494-10506.

(59) Wen, D., Vecchi, M. M., Gu, S., Su, L., Dolnikova, J., Huang, Y. M., Foley, S. F., Garber, E., Pederson, N., and Meier, W. (2009) Discovery and investigation of misincorporation of serine at asparagine positions in recombinant proteins expressed in Chinese hamster ovary cells, J. Biol. Chem. 284, 32686-32694.

(60) Demeshkina, N., Jenner, L., Westhof, E., Yusupov, M., and Yusupova, G. (2012) A new understanding of the decoding principle on the ribosome, Nature 484, 256-259.

(61) Freist, W., Sternbach, H., and Cramer, F. (1996) Phenylalanyl-tRNA synthetase from yeast and its discrimination of 19 amino acids in aminoacylation of tRNA(Phe)-C-C-A and tRNA(Phe)-C-C-A(3'NH2), Eur. J. Biochem. 240, 526-531.

(62) Jakubowski, H., and Fersht, A. R. (1981) Alternative pathways for editing non-cognate amino acids by aminoacyl-tRNA synthetases, Nucleic Acids Res. 9, 3105-3117.

(63) McClendon, C. L., Vaidehi, N., Kam, V. W. T., Zhang, D., and Goddard, W. A., III (2006) Fidelity of seryl-tRNA synthetase to binding of natural amino acids from HierDock first principles computations, Protein Eng. Des. Sel. 19, 195-203.

(64) Manickam, N., Joshi, K., Bhatt, M. J., and Farabaugh, P. J. (2016) Effects of tRNA modification on translational accuracy depend on intrinsic codon–anticodon strength, Nucleic Acids Res. 44, 1871-1881.

Amino Acid Misincorporation Propensities Revealed Through Systematic Amino Acid Starvation

H. Edward Wong, Chung-Jr Huang, Zhongqi Zhang

For Table of Contents Use Only

[Table of contents graphic: G/U mismatch during codon recognition. The Ile anticodon (UAG) of Ile-tRNAIle is shown paired with the Val codon GUC on the mRNA (5’ to 3’), giving Val → Ile.]