Increasing Protein Production Rates Can ... - ACS Publications

Jun 26, 2017 - Ajeet K. Sharma and Edward. P. O'Brien. Department of Chemistry, Pennsylvania State University, University Park, Pennsylvania 16802, Un...

Article

Increasing Protein Production Rates Can Decrease the Rate at Which Functional Protein Is Produced and Their Steady-State Levels Ajeet K. Sharma, and Edward P. O'Brien J. Phys. Chem. B, Just Accepted Manuscript • DOI: 10.1021/acs.jpcb.7b01700 • Publication Date (Web): 26 Jun 2017 Downloaded from http://pubs.acs.org on June 30, 2017


The Journal of Physical Chemistry B is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


The Journal of Physical Chemistry

Increasing Protein Production Rates Can Decrease the Rate at Which Functional Protein Is Produced and Their Steady-State Levels

Ajeet K. Sharma and Edward P. O’Brien∗

Department of Chemistry, Pennsylvania State University, University Park, Pennsylvania 16802, USA

Abstract

The rate at which soluble, functional protein is produced by the ribosome has recently been found to vary in complex and unexplained ways as various translation-associated rates are altered through synonymous codon substitutions. To understand this phenomenon, we combine a well-established ribosome-traffic model with a master-equation model of co-translational domain folding to explore the scenarios that are possible for the protein production rate, J, and the functional nascent-protein production rate, F, as the rates of various translation processes are altered for five different E. coli proteins. We find that while J monotonically increases as the rates of translation-initiation, -elongation and -termination increase, F can either increase or decrease. We show that F’s non-monotonic behavior arises within the model from two opposing trends: the tendency for increased translation rates to produce more total protein but less co-translationally folded protein. We further demonstrate that under certain conditions these non-monotonic changes in F can result in non-monotonic variations in post-translational, steady-state levels of functional protein. These results provide a potential explanation for recent experimental observations that the specific activity of enzymatic proteins can decrease with increased synthesis rates. Additionally, our model has the potential to be used to rationally design transcripts that maximize the production of functional nascent protein by simultaneously optimizing translation-initiation, -elongation and -termination rates.



Corresponding E mail: [email protected]


Introduction

Translation of an mRNA molecule proceeds through three sequential processes: initiation, elongation and termination [1,2]. During initiation a ribosome assembles at the start codon of a transcript. In the elongation phase the ribosome slides along the transcript and synthesizes the protein molecule. In the termination phase the fully synthesized protein is released after the ribosome has encountered a stop codon (Fig. 1). The rate at which each of these processes occurs can be altered through the introduction of synonymous codon mutations into a transcript [3]. Synonymous mutations change the chemical structure of an mRNA molecule but not the encoded protein’s primary structure. Over the past several decades bioengineers have utilized such synonymous mutations to create transcripts that attempt to maximize the amount of protein produced [4,5], while biologists have tried to understand the evolutionary origins of the bias in synonymous codon usage between different organisms [6,7]. More recently, altering in vivo translation-elongation rates has been found to influence the structure and function of proteins over their lifetime, thereby affecting cellular processes and the phenotype of organisms [8–12]. For example, in some cases synonymous replacement of rare codons with optimal ones decreased the steady-state level of functional protein [9,13], while in other cases such mutations have had the opposite effect [4,14], presumably due to changes in co-translational folding. This growing appreciation for the importance of translation kinetics to nascent protein behavior [15,16] raises a number of fundamental questions, including whether increasing protein production rates increases the rate of functional nascent-protein production, which class of translation rates (initiation, elongation or termination) has the greatest impact on a protein’s structure and function, and what this might tell us about mRNA sequence evolution.
A functional nascent protein is a protein that has just completed synthesis and has one or more of its biologically active domains folded and capable of functioning. This is in contrast to a functional mature protein, which has long since been synthesized and is functional. A non-functional nascent protein can post-translationally convert into a functional protein provided it does not aggregate or get degraded first. Thus, the rate of functional nascent-protein production F is equal to the rate at which domains (that do not require post-translational modifications) co-translationally fold by the time the protein’s synthesis is completed. Several studies have examined the effect of changing translation rates on steady-state levels of functional protein [17,18]. Despite this, no theoretical model has been developed that can describe the influence of translation-initiation, -elongation and -termination rates on the process of co-translational domain folding and can reflect, within a single model, the broad range of observed behaviors. Here, we do this by combining the well-established Totally Asymmetric Simple Exclusion Process (TASEP) model [19–21] with a master-equation description of co-translational domain folding to simultaneously account for these various rates associated with translation and to address these fundamental questions. We show that a non-monotonic variation in F can occur when the nascent-protein production rate J is increased, and that this can result in a non-monotonic variation in the steady-state concentration of functional protein when post-translational processes are also included in the model. These results, along with the analytical expressions of J and F that we provide, can be used to rationally design mRNA sequences that maximize the production of functional nascent protein.

Methods

Analytical expression for the rate of functional nascent-protein production, F. We provide an analytical expression for the functional nascent-protein production rate F using a combination of the ℓ-TASEP model [22] with a master-equation description of co-translational domain folding. The ℓ-TASEP model provides a mathematical expression for the rate of protein production J, whereas solving a master equation allows us to calculate the probability of co-translational domain folding. ℓ-TASEP is one of the most realistic, analytically tractable members of the TASEP family of models that describes ribosome movement along a transcript during protein synthesis [22]. In the ℓ-TASEP model a ribosome covers ℓ successive codon positions of an mRNA molecule. Initiation events occur with intrinsic rate α provided the first five codon positions after the start codon of the mRNA sequence are not occupied by another ribosome (Fig. 1). The ribosome then undergoes unidirectional stochastic steps one codon at a time along the transcript, elongating the nascent chain one amino acid at a time. Each such step occurs with intrinsic rate ω(j), when the ribosomal A-site is at the jth codon position, provided no downstream ribosome blocks the next codon position. When the ribosome encounters a stop codon this elongation process is terminated with rate β. Ribosome traffic on an mRNA transcript in the simple homogeneous ℓ-TASEP model (i.e., uniform codon translation rates) is categorized into low density (L.D.), high density (H.D.), or maximal current (M.C.) regimes [22] (Fig. 2A). The L.D. phase is characterized by a low density of ribosomes on the transcript, while the H.D. phase corresponds to a high density of ribosomes. (That is, a low or high ρ, respectively.) The M.C. phase has an intermediate density of ribosomes (specifically, ρ = 0.076 when ℓ = 10) and also maximizes the rate of protein production [22].
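The three traffic regimes described above can be sketched numerically. The snippet below is a minimal illustration rather than the authors' code: it assumes the standard mean-field expressions of the homogeneous ℓ-TASEP (the kind of expressions summarized in Fig. 2A, which is not reproduced in this text), with `alpha`, `beta` and `omega` standing for the initiation, termination and uniform elongation rates. The function name and the convention used on the α = β coexistence line are hypothetical choices made only for this example.

```python
import math

def tasep_mean_field(alpha, beta, omega, ell=10):
    """Mean-field regime, protein production rate J and ribosome density rho
    (ribosomes per codon) for a homogeneous l-TASEP. Illustrative sketch only."""
    crit = omega / (1.0 + math.sqrt(ell))  # boundary of the maximal-current phase
    if alpha < crit and alpha < beta:
        # Low density: initiation is rate limiting
        regime = "LD"
        J = alpha * (omega - alpha) / (omega + alpha * (ell - 1))
        rho = alpha / (omega + alpha * (ell - 1))
    elif beta < crit and beta <= alpha:
        # High density: termination is rate limiting
        # (alpha == beta is assigned to HD here purely by convention)
        regime = "HD"
        J = beta * (omega - beta) / (omega + beta * (ell - 1))
        rho = (1.0 - beta / omega) / ell
    else:
        # Maximal current: elongation is rate limiting
        regime = "MC"
        J = omega / (1.0 + math.sqrt(ell)) ** 2
        rho = 1.0 / (math.sqrt(ell) * (1.0 + math.sqrt(ell)))
    return regime, J, rho
```

With ℓ = 10 the M.C. branch gives a density of about 0.076 ribosomes per codon, which agrees with the value quoted in the text, and J is continuous across the L.D./M.C. boundary at α = ω/(1 + √ℓ).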
The mean-field expressions [22] for the steady-state rate of protein production, J, and ribosome density on the transcript, ρ, for these three regimes are reported in Fig. 2A. The protein production rate J is the rate at which fully synthesized proteins are released from copies of an mRNA transcript, whereas ρ is the average number of ribosomes actively translating a transcript divided by the number of codons in that transcript’s coding sequence. Recently, we derived an exact solution to a master equation that describes the steady-state probability $P_N(j)$ that a domain will populate its folded state (N) at nascent-chain length j during translation elongation [23]. We implement it here and decrease the number of free parameters by invoking four assumptions whose justifications are provided in Ref. 15. These assumptions simplify the model’s calculations and allow us to derive a mathematical expression for F. The first assumption is that the ribosome translates the mRNA sequence at a uniform intrinsic translation-elongation rate ω. (Note that the homogeneous ℓ-TASEP model also makes this assumption.) Second, a domain within a nascent protein cannot fold before it emerges from the ribosome exit tunnel. Third, after emerging from the ribosome exit tunnel, the folding kinetics of the protein domain are bulk-like and remain independent of the nascent-chain length. Finally, the domain is assumed to only populate an unfolded (U) or folded state (N). Under these assumptions

$$P_N(j) = \frac{k_{UN}^{\mathrm{bulk}}}{k_{UN}^{\mathrm{bulk}} + k_{NU}^{\mathrm{bulk}}}\left[1 - \left(\frac{\omega_{\mathrm{eff}}}{k_{UN}^{\mathrm{bulk}} + k_{NU}^{\mathrm{bulk}} + \omega_{\mathrm{eff}}}\right)^{j - n_D - n_T}\right] \qquad [1]$$

for $j \ge n_D + n_T$; otherwise $P_N(j) = 0$. In Eq. 1, $k_{UN}^{\mathrm{bulk}}$ and $k_{NU}^{\mathrm{bulk}}$ are, respectively, the rates of interconversion of the domain off the ribosome from its unfolded to folded state, and from the folded to the unfolded state. $n_D$ is the codon position of the most C-terminal residue of the protein domain of interest (Fig. 2B) and $n_T$ is the number of residues that can fit inside the ribosome exit tunnel (Fig. 2C). The intrinsic codon translation rate ω has been subsumed in Eq. 1 into an effective codon translation rate $\omega_{\mathrm{eff}} \equiv \omega c$ [24,25]. $\omega_{\mathrm{eff}}$ accounts for the increase in the time it takes a ribosome to translate a codon due to the excluded-volume interactions between neighboring ribosomes on a transcript. For a transcript with ribosome density ρ,

$$c = \frac{1 - \rho\ell}{1 + \rho - \rho\ell}$$

is the probability that there is no ribosome at codon position (j + ℓ) if j is the most 5′ codon position occupied by a ribosome. We combine the main result of ℓ-TASEP (i.e., the expressions for J, Fig. 2A) with the main result of the master-equation model (i.e., Eq. 1) to calculate the rate, F, of synthesis of co-translationally folded protein domains as

$$F = J\,P_N(j = \mathrm{stop}), \qquad [2]$$

where $P_N(j = \mathrm{stop})$ is the steady-state probability that a domain will populate its folded state at the stop codon of the transcript. F is equal to the rate of functional nascent-protein production for cytosolic proteins that do not require post-translational modifications. Note that J and ρ are both functions of α, β and ω, while $P_N(j = \mathrm{stop})$ is a function of α, β and ω through $\omega_{\mathrm{eff}}$.
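To see how Eqs. 1 and 2 combine in practice, the following sketch evaluates the functional nascent-protein production rate in the L.D. regime. It is an illustration under stated assumptions, not the authors' implementation: the L.D. expressions for J and ρ are the standard mean-field ℓ-TASEP forms, the exponent of Eq. 1 is taken as j − n_D − n_T, and every numerical value in the usage note (folding rates, domain boundary, transcript length) is a hypothetical placeholder rather than a parameter of the five E. coli proteins.

```python
def omega_eff(omega, rho, ell=10):
    """Effective codon translation rate omega * c, where
    c = (1 - rho*ell) / (1 + rho - rho*ell)."""
    c = (1.0 - rho * ell) / (1.0 + rho - rho * ell)
    return omega * c

def p_folded(j, w_eff, k_fold, k_unfold, n_d, n_t=30):
    """Eq. 1: probability that the domain is folded at nascent-chain length j."""
    if j < n_d + n_t:
        return 0.0  # the domain has not yet emerged from the exit tunnel
    p_eq = k_fold / (k_fold + k_unfold)
    ratio = w_eff / (k_fold + k_unfold + w_eff)
    return p_eq * (1.0 - ratio ** (j - n_d - n_t))

def functional_rate_ld(alpha, omega, k_fold, k_unfold, n_d, j_stop, ell=10, n_t=30):
    """Eq. 2, F = J * P_N(j = stop), using L.D. mean-field J and rho.
    Assumes omega > alpha so that the L.D. expressions are meaningful."""
    denom = omega + alpha * (ell - 1)
    J = alpha * (omega - alpha) / denom
    rho = alpha / denom
    return J * p_folded(j_stop, omega_eff(omega, rho, ell), k_fold, k_unfold, n_d, n_t)
```

For example, with the hypothetical values `alpha=0.083`, `k_fold=0.05`, `k_unfold=0.0005`, `n_d=150` and `j_stop=400`, evaluating `functional_rate_ld` at ω = 1, 4 and 30 s⁻¹ gives a value at ω = 4 that exceeds both of the others, which is the kind of turnover in F that the text goes on to describe.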

Proteins modeled and their parameters. To illustrate the different scenarios that can arise for J and F, we applied Eq. 2 to five different two-domain E. coli cytosolic proteins translated from genes DHAS_ECOLI, E4PD_ECOLI, G3P1_ECOLI, SYY_ECOLI and PGK_ECOLI (Fig. 3). These five proteins were chosen because their N-terminal domain folding time is similar to the protein synthesis time, suggesting that the folding of these domains might be susceptible to changes in codon translation speed. Specifically, we compared the folding times of the N-terminal domains of all two-domain E. coli proteins with the average time a ribosome would take to synthesize the rest of the protein after the N-terminal domain. The mean folding times of the N-terminal domains were taken from Ref. 16, while the synthesis time was estimated as the sum of codon translation times reported in Ref. 16 for all 61 sense codons [26]. While we examined five here, the total number of two-domain E. coli proteins showing such a ratio of timescales is much larger. We computed J and F for these five E. coli proteins by varying the codon translation rate ω between the physiologically relevant values of 0.001 and 30.000 s⁻¹ [8,15]. Here we assume that each of the codons in these five mRNA transcripts is translated with a uniform codon translation rate. Later, we remove this restriction and model codon-specific translation rates in a more realistic manner. To create the L.D. and H.D. regimes, respectively, we varied α and β between 0.001 and 2.000 s⁻¹ [27,28] and between 0.001 and 5.000 s⁻¹. Note that the mathematical expression of F (Eq. 2) in the L.D. regime does not depend on β; therefore we did not assign a numerical value to β in this regime. Similarly, we do not assign a numerical value to α in the H.D. regime. $k_{UN}^{\mathrm{bulk}}$ and $k_{NU}^{\mathrm{bulk}}$ for the N-terminal domain of these proteins were taken from the protein database reported in Ref. 16, which were calculated using the PREFUR algorithm [29]. $k_{UN}^{\mathrm{bulk}}$, $k_{NU}^{\mathrm{bulk}}$ and the codon position of the most C-terminal residue of the N-terminal domain ($n_D$) for these five proteins are listed in Table S1. $n_T$, the number of residues that can fit into the ribosome exit tunnel, is 30 [30]. Ribosome profiling experiments [31] have demonstrated that a ribosome typically protects around 28 to 30 nucleotides from nuclease digestion; therefore ℓ was set to 10 codons.

Calculating J and F numerically for a more realistic model. The four assumptions invoked in the derivation of Eq. 2 simplified the model and allowed us to derive an analytical solution for F. It is possible, however, that our conclusions are only valid when all the model’s assumptions are met. Therefore we tested the robustness of our main results by violating the first three assumptions used in the derivation of Eq. 2. (We also violate the fourth assumption later in the Results section.) We violated the first assumption by using the non-uniform codon translation rates estimated for E. coli (see Table S1 in Ref. 16). To violate the second and third assumptions we used a previously developed model [16] that estimates the domain folding and unfolding rates as a function of nascent-chain length during synthesis.
This model allows the most N-terminal domain to fold inside the ribosome exit tunnel, which violates the second assumption of Eq. 2, while its position-dependent folding and unfolding rates violate the third assumption. We simulated N-terminal domain folding using a Monte-Carlo algorithm [32] and thereby calculated J and F. Translation was first allowed to reach steady state in the simulations, and F was calculated by counting the number of simulated termination events in which the N-terminal domain was in the folded state and dividing this number by the total time over which the translation process was simulated. Similarly, J was calculated by counting the average number of termination events per second. In these numerical simulations we set α = 0.083 s⁻¹ [27] and β = 35.000 s⁻¹ [28] to simulate translation in the L.D.-like regime, which is expected to be the dominant regime in vivo [33]. We varied the average translation speed in the simulations by multiplying the translation rates of each codon from Ref. 16 by a constant ranging between 0.015 and 2. This allowed us to change the average speed of elongation without affecting the relative variation in individual codon translation rates. To simulate translation in an H.D.-like regime we set α = 1.0 s⁻¹ and let β vary between 0.05 and 3.00 s⁻¹.

Calculating the steady-state concentration of functional protein. The effect of the translation-initiation, -elongation and -termination rates on F might also influence the steady-state concentration of folded, functional proteins [8]. The steady-state concentration of functional protein can be critical in influencing the phenotype of an organism. Therefore, to understand how the rates at which folded (i.e., functional) and unfolded proteins are produced affect the steady-state concentration of folded protein, we extended the kinetic model used to derive Eq. 1 to include the post-translational processes of folding, aggregation and degradation [34] (Fig. 1). In Fig. 1, $J_U$ and $J_N$ are the rates, in units of nM s⁻¹, at which a given protein is released in the unfolded or folded states, respectively. We numerically calculate the rate of release of unfolded (= J − F) and folded (= F) protein molecules per mRNA molecule from numerical simulations, which we convert into the concentration-dependent rates $J_U$ and $J_N$, respectively. To do this we assumed the typical E. coli cell volume of 4 fL [35]. With this volume, one protein molecule contributes 0.4 nM to the total cellular protein concentration. For each protein we assumed that only a single copy of its transcript is being actively translated, which is typical in E. coli [36]. The unfolded domain of the released protein can either interconvert to the folded state, aggregate or degrade with rates $k_{UN}^{\mathrm{bulk}}$, $k_{agg}^{\mathrm{obs}}$ and $k_{UD}$, respectively (Fig. 1). Similarly, folded proteins can either unfold with rate $k_{NU}^{\mathrm{bulk}}$ or degrade with rate $k_{ND}$. The aggregated protein can also degrade with rate $k_{AD}$. In this model $k_{UN}^{\mathrm{bulk}}$ and $k_{NU}^{\mathrm{bulk}}$ are the same as their co-translational counterparts. The degradation rates for the unfolded, folded and aggregated proteins are 8× 10&L, 4× 10&L and 4× 10&L s⁻¹ [37]. We varied the aggregation rate between 5 × 10⁻¹ and 5 × 10&L nM⁻¹ s⁻¹ [38]. In this model, the aggregation rate constant $k_{agg}^{\mathrm{obs}}$ is a pseudo-first-order rate constant that depends upon the concentration of the given protein in the unfolded state, i.e., $k_{agg}^{\mathrm{obs}} = k_{agg}[U]$. The time evolution of the protein concentrations in the unfolded, folded, and aggregated states (denoted by [U], [N] and [A], respectively) for the kinetic scheme shown in Fig. 1 can be determined by the following rate equations:

$$\frac{d[U]}{dt} = J_U - [U]\left(k_{UN}^{\mathrm{bulk}} + k_{agg}^{\mathrm{obs}} + k_{UD}\right) + [N]\,k_{NU}^{\mathrm{bulk}} \qquad [3]$$

$$\frac{d[N]}{dt} = J_N + [U]\,k_{UN}^{\mathrm{bulk}} - [N]\left(k_{NU}^{\mathrm{bulk}} + k_{ND}\right) \qquad [4]$$

$$\frac{d[A]}{dt} = [U]\,k_{agg}^{\mathrm{obs}} - [A]\,k_{AD} \qquad [5]$$

We solved Eqs. 3-5 numerically and calculated the steady-state concentration of functional protein, denoted by $[N]_{ss}$.
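The post-translational scheme of Eqs. 3-5 can be explored with a short script. The sketch below is not the authors' procedure (the text states the equations were solved numerically); it instead uses the observation that, with the pseudo-first-order aggregation rate written as k_agg[U], setting the three time derivatives to zero reduces to a quadratic in the unfolded concentration. The function and argument names are hypothetical labels for the release rates and rate constants described above, and any numbers passed in are placeholders.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole

def molecules_to_nM(n_molecules, volume_fL=4.0):
    """Concentration in nM of n_molecules inside a cell of the given volume."""
    litres = volume_fL * 1e-15
    return n_molecules / (AVOGADRO * litres) * 1e9

def rhs(U, N, A, J_U, J_N, k_UN, k_NU, k_agg, k_UD, k_ND, k_AD):
    """Right-hand sides of Eqs. 3-5 with k_agg_obs = k_agg * [U]."""
    k_obs = k_agg * U
    dU = J_U - U * (k_UN + k_obs + k_UD) + N * k_NU
    dN = J_N + U * k_UN - N * (k_NU + k_ND)
    dA = U * k_obs - A * k_AD
    return dU, dN, dA

def steady_state(J_U, J_N, k_UN, k_NU, k_agg, k_UD, k_ND, k_AD):
    """Steady-state ([U], [N], [A]); assumes k_agg, k_ND and k_AD are positive."""
    r = k_NU / (k_NU + k_ND)           # fraction of folded protein that unfolds
    a = k_agg                          # quadratic coefficients for [U]
    b = k_UN * (1.0 - r) + k_UD
    c = -(J_U + r * J_N)
    U = (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)  # non-negative root
    N = (J_N + k_UN * U) / (k_NU + k_ND)
    A = k_agg * U * U / k_AD
    return U, N, A
```

As a consistency check on the unit conversion quoted in the text, `molecules_to_nM(1)` evaluates to roughly 0.4 nM for a 4 fL cell. Plugging the output of `steady_state` back into `rhs` should return values that are zero to within floating-point error.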

Results

Increasing protein production rates can decrease the rate of functional nascent-protein production. Using Eq. 2 we address the first question: does increasing the rate of protein production J always increase the rate of functional nascent-protein production F? We answer this question by applying the first-derivative test for monotonicity to F with respect to α, β and ω. We take the derivative of J and F in each regime separately and report the results in the Supporting Information (SI). We find that the derivatives of J with respect to α, β and ω are always positive or zero (Eqs. S1-S3), meaning that J monotonically increases or is constant with increases in α, β, or ω. On the other hand, F exhibits non-monotonic behavior in the L.D. and H.D. phases (Eqs. S10-S14), that is, its derivatives can be positive or negative, and monotonic behavior in the M.C. phase (Eq. S4). These results, summarized in Table S2, mean that there are scenarios in our model in which increasing protein production rates can decrease the rate at which folded, functional nascent proteins are produced. To illustrate these different scenarios in real proteins, and gain insight into their molecular origin, we apply Eq. 2 to five two-domain E. coli proteins whose details are provided in the Methods section. Specifically, we focus on their N-terminal domain folding. We find that increasing the translation-elongation rate ω increases the rate of protein production J in the L.D. regime (data not shown). However, increasing the elongation rate ω gives rise to non-monotonic variation in F for these five proteins (Fig. 4A). Similarly, plotting F versus J reveals that above a certain value of J, increasing the elongation rate further has the effect of increasing J but decreasing F (Fig. 4B). The reason for this is that at those higher translation-elongation rates the protein has less time to fold on the ribosome, as illustrated by the decrease in $P_N(j = \mathrm{stop})$ with increasing ω (Fig. S1(A)). These two opposing effects, speeding up elongation while decreasing the folding probability of the domain, give rise to the non-monotonic variation in the rate of production of the functional protein. Increasing α in the L.D. regime, however, results in a monotonic increase of both J and F (Fig. S2).
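The two opposing effects can be made explicit by differentiating Eq. 2 with the product rule. The lines below are a sketch of that step, assuming the notation F = J P_N(j = stop) for the functional nascent-protein production rate and ω for the intrinsic elongation rate; the symbol ω* for the turnover point is introduced here only for illustration and is not a quantity reported in the text.

```latex
\frac{\partial F}{\partial \omega}
  = \underbrace{P_N(j=\mathrm{stop})\,\frac{\partial J}{\partial \omega}}_{\ge 0\ \text{(more protein released per unit time)}}
  \;+\;
  \underbrace{J\,\frac{\partial P_N(j=\mathrm{stop})}{\partial \omega}}_{\le 0\ \text{in the L.D. regime (less time to fold)}}

% A turnover in F occurs at the elongation rate \omega^{*} where the two terms cancel:
P_N(j=\mathrm{stop})\,\left.\frac{\partial J}{\partial \omega}\right|_{\omega^{*}}
  = -\,J\,\left.\frac{\partial P_N(j=\mathrm{stop})}{\partial \omega}\right|_{\omega^{*}}
```

Below ω* the gain in total protein dominates and F rises; above it the loss in folding probability dominates and F falls, even though J keeps increasing.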
The reason F does not exhibit a turnover in this case is that an increase in α increases the ribosome density on the mRNA transcript (Fig. 2A). This decreases the effective translation-elongation rate $\omega_{\mathrm{eff}}$ due to frequent ribosome traffic jams on the transcripts (Fig. 1), thus providing more time for domain folding at each codon position (Fig. S1(B)). Therefore, both J and $P_N(j = \mathrm{stop})$ increase with increasing α (Fig. S2), resulting in a monotonic increase in F. (Note that Fig. S2(A) exhibits a regime change to the M.C. phase when α becomes greater than $\omega/(1+\sqrt{\ell})$ (Fig. 2). In the M.C. phase, F is independent of the rate α (Fig. 2A). Therefore, any further increase in α does not increase F, as seen in Fig. S2(A).)

In the H.D. phase, the rate at which functional protein is produced depends only upon β and ω (Fig. 2A). Here, in contrast to the L.D. phase, increasing ω results in a monotonic increase in F (data not shown). The reason for this is that in the H.D. phase the upper bound of

$$\omega_{\mathrm{eff}} = \frac{\beta\ell}{1 + \frac{\beta}{\omega}(\ell - 1)}$$

is βℓ. In other words, due to the steric hindrance between ribosomes in this high-density regime, increasing the intrinsic codon translation rate ω cannot increase the effective codon translation rate above $\omega_{\mathrm{eff}} = \beta\ell$. Therefore, the probability of a domain being in the folded state at the time of nascent-chain release cannot decrease beyond a certain value even if ω continues to increase (Fig. S3(A)). Such a limited decrease in $P_N(j = \mathrm{stop})$ is not sufficient to cause a turnover in F because J, the other contributor to F, increases at a much faster rate than the decrease in $P_N(j = \mathrm{stop})$.

In the H.D. regime, increasing the termination rate β gives rise to non-monotonic variation in F (Fig. 5A). The reason for this is that increasing β in this regime decreases the density of ribosomes on the transcript (N.B., $\rho = (1 - \beta/\omega)/\ell$, Fig. 2A). This reduces the frequency of ribosome traffic jams, thereby increasing the effective translation-elongation rate $\omega_{\mathrm{eff}}$. Increasing $\omega_{\mathrm{eff}}$ decreases the probability $P_N(j = \mathrm{stop})$ of successful domain folding by the end of termination (Fig. S3(B)) because there is less time for the domain to fold during synthesis. Increases in β, however, increase J (Fig. 2A). These two competing effects are responsible for the non-monotonic variation of F with β (Fig. 5A) and of F with J (Fig. 5B). (Note that in Fig. 5A increasing β beyond the value of $\omega/(1+\sqrt{\ell})$ shifts the ribosome traffic from the H.D. regime to the M.C. regime, where the translation kinetics is independent of β (Fig. 2).)

In the M.C. phase, F depends only upon the translation-elongation rate ω (Fig. 2A). We find that increasing ω increases F monotonically (Fig. S4), but further increases in ω result in a transition into the L.D. regime, where, as described previously, increasing ω can decrease F. It is worth noting that even when α = 0.7 s⁻¹ in Fig. S4, as compared to Fig. 4 in which α = 0.083 s⁻¹, we still observe non-monotonic behavior in F in the L.D. regime. This indicates that this phenomenon can be observed even with order-of-magnitude changes in the initiation rate.

In summary, we have found that increasing the protein production rate in the L.D. and H.D. regimes by increasing ω and β, respectively, can decrease the production of functional nascent protein. However, increasing the translation-elongation rate ω in the M.C. regime monotonically increases the production of functional nascent protein.

M.C. phase maximizes the production of functional protein. To address the second question, namely which class of rates maximizes F at physiologically relevant rates, we examine the predicted behaviors in Figs. 4, 5, S2 and S4. We immediately see that to maximize F in the L.D. phase, increasing α instead of ω is best because F monotonically increases with α. On the other hand, in the H.D. regime, F increases monotonically with ω for the rates we tested (data not shown), while exhibiting non-monotonic variation with changes in β (Fig. 5). Therefore, experimentally modulating ω may be a better approach in the H.D. regime because it is not possible to overshoot the maximum in F. In the M.C. phase, ω is the only variable whose increase can raise F. Translation in the M.C. regime maximizes J, and, unlike the L.D. phase, no turnover in F occurs in the M.C. phase. Therefore, it seems likely that translation in the M.C. phase globally maximizes F, suggesting that codon optimization strategies should attempt to achieve this translational regime.
These results also suggest that evolutionary selection pressures may favor different strategies to maximize the functional protein produced, depending upon which regime translation is in.
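The ceiling on the effective codon translation rate in the H.D. regime, invoked earlier in this section, can be checked with a few lines of algebra. The derivation below is a sketch that assumes the definitions used in the Methods, namely that the effective rate is the intrinsic rate ω multiplied by the probability c that the next site is free, together with the H.D. mean-field density; it is included as a worked check and is not taken from the Supporting Information.

```latex
% H.D. mean-field density and the no-blocking probability c
\rho = \frac{1 - \beta/\omega}{\ell}
\quad\Rightarrow\quad
1 - \rho\ell = \frac{\beta}{\omega},
\qquad
1 + \rho - \rho\ell = \frac{\beta}{\omega} + \frac{1 - \beta/\omega}{\ell}
                    = \frac{1 + \frac{\beta}{\omega}(\ell - 1)}{\ell}

% Effective codon translation rate
\omega_{\mathrm{eff}} = \omega\,\frac{1 - \rho\ell}{1 + \rho - \rho\ell}
                      = \frac{\beta\ell}{1 + \frac{\beta}{\omega}(\ell - 1)}
\;\xrightarrow[\;\omega \to \infty\;]{}\; \beta\ell
```

Because the denominator is always at least 1, the effective rate never exceeds βℓ, however large the intrinsic rate ω becomes.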


A more realistic model exhibits the same scenarios. The assumptions that make our model analytically tractable are approximations. To test whether the main conclusions of this study are robust to these details we violated three out of the four assumptions underlying Eq. 2 and calculated J and F using Monte-Carlo simulations of the translation of the five two-domain E. coli proteins (see Methods). To create the L.D.-like regime we set α = 0.083 s⁻¹ [27] and β = 35.0 s⁻¹ [28], and varied the average translation-elongation rate from 0.27 to 30 s⁻¹ while simultaneously preserving the variation in individual codon translation rates (Methods). Similarly, the H.D.-like regime was created by setting α = 1.0 s⁻¹ and varying β between 0.05 and 3 s⁻¹. In this more realistic model, in which codon translation rates vary, domain folding can occur inside the exit tunnel, and co-translational domain folding rates are altered relative to bulk solution, we find that F still exhibits non-monotonic behavior with changes in ω (Fig. 6) and β (Fig. 7) in the L.D.- and H.D.-like regimes, respectively. Thus, the antagonism we have found between the rate of synthesis and folding in setting the rate of functional nascent-protein production is robust to changes in the model assumptions, suggesting that this may be a realistic scenario for nascent proteins.

In the absence of any experimentally derived knowledge of intermediate or misfolded states that can occur in these five proteins, we violated the fourth assumption of two-state domain folding by introducing an artificial off-pathway misfolded state. The N-terminal domain is defined to transition from the unfolded to this misfolded state with rate $k_{UM}(j)$, and the reverse transition occurs with rate $k_{MU}(j)$ at nascent-chain length j. We set $k_{UM}(j) = k_{UN}(j)/10.0$, and $k_{MU}(j)$ was chosen such that the equilibrium population of the misfolded state at each nascent-chain length was 2%.
We ran ℓ-TASEP simulations of this model and found that the nonmonotonicity we observed in Figs. 6 and 7 remain robust to the incorporation of misfolding (Figs. S5 and S6). The variance in codon translation rates in this model is only determined by the tRNA concentrations. Many additional factors, however, also influence these rates39–42. Therefore, the true variance in codon translation rates may be even larger. We anticipate that the impact of increasing this variance (while holding the average constant) would be to cause even more bottlenecks for ribosome traffic than we currently observe in our simulations. This would arise from an increase in the difference in translation speed between the slowest codons and the average speed. These new bottlenecks would increase the average synthesis time of the transcript and provide more time for a protein domain to fold co-translationally. The effect being that the  and values that maximize  (Figs. 4 and 5), in the L.D. and H.D. regimes, would shift towards higher values. Increasing the rate of protein production can decrease the steady-state concentration of functional protein. Most experiments that have measured the influence of translation kinetics on functional protein levels8,43,44 do so of mature proteins, not nascent proteins. The steady-state levels of these proteins are influenced by , the rates of protein aggregation and degradation, and post-translational protein folding. Increases in the translation-elongation rate  in the L.D. 9

regime or the termination rate in the H.D. regime can result in non-monotonic variation in the rate of functional nascent-protein production (Figs. 4 and 5). However, such variation might not result in non-monotonic behavior in the steady-state levels of the folded, functional protein, because a protein released in the unfolded state will eventually fold if it is not degraded or aggregated first. Therefore, we asked whether the non-monotonic behavior we observed in the rate of functional nascent-protein production can result in non-monotonic behavior in the steady-state levels of functional protein. To do this we extended our model to allow for these post-translational processes (Fig. 1) and solved for the steady-state concentration of the folded protein, $[F]_{SS}$, using physiologically relevant aggregation and degradation rates (see Methods for details). The steady-state concentrations of the functional proteins were plotted against the translation-elongation rate in the L.D.-like regime (Figs. 8(A), (C) and (E)), and against $\beta$ in the H.D.-like regime (Figs. 8(B), (D) and (F)). We find that non-monotonic behavior in the rate of functional nascent-protein production can result in non-monotonicity in $[F]_{SS}$ for these five proteins when the aggregation rate is between $5 \times 10^{-1}$ and $5 \times 10^{-2}$ $\mu$M$^{-1}$ s$^{-1}$. This behavior occurs because the increased rate of synthesis produces more unfolded nascent protein per unit time, which in turn increases the flux of nascent proteins into aggregated or degraded states that are non-functional. A further decrease in the aggregation rate to $5 \times 10^{-3}$ $\mu$M$^{-1}$ s$^{-1}$ results in a loss of non-monotonic behavior in $[F]_{SS}$, as the fraction of aggregated protein decreases significantly at those lower aggregation rates (Fig. S7). Thus, under these conditions, the steady-state level of functional protein (Fig. 8) can decrease when the rate of protein production is increased.
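
The competition described above, in which released unfolded protein can fold, aggregate or be degraded, can be sketched with a deliberately simplified steady-state calculation. This is not the authors' Eqs. 3-5, which are not reproduced in this section. The sketch assumes irreversible post-translational folding, aggregation that removes unfolded protein at a rate second order in its concentration, and a single degradation rate for both species; the function name and all parameter names are hypothetical.

```python
import math


def steady_state(j_unfolded, j_folded, k_fold, k_agg, k_deg):
    """Steady state (U, F) of a toy post-translational scheme.

    Assumed rate equations (illustrative only):
        dU/dt = j_unfolded - (k_fold + k_deg) * U - k_agg * U**2
        dF/dt = j_folded + k_fold * U - k_deg * F
    j_* are release fluxes (concentration per second); k_agg has units of
    1 / (concentration * second); k_fold and k_deg are first-order rates.
    """
    b = k_fold + k_deg
    # Positive root of k_agg*U^2 + b*U - j_unfolded = 0, written in a form
    # that stays accurate (and finite) as k_agg approaches zero.
    u = 2.0 * j_unfolded / (b + math.sqrt(b * b + 4.0 * k_agg * j_unfolded))
    # Folded protein is fed by direct release and by post-translational folding.
    f = (j_folded + k_fold * u) / k_deg
    return u, f


if __name__ == "__main__":
    # Illustrative numbers only: same release fluxes, three aggregation rates.
    for k_agg in (0.5, 0.05, 0.005):
        u, f = steady_state(0.02, 0.01, 0.1, k_agg, 0.001)
        print(f"k_agg={k_agg}: U={u:.4f}, F={f:.2f}")
```

In this toy scheme a larger aggregation rate always lowers the folded steady-state level for fixed release fluxes, which is the mechanism the paragraph above invokes; whether the level becomes non-monotonic depends on how the two release fluxes change together as translation speeds up.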

Discussion

Changing translation-initiation, -elongation or -termination rates through synonymous codon substitutions [45] can alter the rate at which functional nascent proteins are produced [6,7] and influence the steady-state concentration of protein in vivo, thereby affecting the phenotype of an organism. In this study, we have combined mathematical expressions for the rate of protein production [22] with an expression for the probability of domain-wise co-translational folding [23], which allows us to compute the rate of functional nascent-protein production. With this combined model we have examined how changing the underlying rates of translation can influence the rate at which functional protein is produced and its steady-state level in cells.

We find that, regardless of the complexity of the model, there exist scenarios in which increasing effective codon translation rates increases the rate of protein production but decreases the rate of functional protein production. We see this scenario arise when the ribosome density per transcript is either very high or very low. In the low-density regime, increasing the intrinsic elongation rate increases the speed with which ribosomes slide along a transcript but provides less time for individual nascent-protein domains to fold during their synthesis. This decreases the functional protein production rate. In the high-density regime, increasing the translation-termination rate increases the effective codon translation rates because it relieves ribosome traffic jams on the transcript, allowing individual ribosomes to move more quickly from the 5′ to the 3′ end of the coding sequence. This again has the effect of decreasing protein folding and decreasing the rate of functional nascent-protein production. Thus, in both of these scenarios it is the antagonism between increasing the

effective translation-elongation rates and decreasing co-translational protein folding that results in an increase in protein production rates but a decrease in functional protein production.

We demonstrated that when the post-translational processes of protein folding, aggregation and degradation are included in our model, the non-monotonic behavior in the rate of functional nascent-protein production can lead to non-monotonic changes in the steady-state concentration of functional protein. That is, increasing protein production rates can decrease how much functional protein is present in a cell. The reason is that increasing protein production rates tends to create more unfolded nascent proteins, which are more likely to aggregate or be degraded.

The scenarios we have identified can provide an explanation for a number of experimental observations. An in vitro study, for example, found that increasing the translation-elongation rate by making synonymous substitutions in the wild-type chloramphenicol acetyltransferase (CAT) gene increased the amount of protein produced by 16% but decreased the relative specific activity of the protein by 20% [43]. Our results suggest this is due to less co-translational folding occurring. Similar observations have been made for other proteins as well [9,12,44]. To be sure, the observation that increasing codon translation rates can either increase or decrease the steady-state level of functional proteins has been reported in the literature [4,5,17]. It is less well understood, however, why, when codon translation rates are increased, the level of functional protein increases for some proteins but decreases for others. Our model suggests this could arise in two different ways: if translation of the former transcript occurs in the L.D. regime while translation of the latter transcript occurs in either the M.C. or the H.D. regime; or, if both transcripts are in the L.D.
regime, but the former has an elongation rate below the optimal value while the latter has an elongation rate above the optimal value (Figs. 4 and 6).

Our models have shown that ribosome traffic can significantly alter the co-translational folding of proteins, and predicted that novel scenarios can arise. For example, the synthesis time of proteins increased by two orders of magnitude due to steric interactions between ribosomes in the H.D. regime (Fig. S3(B)). This provided more time for the protein domains to fold co-translationally, and thereby significantly altered  and . Moreover, the model indicates that it is possible to decrease the rate of production of functional protein by increasing the termination rate in the H.D. regime, even though the overall rate of protein synthesis increases. While the translation-termination rate is often assumed not to be rate limiting in vivo, the aforementioned scenario can arise if a slowly translated codon is located near the 3′ end of the coding sequence of the transcript. Increasing the translation rate of this slow codon would decrease the total translation time (due to reduced ribosome traffic jams on the transcript) and significantly impact the production of functional protein and its steady-state level.

Changes in the rate at which functional protein is produced and in its steady-state level could have global consequences for cells. For example, our model predicts that the fraction of functional tyrosine-tRNA ligase, the protein product of gene SYY_ECOLI that recharges tRNA with tyrosine [46], can decrease if the translation-elongation or -termination rate is increased. Less functional

ligase could slow down tRNA recharging, and thereby have global effects on translational control of gene expression.

Our model can be used to design mRNA sequences that maximize the production of functional proteins. Algorithms exist to design mRNA sequences that modulate individual translation rates separately [4,47,48], but no design algorithm exists, to our knowledge, to simultaneously optimize the translation-initiation, -elongation and -termination rates to maximize the production of functional nascent protein. The derivatives (Eqs. S4-S14) of Eq. 2 allow us to do this. For example, for given values of two of these rates, the value of the third that maximizes the rate of functional nascent-protein production can be solved for numerically using derivative S10. Moreover, unlike many conventional codon optimization methods [4,5], our chemical-kinetic model does not rely on empirical measurements of codon optimality, but instead accounts for the underlying microscopic rates of domain-wise protein folding. Thus, our model lays the groundwork for new bioengineering design capabilities, especially as the ability to predict translation rates from mRNA sequences becomes more accurate.

Quantitative predictions made by our model are subject to limitations that arise from inaccuracies in our estimated rates of domain folding and translation. For example, the codon translation rates we have used assume that tRNA concentration is the major determinant of codon translation rates [26]. However, other molecular factors, such as mRNA structure [45], proline residues [46,47] and positively charged amino acids [48–50], are known to modulate codon translation rates. The model also ignores the effect of chaperones that bind either co-translationally or post-translationally [38,51], which can affect the folding and aggregation rates of proteins and thus the steady-state level of functional protein as well.
In addition, changing initiation and elongation rates independently of each other is a practical challenge for designing mRNA sequences, since the initiation rate of a transcript also depends on the codon composition of the first few N-terminal codons, which affects the average gene translation rates [52]. Moreover, limited cellular resources [53,54], such as a finite number of charged tRNA molecules and ribosomes, can put additional restrictions on the possible values of the translation rates, especially in the case of heterologous gene expression, where overexpression of a transcript is commonly employed. Therefore, varying codon translation rates in a heterologously expressed gene could have global consequences for these rates that are not accounted for in our model.

In summary, we have found that increasing the rate of protein production can be detrimental to producing functional protein. This arises because increasing elongation and termination rates increases protein production rates but can provide less time for nascent protein domains to fold correctly during translation, thereby decreasing the rate of production and the steady-state levels of soluble, functional protein. Our model allows for the simultaneous optimization of translation-initiation, -elongation and -termination rates to maximize the rate of functional protein production, and therefore has the potential to be developed into a new approach for the rational design of mRNA transcripts. More broadly, these results support the emerging paradigm that translation kinetics can play a critical role in determining the behavior of proteins in cells [17].

Supporting Material

Supplementary Results, two tables and seven figures.

Author Contributions

A.K.S. and E.P.O. designed research, analyzed data and wrote the manuscript. A.K.S. performed the simulations.

Acknowledgements

This work was supported by an HFSP research grant RGP0038/201 and an NSF CAREER grant MCB-1553291.

References

(1) Marshall, R. A.; Aitken, C. E.; Dorywalska, M.; Puglisi, J. D. Translation at the Single-Molecule Level. Annu. Rev. Biochem. 2008, 77, 177–203.

(2) Chowdhury, D. Stochastic Mechano-Chemical Kinetics of Molecular Motors: A Multidisciplinary Enterprise from a Physicist’s Perspective. Phys. Rep. 2013, 529, 1–197.

(3) Sauna, Z. E.; Kimchi-Sarfaty, C. Understanding the Contribution of Synonymous Mutations to Human Disease. Nat. Rev. Genet. 2011, 12, 683–691.

(4) Angov, E. Codon Usage: Nature’s Roadmap to Expression and Folding of Proteins. Biotechnol. J. 2011, 6, 650–659.

(5) Gustafsson, C.; Govindarajan, S.; Minshull, J. Codon Bias and Heterologous Protein Expression. Trends Biotechnol. 2004, 22, 346–353.

(6) Plotkin, J. B.; Kudla, G. Synonymous but Not the Same: The Causes and Consequences of Codon Bias. Nat. Rev. Genet. 2011, 12, 32–42.

(7) Quax, T. E. F.; Claassens, N. J.; Söll, D.; van der Oost, J. Codon Bias as a Means to Fine-Tune Gene Expression. Mol. Cell 2015, 59, 149–161.

(8) Spencer, P. S.; Siller, E.; Anderson, J. F.; Barral, J. M. Silent Substitutions Predictably Alter Translation Elongation Rates and Protein Folding Efficiencies. J. Mol. Biol. 2012, 422, 328–335.

(9) Zhou, M.; Guo, J.; Cha, J.; Chae, M.; Chen, S.; Barral, J. M.; Sachs, M. S.; Liu, Y. Non-Optimal Codon Usage Affects Expression, Structure and Function of Clock Protein FRQ. Nature 2013, 495, 111–115.

(10) Sander, I. M.; Chaney, J. L.; Clark, P. L. Expanding Anfinsen’s Principle: Contributions of Synonymous Codon Selection to Rational Protein Design. J. Am. Chem. Soc. 2014, 136, 858–861.

(11) Siller, E.; DeZwaan, D. C.; Anderson, J. F.; Freeman, B. C.; Barral, J. M. Slowing Bacterial Translation Speed Enhances Eukaryotic Protein Folding Efficiency. J. Mol. Biol. 2010, 396, 1310–1318.

(12) Yu, C. H.; Dang, Y.; Zhou, Z.; Wu, C.; Zhao, F.; Sachs, M. S.; Liu, Y. Codon Usage Influences the Local Rate of Translation Elongation to Regulate Co-Translational Protein Folding. Mol. Cell 2015, 59, 744–754.

(13) Hess, A. K.; Saffert, P.; Liebeton, K.; Ignatova, Z. Optimization of Translation Profiles Enhances Protein Expression and Solubility. PLoS One 2015, 10, e0127039.

(14) Zhou, Z.; Schnake, P.; Xiao, L.; Lal, A. A. Enhanced Expression of a Recombinant Malaria Candidate Vaccine in Escherichia Coli by Codon Optimization. Protein Expr. Purif. 2004, 34, 87–94.

(15) Nissley, D. A.; Sharma, A. K.; Ahmed, N.; Friedrich, U.; Kramer, G.; Bukau, B.; O’Brien, E. P. Accurate Prediction of Cellular Co-Translational Folding Indicates Proteins Can Switch from Post- to Co-Translational Folding. Nat. Commun. 2015, 7, 10341.

(16) Ciryam, P.; Morimoto, R. I.; Vendruscolo, M.; Dobson, C. M.; O’Brien, E. P. In Vivo Translation Rates Can Substantially Delay the Cotranslational Folding of the Escherichia Coli Cytosolic Proteome. Proc. Natl. Acad. Sci. U. S. A. 2013, 110, E132–140.

(17) Nissley, D. A.; O’Brien, E. P. Timing Is Everything: Unifying Codon Translation Rates and Nascent Proteome Behavior. J. Am. Chem. Soc. 2014, 136, 17892–17898.

(18) Espah Borujeni, A.; Salis, H. M. Translation Initiation Is Controlled by RNA Folding Kinetics via a Ribosome Drafting Mechanism. J. Am. Chem. Soc. 2016, 138, 7016–7023.

(19) Kolomeisky, A. B.; Schütz, G. M.; Kolomeisky, E. B.; Straley, J. P. Phase Diagram of One-Dimensional Driven Lattice Gases with Open Boundaries. J. Phys. A: Math. Gen. 1999, 31, 6911–6919.

(20) Dong, J.; Klumpp, S.; Zia, R. K. P. Entrainment and Unit Velocity: Surprises in an Accelerated Exclusion Process. Phys. Rev. Lett. 2012, 109, 130602.

(21) Poker, G.; Margaliot, M.; Tuller, T. Sensitivity of mRNA Translation. Sci. Rep. 2015, 5, 12795.

(22) Shaw, L. B.; Zia, R. K. P.; Lee, K. H. Totally Asymmetric Exclusion Process with Extended Objects: A Model for Protein Synthesis. 2003, 138, 02190.

(23) Sharma, A. K.; Bukau, B.; O’Brien, E. P. Codon Positions That Strongly Influence Cotranslational Folding Are far from Equilibrium: A Framework for Controlling Nascent-Protein Folding. J. Am. Chem. Soc. 2016, 138, 1180–1195.

(24) Sharma, A. K.; Chowdhury, D. Distribution of Dwell Times of a Ribosome: Effects of Infidelity, Kinetic Proofreading and Ribosome Crowding. Phys. Biol. 2011, 8, 26005.

(25) Basu, A.; Chowdhury, D. Traffic of Interacting Ribosomes: Effects of Single-Machine Mechanochemistry. Phys. Rev. E 2007, 75, 021902.

(26) Fluitt, A.; Pienaar, E.; Viljoen, H. Ribosome Kinetics and Aa-tRNA Competition Determine Rate and Fidelity of Peptide Synthesis. Comput. Biol. Chem. 2007, 31, 335–346.

(27) Pai, A.; You, L. Optimal Tuning of Bacterial Sensing Potential. Mol. Syst. Biol. 2009, 5, 286.

(28) Ciandrini, L.; Stansfield, I.; Romano, M. C. Ribosome Traffic on mRNAs Maps to Gene Ontology: Genome-Wide Quantification of Translation Initiation Rates and Polysome Size Regulation. PLoS Comput. Biol. 2013, 9, e1002866.

(29) De Sancho, D.; Muñoz, V. Integrated Prediction of Protein Folding and Unfolding Rates from Only Size and Structural Class. Phys. Chem. Chem. Phys. 2011, 13, 17030.

(30) O’Brien, E. P.; Christodoulou, J.; Vendruscolo, M.; Dobson, C. M. New Scenarios of Protein Folding Can Occur on the Ribosome. J. Am. Chem. Soc. 2011, 133, 513–526.

(31) Ingolia, N. T.; Ghaemmaghami, S.; Newman, J. R. S.; Weissman, J. S. Genome-Wide Analysis in Vivo of Translation with Nucleotide Resolution Using Ribosome Profiling. Science 2009, 324, 218–223.

(32) Zia, R. K. P.; Dong, J. J.; Schmittmann, B. Modeling Translation in Protein Synthesis with TASEP: A Tutorial and Recent Developments. J. Stat. Phys. 2011, 144, 405–428.

(33) Shah, P.; Ding, Y.; Niemczyk, M.; Kudla, G.; Plotkin, J. B. Rate-Limiting Steps in Yeast Protein Translation. Cell 2013, 153, 1589–1601.

(34) Prabakaran, S.; Lippens, G.; Steen, H.; Gunawardena, J. Post-Translational Modification: Nature’s Escape from Genetic Imprisonment and the Basis for Dynamic Information Encoding. Wiley Interdiscip. Rev. Syst. Biol. Med. 2012, 4, 565–583.

(35) Volkmer, B.; Heinemann, M. Condition-Dependent Cell Volume and Concentration of Escherichia Coli to Facilitate Data Conversion for Systems Biology Modeling. PLoS One 2011, 6, e23126.

(36) Chen, H.; Shiroguchi, K.; Ge, H.; Xie, X. S. Genome-Wide Study of mRNA Degradation and Transcript Elongation in Escherichia Coli. Mol. Syst. Biol. 2015, 11, 781.

(37) Belle, A.; Tanay, A.; Bitincka, L.; Shamir, R.; O’Shea, E. K. Quantification of Protein Half-Lives in the Budding Yeast Proteome. Proc. Natl. Acad. Sci. U. S. A. 2006, 103, 13004–13009.

(38) Powers, E. T.; Powers, D. L.; Gierasch, L. M. FoldEco: A Model for Proteostasis in E. Coli. Cell Rep. 2012, 1, 265–276.

(39) Sabi, R.; Tuller, T. A Comparative Genomics Study on the Effect of Individual Amino Acids on Ribosome Stalling. BMC Genomics 2015, 16 Suppl 1, S5.

(40) Charneski, C.; Hurst, L. Positively Charged Residues Are the Primary Determinants of Ribosomal Velocity. PLoS Biol. 2013, 11, e1001508.

(41) Lu, J.; Deutsch, C. Electrostatics in the Ribosomal Tunnel Modulate Chain Elongation Rates. J. Mol. Biol. 2008, 384, 73–86.

(42) Qu, X.; Wen, J.-D.; Lancaster, L.; Noller, H. F.; Bustamante, C.; Tinoco, I. The Ribosome Uses Two Active Mechanisms to Unwind Messenger RNA during Translation. Nature 2011, 475, 118–121.

(43) Komar, A. A.; Lesnik, T.; Reiss, C. Synonymous Codon Substitution Affects Ribosome Traffic and Protein Folding during in Vitro Translation. FEBS Lett. 1999, 462, 387–391.

(44) Hu, S.; Wang, M.; Cai, G.; He, M. Genetic Code-Guided Protein Synthesis and Folding in Escherichia Coli. J. Biol. Chem. 2013, 288, 30855–30861.

(45) Gorgoni, B.; Ciandrini, L.; Mcfarland, M. R.; Romano, M. C.; Stansfield, I. Identification of the mRNA Targets of tRNA-Specific Regulation Using Genome-Wide Simulation of Translation. Nucleic Acids Res. 2016, 44, 9231–9244.

(46) Kobayashi, T.; Takimura, T.; Sekine, R.; Vincent, K.; Kamata, K.; Sakamoto, K.; Nishimura, S.; Yokoyama, S. Structural Snapshots of the KMSKS Loop Rearrangement for Amino Acid Activation by Bacterial Tyrosyl-tRNA Synthetase. J. Mol. Biol. 2005, 346, 105–117.

(47) Salis, H. M.; Mirsky, E. A.; Voigt, C. A. Automated Design of Synthetic Ribosome Binding Sites to Control Protein Expression. Nat. Biotechnol. 2009, 27, 946–950.

(48) Poker, G.; Zarai, Y.; Margaliot, M.; Tuller, T. Maximizing Protein Translation Rate in the Nonhomogeneous Ribosome Flow Model: A Convex Optimization Approach. 2014, 11, 20140713.

(49) Doerfel, L. K.; Wohlgemuth, I.; Kothe, C.; Peske, F.; Urlaub, H.; Rodnina, M. V. EF-P Is Essential for Rapid Synthesis of Proteins Containing Consecutive Proline Residues. Science 2013, 339, 85–88.

(50) Pavlov, M. Y.; Watts, R. E.; Tan, Z.; Cornish, V. W.; Ehrenberg, M.; Forster, A. C. Slow Peptide Bond Formation by Proline and Other N-Alkylamino Acids in Translation. Proc. Natl. Acad. Sci. U. S. A. 2009, 106, 50–54.

(51) Hoffmann, A.; Bukau, B.; Kramer, G. Structure and Function of the Molecular Chaperone Trigger Factor. Biochim. Biophys. Acta 2010, 1803, 650–661.

(52) Tuller, T.; Zur, H. Multiple Roles of the Coding Sequence 5’ End in Gene Expression Regulation. Nucleic Acids Res. 2014, 43, 13–28.

(53) Zur, H.; Tuller, T. Predictive Biophysical Modeling and Understanding of the Dynamics of mRNA Translation and Its Evolution. Nucleic Acids Res. 2016, 44, 9031–9049.

(54) Raveh, A.; Margaliot, M.; Sontag, E. D.; Tuller, T. A Model for Competition for Ribosomes in the Cell. J. R. Soc. Interface 2015, 13, 20151062.

Figure 1: Illustration of the co-translational and post-translational protein folding reaction scheme. A ribosome initiates translation with rate $\alpha$ when there is no other ribosome at the first six codon positions. Translation of a codon position occurs with a position-specific rate, provided there is no ribosome five codon positions downstream of the A site. Termination and release of the fully synthesized protein occur with rate $\beta$. A fully synthesized domain outside the exit tunnel can fold and unfold co-translationally with codon-position-dependent rates $k_{UF}$ and $k_{FU}$, respectively. Protein molecules synthesized from a number of mRNA sequences are released into the cytosol in the unfolded and folded states at separate rates. A protein molecule released from the ribosome can be folded and functional (F), unfolded (U), aggregated (A) or degraded (D). Note that state D is a sink, and hence proteins entering this state are eliminated from the total amount of protein in the cell.

Figure 2: The translation regimes within the homogeneous ℓ-TASEP model as a function of the translation-initiation rate $\alpha$ and the termination rate $\beta$, and an illustration of the two length parameters shown in (B) and (C). (A) Low-density, high-density and maximal-current phases are indicated, as well as the dividing lines between regimes. Mathematical expressions for the rate of protein production per transcript and for the ribosome density on the transcript are reported for each regime. (B) Depiction of the primary structure of a two-domain protein, marking the position of the most C-terminal residue of the domain of interest, in this case the N-terminal domain. (C) Depiction of the number of amino acid residues that can fit inside the ribosome exit tunnel.
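
The regime boundaries and per-regime expressions referred to in this caption are not reproduced in the text. As a hedged illustration, the sketch below uses the standard mean-field results for a homogeneous exclusion process with extended particles, on the assumption that these match the figure. The function name, the argument `epsilon` (a uniform elongation rate) and the default footprint `ell=10` codons are illustrative choices, not values taken from this section.

```python
import math


def ell_tasep_regime(alpha, beta, epsilon, ell=10):
    """Return (phase, current, density) for a homogeneous TASEP whose
    particles (ribosomes) each cover `ell` lattice sites (codons).

    Uses standard mean-field expressions for extended particles; `current`
    is in proteins per second per transcript and `density` in ribosomes
    per codon.
    """
    a, b = alpha / epsilon, beta / epsilon    # rates in units of the hopping rate
    critical = 1.0 / (1.0 + math.sqrt(ell))   # boundary of the maximal-current phase
    if a < critical and a < b:
        phase = "LD"                          # initiation-limited
        density = a / (1.0 + (ell - 1) * a)
        current = a * (1.0 - a) / (1.0 + (ell - 1) * a)
    elif b < critical and b <= a:
        phase = "HD"                          # termination-limited (a == b is the coexistence line)
        density = (1.0 - b) / ell
        current = b * (1.0 - b) / (1.0 + (ell - 1) * b)
    else:
        phase = "MC"                          # elongation-limited
        density = critical / math.sqrt(ell)
        current = critical ** 2
    return phase, epsilon * current, density


if __name__ == "__main__":
    print(ell_tasep_regime(0.083, 35.0, 10.0))  # initiation much slower than termination
    print(ell_tasep_regime(1.0, 0.05, 10.0))    # termination much slower than initiation
```

In every branch the returned current and density satisfy the extended-particle current-density relation $J/\epsilon = \rho(1 - \ell\rho)/[1 - (\ell - 1)\rho]$, which is a quick consistency check on the three sets of expressions.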

Figure 3: Folded structures of five E. coli proteins modeled in this study. The proteins are translation products of genes DHAS_ECOLI (PDB ID: 1T4D), E4PD_ECOLI (PDB ID: 2X5J), G3P1_ECOLI (PDB ID: 2VYN), SYY_ECOLI (PDB ID: 2YXN) and PGK_ECOLI (PDB ID: 1ZMR) and they are illustrated in (A), (B), (C), (D) and (E), respectively. The N-terminal domain of each protein is shown in red.

Figure 4: The behavior of the rate of functional nascent-protein production with respect to the translation-elongation rate in the L.D. phase for the five E. coli proteins. Increasing the translation-elongation rate causes non-monotonic behavior in the production of functional nascent protein (A), and non-monotonic behavior in the rate of functional nascent-protein production versus the rate of protein production (B). Note that in (B) only the translation-elongation rate is being altered. Black, green, blue, red and pink lines correspond to proteins produced from genes DHAS_ECOLI, SYY_ECOLI, G3P1_ECOLI, E4PD_ECOLI and PGK_ECOLI, respectively. $\alpha = 0.083$ s$^{-1}$ (Ref. 27) was used in these calculations.
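
The trade-off behind this figure can be illustrated with a toy calculation. It is not the authors' Eq. 2: it assumes a uniform elongation rate, the mean-field low-density current for ribosomes covering `ell` codons, and two-state folding of a domain that starts unfolded and has a time window equal to the time needed to translate `n_codons` further codons. All function and parameter names and all numerical values are hypothetical.

```python
import math


def functional_rate_ld(epsilon, alpha, k_fold, k_unfold, n_codons, ell=10):
    """Toy functional-production rate in a low-density regime (illustrative)."""
    # Mean-field L.D. current: proteins released per second per transcript.
    current = alpha * (epsilon - alpha) / (epsilon + (ell - 1) * alpha)
    # With little ribosome traffic each codon takes about 1/epsilon seconds.
    folding_time = n_codons / epsilon
    # Two-state kinetics starting from the unfolded state.
    k_total = k_fold + k_unfold
    p_folded = (k_fold / k_total) * (1.0 - math.exp(-k_total * folding_time))
    return current * p_folded


def best_elongation_rate(alpha, k_fold, k_unfold, n_codons, grid):
    """Grid search for the elongation rate that maximizes the toy rate."""
    return max(grid, key=lambda eps: functional_rate_ld(eps, alpha, k_fold, k_unfold, n_codons))


if __name__ == "__main__":
    grid = [0.27 + i * (30.0 - 0.27) / 299 for i in range(300)]
    eps_best = best_elongation_rate(0.083, 0.5, 0.01, 100, grid)
    print("elongation rate maximizing the toy functional rate:", round(eps_best, 2))
```

Faster elongation raises the current but shortens the folding window, so the product passes through a maximum at an intermediate elongation rate. A grid search is used here only for transparency; a root finder applied to the derivative would serve the same purpose.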

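Figure 5: The behavior of the rate of functional nascent-protein production with respect to the termination rate $\beta$ in the H.D. phase for the E. coli proteins. Increasing the termination rate, $\beta$, causes non-monotonic behavior in the rate of production of functional nascent protein (A), and non-monotonic behavior in the rate of functional nascent-protein production versus the rate of protein production (B). Note that only $\beta$ is altered in (B). Protein color coding is the same as in Fig. 4. The fixed rate parameter was set to 10.0 s$^{-1}$ in these calculations.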
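
A matching toy calculation for the high-density case is sketched below, again under assumptions that are not the authors' Eq. 2: a uniform elongation rate `epsilon`, mean-field high-density expressions for ribosomes covering `ell` codons, an average ribosome speed equal to current divided by density, and two-state folding from the unfolded state during the time needed to translate `n_codons` codons. Names and numbers are hypothetical.

```python
import math


def functional_rate_hd(beta, epsilon, k_fold, k_unfold, n_codons, ell=10):
    """Return (current, functional_rate) for a toy high-density regime."""
    # Mean-field H.D. current (proteins per second per transcript).
    current = beta * (epsilon - beta) / (epsilon + (ell - 1) * beta)
    # Mean-field H.D. ribosome density (ribosomes per codon).
    density = (1.0 - beta / epsilon) / ell
    # Average speed of a ribosome in traffic (codons per second).
    speed = current / density
    folding_time = n_codons / speed
    # Two-state kinetics starting from the unfolded state.
    k_total = k_fold + k_unfold
    p_folded = (k_fold / k_total) * (1.0 - math.exp(-k_total * folding_time))
    return current, current * p_folded


if __name__ == "__main__":
    for beta in (0.05, 0.5, 1.0, 2.0):
        j, j_func = functional_rate_hd(beta, 10.0, 0.02, 0.0002, 100)
        print(f"beta={beta}: current={j:.3f}, functional={j_func:.3f}")
```

Raising the termination rate relieves the traffic jam, so the current rises monotonically while the folding window shrinks; with these illustrative numbers the functional rate is larger at an intermediate termination rate than at either end of the scanned range.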

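Figure 6: The behavior of the rate of functional nascent-protein production for the E. coli proteins with respect to the average translation-elongation rate when the first three assumptions used in the derivation of Eq. 2 are violated. Increasing the average translation-elongation rate results in non-monotonic behavior in the rate of functional nascent-protein production (A), and non-monotonic behavior in this rate versus the rate of protein production (B), in this L.D.-like regime. Note that only the average translation-elongation rate is being varied in (B). Protein color coding is the same as in Fig. 4. $\alpha = 0.083$ s$^{-1}$ (Ref. 27) was used in these numerical calculations.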

Figure 7: The behavior of the rate of functional nascent-protein production with respect to the termination rate $\beta$ in the H.D.-like phase for the E. coli proteins when the first three assumptions used in the derivation of Eq. 2 are violated. Increasing the termination rate results in non-monotonic behavior in the rate of functional nascent-protein production (A), and non-monotonic behavior in this rate versus the rate of protein production (B). Note that only $\beta$ is being varied in (B). Protein color coding is the same as in Fig. 4. $\alpha = 1.00$ s$^{-1}$ was used in these numerical calculations.

Figure 8: The steady-state levels of the E. coli proteins exhibit non-monotonic behavior when the translation-elongation rate or the termination rate $\beta$ is increased. The steady-state levels of functional protein, $[F]_{SS}$, are shown as a function of the translation-elongation rate in the L.D.-like regime (panels A, C and E) and as a function of the termination rate in the H.D.-like regime (panels B, D and F). Protein color coding is the same as in Fig. 4. (A) and (B) were calculated from Eqs. 3-5 by setting $k_{agg} = 0.5$ $\mu$M$^{-1}$ s$^{-1}$; (C) and (D) by setting $k_{agg} = 0.05$ $\mu$M$^{-1}$ s$^{-1}$; and (E) and (F) by setting $k_{agg} = 0.005$ $\mu$M$^{-1}$ s$^{-1}$. Note that while $[F]_{SS}$ exhibits non-monotonic behavior in these cases, the rate of protein production increases monotonically under these conditions.

TOC Graphic
