Biochemistry 1988, 27, 2094-2102
Chemical Kinetics of Induced Gene Expression: Activation of Transcription by Noncooperative Binding of Multiple Regulatory Molecules†
H. Almagor*,‡ and K. Paigen
Department of Genetics, University of California, Berkeley, California 94720
Received June 3, 1987; Revised Manuscript Received September 15, 1987
ABSTRACT: A chemical kinetics model is described for the regulation of gene expression by the progressive binding of regulatory molecules to specific binding sites on DNA. Chemical rate equations are formulated and solved for the accumulation of regulatory molecules on DNA, the change in the level of induced mRNA, and the change in the level of the encoded protein in the activated tissue. Some special cases are examined, including that of an activation threshold created by a requirement for the binding of a minimum number of regulatory molecules prior to gene activation. Experimental data for several hormone-activated genetic systems are analyzed in the frame of the proposed model, and kinetic parameters are predicted. The model accounts for a number of experimental characteristics of hormone-inducible genetic systems, including the existence of a lag in the time course of mRNA accumulation, the sigmoidal curve of induced mRNA kinetics, the effect of hormone on mRNA stabilization, and the induction parameters observed when hormone analogues are used. The model also provides an explanation for the phenotypes of genetic variants with altered inducibility as changes in the molecular kinetic parameters of gene activity.
†This work was supported by NIH Grant GM31656.
‡Present address: Physical Chemistry Department, Fritz Haber Research Center, The Hebrew University, Jerusalem 91904, Israel.
© 1988 American Chemical Society
In several disparate biological control systems that involve the binding of regulatory proteins to DNA, the outcome is modulated by the binding of more than one molecule of regulatory protein. Systems as distinct as the initiation of yeast genome replication at ARS sequences (Jaswinski, 1983), integration and excision of prophages (Nash, 1981), and gene activation by steroid-receptor complexes (Payvar et al., 1982; Ringold, 1983; Yamamoto, 1985a) all appear to involve the binding of protein molecules at multiple DNA sequences. Control systems that operate by multiple binding sites, rather than a single site, offer the possibility of graduated regulation. This property may be especially pertinent in genetic regulatory systems, which have the peculiar characteristic that the properties of the entire cell are decided by the outcome at the one or two gene copies present. This is in contrast to metabolic regulatory systems, where the properties of a single cell derive from the average behavior of many molecules. The potential all-or-none character of the response of a gene will be largely mitigated if levels of gene expression, for example, the rate of transcription, are quantitatively dependent on the number of regulatory molecules bound to the gene. In this case the outcome can be modulated quantitatively rather than by an all-or-none response. Recent direct evidence that quantitative modulation can occur in this way is provided by experiments which vary the number of regulatory elements in synthetic DNA constructions. Searle et al. (1985) have inserted variable numbers of the upstream regulatory element required for metal activation of the mouse metallothionein I gene into the nonresponsive promoter of the herpes simplex virus thymidine kinase gene. Little or no induction by zinc was observed with single insertions of the regulatory element, whereas many constructions with two copies of the regulatory element were inducible, and constructions with additional copies were even more inducible. Similar results were obtained by Toohey et al. (1986), who reported that levels of glucocorticoid activation of the mouse mammary tumor virus promoter were directly proportional to the number of glucocorticoid response elements upstream of the promoter sequence.
To explore the characteristics of multiple-binding systems, we have considered the activation of gene transcription by steroid hormone receptor complexes, a subject that has been much studied experimentally and which has generated a considerable body of quantitative data (Ringold, 1983; Lanne et al., 1976; Janne & Bardin, 1984; Swaneck et al., 1979; McKnight & Palmiter, 1979; Watson et al., 1981, 1985; Perry et al., 1984; Shapiro & Brock, 1985; Karlsen et al., 1986). Using a chemical kinetics approach to analyzing the relationship between regulatory protein binding and gene expression, we have formulated the rate equations and their solutions for the major steps of the gene expression pathway. The equations developed provide a good description of several experimental systems and suggest the physical basis for some
VOL. 27, NO. 6, 1988
KINETICS OF INDUCED GENE EXPRESSION
of the characteristic features of gene activation by hormones.
MATERIALS AND METHODS
The experimental materials and methods used to gather the data analyzed are described in the cited references. Computer analysis was carried out on a VAX 8600 (UNIX operating system) of the University of California, Berkeley, computer center. For the nonlinear optimization of kinetic parameters, we used the finite-difference-algorithm procedure ZXSSQ from the IMSL FORTRAN library. The program, FIT1, used for the curve fitting, is written in FORTRAN-77 and is available upon request.
BASIC MODEL
Consider the simplest case of a homogeneous population of cells (a tissue) and a gene, G, which is transcribed in each cell to give a single mature mRNA, R, which is translated to give a single species of protein, P. Gene G is induced to higher rates of transcription by the attachment of receptor-hormone complex molecules to multiple specific binding sites on DNA. The receptor-free gene has a basal rate of transcription, and the transcription rate increases as additional receptor molecules bind to specific sites (Figure 1).
Initially, we assume that receptor complexes are not cooperative in either their binding to DNA or their effect on transcription. As a result, the transcription rate increases in linear proportion to the number of receptor molecules bound. This is the simplest assumption that is in accord with the analyzed experimental data (see below), and it is also supported by recent results (Toohey et al., 1986) indicating that differential levels of hormone-inducible gene expression can be modulated in an additive way by the number of glucocorticoid-responsive enhancers associated with the mouse mammary tumor virus promoter. In the case considered here, the receptor-hormone complex affects the expression of gene G only by changing the transcription rate (i.e., the rate of production of new mRNA molecules). Cases where receptor-hormone complexes affect other steps in the transcription-translation pathway (e.g., mRNA stability or translation rate) can be considered by extending the basic rate equations derived here.
[Figure 1: schematic diagram. Labels recoverable from the extraction: receptor on/off binding to gene G, translation, amino acids, protein, secretion.]
FIGURE 1: Schematic description of the three sequential stages in the expression of the gene G. (1) Activation of the gene by the progressive accumulation of regulatory molecules at specific sites on DNA. Accumulation results from a chain of reversible reactions leading to an equilibrium (eq 11). (2) Synthesis of mRNA R catalyzed by the activated gene and degradation of R. (3) Synthesis of protein P catalyzed by the mRNA R and disappearance of P. The mRNA and protein kinetics are described by two synthesis/loss cycles that lead to the steady-state solutions given by eq 16 and 19a, respectively.
(1) Gene Activation by Receptor Binding. The reaction between a receptor-hormone complex (r) and a single binding site (s) is described by
$$ r + s \rightleftharpoons s^* \qquad (1) $$
We assume that, under the conditions of hormone excess, reaction with the binding sites of gene G does not significantly alter the concentration of receptor-hormone complex. Because this concentration is constant, we can write
$$ s \underset{k^-}{\overset{k^+}{\rightleftharpoons}} s^* \qquad (2) $$
where $k^+$ and $k^-$ are the pseudo-first-order rate constant for binding and the first-order rate constant for detachment, respectively. (Both rate constants have the units of time$^{-1}$.) Let N denote the number of specific receptor binding sites relevant for induction of gene G and let $G_n$ denote a gene with n (out of N) sites occupied by receptor molecules, assuming one receptor binds to one site. Since all sites are equivalent, there are N + 1 possible states of the gene (n = 0, 1, 2, ..., N). This generates the system of equations
$$ G_0 \underset{k_1^-}{\overset{k_1^+}{\rightleftharpoons}} G_1 \underset{k_2^-}{\overset{k_2^+}{\rightleftharpoons}} \cdots \underset{k_N^-}{\overset{k_N^+}{\rightleftharpoons}} G_N \qquad (3) $$
Note that the rate constants $k_n^+$ and $k_n^-$ in eq 3 depend on n (since $G_n$ includes all combinations of n bound sites out of total N), and they will be further affected if there is any cooperativity (either negative or positive) between the sites. To calculate $\bar{n}(t)$, the mean number of occupied sites per gene at time t, it is necessary to know $X_n(t)$, the fraction of genes having n bound sites as a function of time (i.e., $X_n$ is the fraction of genes G which are in the state $G_n$). The appropriate system of rate equations follows from eq 3:
$$ \begin{aligned} dX_0/dt &= -k_1^+ X_0 + k_1^- X_1 \\ dX_n/dt &= k_n^+ X_{n-1} - (k_n^- + k_{n+1}^+) X_n + k_{n+1}^- X_{n+1} \quad \text{for } 0 < n < N \\ dX_N/dt &= k_N^+ X_{N-1} - k_N^- X_N \end{aligned} \qquad (4) $$
A knowledge of all the rate constants in this system of equations is required to obtain a general solution (Hammes & Schimmel, 1966, 1970; Cantor & Schimmel, 1980). However, the solution for the case of N equivalent and independent binding sites can be deduced from the solution for a one-site system (cf. eq 1), for which we have
$$ \begin{aligned} dX_1/dt &= k^+ X_0 - k^- X_1 \\ dX_0/dt &= -k^+ X_0 + k^- X_1 \end{aligned} \qquad (5) $$
In the special case of N = 1 the mean number of bound sites, $\bar{n}_1(t)$, is equal to the fraction of genes with a bound receptor, $X_1(t)$. The general solution for $X_1(t)$ in eq 5 is
$$ X_1(t) = X_1(\infty) + [X_1(0) - X_1(\infty)]\,e^{-(k^+ + k^-)t} \qquad (6) $$
where $X_1(\infty) = k^+/(k^+ + k^-)$ is the steady-state fraction of bound sites. Then, starting from a receptor-free gene, where $X_1(0) = 0$, we have
$$ X_1(t) = (k^+/k^0)[1 - e^{-k^0 t}] \qquad (7a) $$
where we have defined
$$ k^0 \equiv k^+ + k^- \qquad (7b) $$
The binding relaxation constant, $k^0$, describes the rate at which the binding approaches equilibrium; it is the first-order rate constant with which the quantity $[\bar{n}(\infty) - \bar{n}(t)]$ decays. Equation 7a describes the probability as a function of time for a given site to be bound to a receptor molecule. Now we shall generalize to the case in which each gene has not one but N equivalent and independent binding sites (all sites have the same binding rate constant and the same detachment rate constant, and there is no correlation between the sites). In this case, the probability as a function of time for any particular site to be bound, $f(t)$, is still given by eq 7a since the sites are independent of each other. We thus have
$$ f(t) = (k^+/k^0)[1 - e^{-k^0 t}] \qquad (8) $$
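As a quick numerical check of eq 7a and 8, the following sketch integrates the one-site rate equation (eq 5) with a simple Euler scheme and compares the result with the closed form. The rate constants are arbitrary illustrative values, not fitted parameters from this work, and the function names are hypothetical.

```python
import math

def occupancy_closed_form(t, k_plus, k_minus):
    """Probability that a site is bound at time t (eq 7a/8), starting unbound."""
    k0 = k_plus + k_minus  # binding relaxation constant (eq 7b)
    return (k_plus / k0) * (1.0 - math.exp(-k0 * t))

def occupancy_euler(t_end, k_plus, k_minus, steps=100_000):
    """Euler integration of dX1/dt = k+ X0 - k- X1 (eq 5) with X0 = 1 - X1, X1(0) = 0."""
    dt = t_end / steps
    x1 = 0.0
    for _ in range(steps):
        x1 += dt * (k_plus * (1.0 - x1) - k_minus * x1)
    return x1

# Assumed illustrative values: k+ = 0.15 and k- = 0.05 per unit time.
K_PLUS, K_MINUS = 0.15, 0.05
for t in (1.0, 5.0, 20.0):
    print(t, occupancy_closed_form(t, K_PLUS, K_MINUS),
          occupancy_euler(t, K_PLUS, K_MINUS))
```

The two columns should agree closely at every time point, and both approach the saturation value $k^+/k^0$ at long times.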
Since the receptor complex molecules are randomly distributed among the binding sites, individual copies of gene G will vary in the number of occupied sites. The fraction of genes with exactly n bound sites can be calculated from the probability that a given site is bound, $f(t)$, and the probability that a given site is not bound, $1 - f(t)$. This fraction is
$$ X_n(t) = \binom{N}{n}[f(t)]^n[1 - f(t)]^{N-n} \qquad (9) $$
where
$$ \binom{N}{n} = \frac{N!}{n!\,(N-n)!} $$
is the combinatorial factor. Equation 9 (the binomial distribution function) describes the time-dependent distribution of the occupation number, n; this equation is given in terms of the kinetic parameters of the reactions between the receptor-hormone complex and the specific binding site (cf. eq 8). The average occupancy at time t, $\bar{n}(t)$, is related to $X_n(t)$, the fraction of gene G copies that have exactly n bound sites, by
$$ \bar{n}(t) = \sum_{n=0}^{N} n X_n(t) \qquad (10) $$
Substituting $X_n(t)$ from eq 9 into eq 10 and using standard algebra, we get
$$ \bar{n}(t) = N f(t) = N(k^+/k^0)[1 - e^{-k^0 t}] \qquad (11) $$
Equation 11, however, can be reached directly by noticing that since all sites are equivalent and independent, the mean number of bound sites per gene will be equal to the probability for a single site to be bound, $f(t)$, multiplied by the total number of sites per gene, N. These relationships show that the progressive binding of receptor molecules to N equivalent and independent sites is a first-order saturation process (eq 11) and that the saturation level of binding is proportional to the total number of binding sites, N, and to the binding fraction $k^+/k^0 = k^+/(k^+ + k^-)$. An example of such a binding curve is shown in Figure 2a.
[Figure 2: three panels (a-c) plotted against TIME (days). Panel annotations recoverable from the extraction: α = 1.0 molecule/site/day and rate-constant values of 0.2 day$^{-1}$.]
FIGURE 2: Three kinetic curves representing the outcomes of the three steps described in Figure 1. The kinetic parameters as well as the time units were chosen arbitrarily. (a) Progressive activation of gene G as represented by $\bar{n}(t)$, the mean number of bound regulatory molecules per gene at time t (eq 11). (b) Accumulation of mRNA R molecules in the cell (eq 16); there is a short characteristic lag. (c) Accumulation of protein P (eq 19a); there is a longer lag here than in the mRNA kinetics.
(2) Uninduced Transcription. In the uninduced state, gene G has no bound receptor-hormone complex molecules and a basal rate of transcription of $r_0$ [molecules/(cell·time)]. The mRNA produced undergoes turnover with a first-order degradation constant $k_r$ (in time$^{-1}$). Under these circumstances the rate equation for mRNA R is
$$ dR(t)/dt = r_0 - k_r R(t) \qquad (12) $$
Under steady-state conditions $dR(t)/dt = 0$ and
$$ R(\text{basal}) = r_0/k_r \qquad (13) $$
Equation 13 describes the steady-state mRNA level in an uninduced cell with a currently active gene.
(3) Induced Transcription. We now derive the rate equations for the simplest case of induced transcription where there are N equivalent and independent binding sites for receptor-hormone complex molecules and when the rate of transcription of any given copy of gene G is proportional to the number of receptor complex molecules bound to it. The induced rate of transcription of gene G for the entire tissue is then proportional to $\bar{n}(t)$, the mean number of occupied binding sites per gene at time t (induction starts at t = 0). Initially, let us consider the case that the hormone does not affect mRNA stability or later steps in the pathway of gene expression; the rate equation describing the concentration of mRNA R in the induced tissue is then
$$ dR(t)/dt = r_0 - k_r R(t) + \alpha\,\bar{n}(t) \qquad (14) $$
where the constant α [molecules/(cell·day)] is the induction efficiency as measured by the increase in the rate of transcription due to binding of one receptor molecule. By eq 11 we have
$$ dR(t)/dt = r_0 - k_r R(t) + \alpha N (k^+/k^0)[1 - e^{-k^0 t}] \qquad (15a) $$
At steady state (at time infinity) the exponential term in eq 15a vanishes and the concentration of mRNA R is constant and is given by
$$ R(\infty) = \frac{r_0}{k_r} + \frac{\alpha N k^+}{k_r k^0} \qquad (15b) $$
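The binomial averaging behind eq 9-11 and the steady-state level given just above can be illustrated with a short sketch. It builds the distribution $X_n(t)$ explicitly, sums it as in eq 10, and evaluates $R(\infty)$. All numerical values are assumed for illustration only, and the function names are hypothetical.

```python
import math

def site_occupancy(t, k_plus, k_minus):
    """f(t) of eq 8: probability that one site is bound at time t."""
    k0 = k_plus + k_minus
    return (k_plus / k0) * (1.0 - math.exp(-k0 * t))

def state_fractions(t, n_sites, k_plus, k_minus):
    """X_n(t) for n = 0..N from the binomial distribution (eq 9)."""
    f = site_occupancy(t, k_plus, k_minus)
    return [math.comb(n_sites, n) * f**n * (1.0 - f)**(n_sites - n)
            for n in range(n_sites + 1)]

def mean_occupancy(t, n_sites, k_plus, k_minus):
    """Mean number of bound sites per gene, summed directly as in eq 10."""
    fractions = state_fractions(t, n_sites, k_plus, k_minus)
    return sum(n * x_n for n, x_n in enumerate(fractions))

def steady_state_mrna(r0, k_r, alpha, n_sites, k_plus, k_minus):
    """Fully induced level R(inf) = r0/k_r + alpha*N*k+/(k_r*k0)."""
    k0 = k_plus + k_minus
    return r0 / k_r + alpha * n_sites * k_plus / (k_r * k0)

# Assumed illustrative parameters (not taken from the experimental data).
N, K_PLUS, K_MINUS = 4, 0.15, 0.05
t = 3.0
print(mean_occupancy(t, N, K_PLUS, K_MINUS))   # direct sum, eq 10
print(N * site_occupancy(t, K_PLUS, K_MINUS))  # shortcut, eq 11
print(steady_state_mrna(2.0, 0.2, 1.0, N, K_PLUS, K_MINUS))
```

The first two printed values should coincide, which is the content of eq 11: the explicit sum over gene states collapses to N times the single-site probability.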
Note that $R(\infty)$ is the sum of $R(0)$, the basal (uninduced) level of mRNA R, and $\alpha N k^+/(k_r k^0)$, the maximal increase in mRNA level in the fully induced tissue. The latter term is composed of several physically relevant parameters. The term $k^+/k^0$ describes the probability that a DNA binding site is occupied, and that term times N equals the number of occupied sites. The term α is the increase in transcription for each occupied site, so that $\alpha N k^+/k^0$ is the fully induced rate of mRNA synthesis. That term divided by the first-order rate constant for mRNA degradation sets the steady-state level of induced mRNA. Integration of eq 15a yields
$$ R(t) = R(\infty) - \frac{\alpha N k^+}{k^0 - k_r}\left[\frac{e^{-k_r t}}{k_r} - \frac{e^{-k^0 t}}{k^0}\right] \qquad (16) $$
Equation 16 describes the time course of accumulation of mRNA. One interesting feature of eq 16 is that it predicts a lag period in the appearance of mRNA. For small values of t (where "small" means $k_r t < 1$ and $k^0 t < 1$) one can show that R(t) is practically constant and
$$ R(\text{small } t) \approx R(0) = r_0/k_r $$
A lag period is therefore predicted for the initial part of the mRNA induction curve. As a first approximation, the duration of this lag period is inversely proportional to the larger of the two constants $k^0$ and $k_r$. As a result, if one of these constants is very large compared to the other, this lag may become too short to see experimentally. For larger values of t and when $k_r \neq k^0$, one of the two exponents in eq 16 approaches zero more rapidly, and eq 16 converges to a first-order saturation curve. The approach to steady state is dominated by the characteristic time constant of the slower cycle (Figure 1). When gene activation is rapid relative to mRNA turnover ($k^0 > k_r$), the approach to steady state is largely described by the turnover rate of mRNA R. When gene activation is slow relative to mRNA turnover ($k^0 < k_r$), the approach to steady state is largely described by the kinetics of gene activation. An example of a curve described by eq 16 is shown in Figure 2b (the basic model). It should be noted that any first-order activation process other than the accumulation of receptor molecules on binding sites (for example, synthesis of an intermediate protein) would yield the same formal solution described by eq 16. However, the progressive binding model also provides a simple way to introduce a threshold into the activation of gene transcription (see below, section 5.2).
(4) Protein Synthesis. Changes in the concentration of mRNA R result in changes in the level of protein P in the tissue. The rate equation for protein P is
$$ dP(t)/dt = k_s R(t) - k_p P(t) \qquad (17a) $$
where $k_s$ (in time$^{-1}$) is the first-order (in R) protein synthesis rate constant and $k_p$ (in time$^{-1}$) is the first-order (in P) protein loss rate constant ("loss" here includes all modes of protein disappearance, i.e., enzymatic degradation, secretion, etc.). In the uninduced system we have $R = r_0/k_r$, and eq 17a is explicitly
$$ dP(t)/dt = k_s\left[\frac{r_0}{k_r}\right] - k_p P(t) \qquad (17b) $$
and is solved by
$$ P(t) = P(\infty)[1 - e^{-k_p t}] \qquad (18a) $$
where
$$ P(\infty) = \frac{k_s r_0}{k_r k_p} \qquad (18b) $$
Equation 18b describes protein level at steady state ($dP(t)/dt = 0$) in an uninduced cell with a currently expressed gene. These equations are equivalent to the familiar protein turnover kinetics [i.e., eq 18a was previously described by several investigators (Price et al., 1962; Segal & Kim, 1963; Schimke et al., 1964)]. The rate equation for the induced level of protein P is generated by substituting eq 16 for R(t) in eq 17a. Integration then gives the level of protein P as a function of time
$$ P(t) = P(\infty) - k_s \alpha N k^+\left[A e^{-k_p t} + B e^{-k_r t} + C e^{-k^0 t}\right] \qquad (19a) $$
where the amplitudes are given by
$$ A = \frac{1}{k^0 k_r k_p} + \frac{1}{k^0 - k_r}\left[\frac{1}{k_r(k_r - k_p)} - \frac{1}{k^0(k^0 - k_p)}\right] \qquad (19b) $$
$$ B = \frac{1}{k_r(k_r - k^0)(k_r - k_p)} \qquad (19c) $$
$$ C = \frac{1}{k^0(k^0 - k_r)(k^0 - k_p)} \qquad (19d) $$
and $P(\infty)$ is
$$ P(\infty) = \frac{k_s r_0}{k_r k_p} + \frac{k_s \alpha N k^+}{k_p k_r k^0} \qquad (19e) $$
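The closed-form time courses for mRNA and protein can be cross-checked against a direct numerical integration of eq 14 and 17a. The sketch below uses expressions obtained by integrating eq 15a and 17a from the uninduced steady state; they should be read as this sketch's own working, and all parameter values and function names are assumptions chosen for illustration. The three rate constants must be distinct for the amplitudes to be defined.

```python
import math

def mrna_closed_form(t, r0, k_r, alpha, n_sites, k_plus, k0):
    """Induced mRNA level (form of eq 16), starting from the basal level r0/k_r."""
    r_inf = r0 / k_r + alpha * n_sites * k_plus / (k_r * k0)
    amp = alpha * n_sites * k_plus / (k0 - k_r)
    return r_inf - amp * (math.exp(-k_r * t) / k_r - math.exp(-k0 * t) / k0)

def protein_closed_form(t, r0, k_r, alpha, n_sites, k_plus, k0, k_s, k_p):
    """Induced protein level (form of eq 19a) with three exponential terms."""
    a = 1.0 / (k0 * k_r * k_p) + (1.0 / (k0 - k_r)) * (
        1.0 / (k_r * (k_r - k_p)) - 1.0 / (k0 * (k0 - k_p)))
    b = 1.0 / (k_r * (k_r - k0) * (k_r - k_p))
    c = 1.0 / (k0 * (k0 - k_r) * (k0 - k_p))
    p_inf = (k_s * r0 / (k_r * k_p)
             + k_s * alpha * n_sites * k_plus / (k_p * k_r * k0))
    decay = (a * math.exp(-k_p * t) + b * math.exp(-k_r * t)
             + c * math.exp(-k0 * t))
    return p_inf - k_s * alpha * n_sites * k_plus * decay

def integrate(t_end, r0, k_r, alpha, n_sites, k_plus, k0, k_s, k_p,
              steps=200_000):
    """Euler integration of eq 14 and 17a from the uninduced steady state."""
    dt = t_end / steps
    r = r0 / k_r                  # basal mRNA (eq 13)
    p = k_s * r0 / (k_r * k_p)    # basal protein (eq 18b)
    for i in range(steps):
        t = i * dt
        n_bar = n_sites * (k_plus / k0) * (1.0 - math.exp(-k0 * t))  # eq 11
        dr = r0 - k_r * r + alpha * n_bar
        dp = k_s * r - k_p * p
        r += dt * dr
        p += dt * dp
    return r, p

# Assumed illustrative parameters with k0, k_r, and k_p all different.
params = dict(r0=2.0, k_r=0.5, alpha=1.0, n_sites=4, k_plus=0.15, k0=0.2)
r_num, p_num = integrate(10.0, k_s=3.0, k_p=0.1, **params)
print(mrna_closed_form(10.0, **params), r_num)
print(protein_closed_form(10.0, k_s=3.0, k_p=0.1, **params), p_num)
```

Agreement between the analytic and numerical columns is a useful sanity check on the signs and denominators of the amplitudes, which are easy to get wrong by hand.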
The first term in eq 19e is the basal protein level, and the second term is the maximal increase in protein level following induction. An example of the kinetics of the induced protein accumulation is shown in Figure 2c. The inflection of this curve is more abrupt than that of the mRNA curve, and the lag period is longer. Equation 19a reflects the scheme of three chained catalytic cycles shown in Figure 1; each of the cycles (binding/detachment, transcription/disappearance, and translation/degradation) is represented by a distinct exponential term with the appropriate rate constant in the exponent. As in every chain reaction, the overall kinetics (eq 19a) will be dominated by the slowest intermediate step. If one of the three rate constants ($k^0$, $k_r$, or $k_p$) is much larger than the other two, then the exponential term with that rate constant will die out early and the overall kinetics will be dominated by the two slower cycles. For example, if $k^0$ is large, then binding to DNA is saturated very rapidly, and the kinetics of protein P accumulation is dominated by the turnover of mRNA ($k_r$) and the turnover of protein P ($k_p$). If, in addition, mRNA R undergoes rapid turnover ($k_r$ is much larger than $k_p$), then the steady-state level of mRNA R will also be reached early in the induction process and the kinetics of the accumulation of protein P will be dominated by the rate of protein turnover ($k_p$). In that case eq 19a will be reduced to eq 18a, with $P(\infty)$ given by eq 19e. On the other hand, if protein turnover is much faster than both receptor accumulation on DNA and mRNA turnover ($k_p$ is much larger than both $k^0$ and $k_r$), then eq 19a is reduced to a constant multiple of eq 16.
(5) Some Special Cases. (5.1) Multiple Cell Types. All organs contain multiple cell types, and it is often the case that only one cell type is actually hormone responsive. The population of unresponsive cells will have a constant mRNA level (as described by eq 13) and a constant protein level (as described by eq 18b). The population of responding cells will have a mRNA level following eq 16 and a protein level following eq 19a. The mean mRNA level in the tissue will therefore follow the time course described by eq 16, where the apparent α will be smaller than the real induction efficiency (of the inducible cells) and will be equal to the real efficiency multiplied by the proportion of inducible cells in the tissue. The average protein level will follow eq 19a with the same correction for α.
(5.2) Minimum Binding Requirement. Genes responsive to steroid hormones often exhibit a lag period for induction (Janne & Bardin, 1984; Watson et al., 1981, 1985; Brock & Shapiro, 1983a,b; Palmiter et al., 1976, 1981; Watson & Paigen, 1987). A short lag in mRNA accumulation is predicted by the basic model; however, in some cases the observed lag period is longer than the basic model predicts. Among the possible causes for an extended lag period is the requirement that a gene binds a minimum number, m, of initial receptor-hormone complex molecules before transcription can increase (Watson et al., 1981; Palmiter et al., 1981). This is an interesting case to consider since it would act to suppress biological noise, limiting gene response until a sufficiently strong hormonal signal appears. There are two distinctly different ways to incorporate a minimum binding requirement into the basic rate equations. In a threshold model, the first m receptor molecules bound to the gene do not directly increase the rate of transcription. As additional receptor molecules bind, the rate of transcription increases in proportion to the number of additional molecules bound. In this model a gene that does not bind more than m receptor molecules will be transcribed at the basal (uninduced) rate, whereas the transcription rate of any gene with more than m bound complex molecules will be proportional to $n(t) - m$, the number of bound molecules in excess of the minimum number m at time t.
Since only genes with $n(t) > m$ have increased transcription rate, the rate equation of mRNA R for the threshold model becomes
$$ \begin{aligned} dR(t)/dt &= r_0 - k_r R(t) + \alpha \sum_{n=m+1}^{N} X_n(t)(n - m) \\ &= \Big[r_0 - k_r R(t) + \alpha \sum_{n=0}^{N} X_n(t)(n - m)\Big] - \alpha \sum_{n=0}^{m} X_n(t)(n - m) \\ &= \big[r_0 - k_r R + \alpha[\bar{n}(t) - m]\big] - \alpha \sum_{n=0}^{m} X_n(t)(n - m) \end{aligned} \qquad (20a) $$
where $X_n(t)$ is the fraction of genes with n bound sites at time t, as given by eq 9. In the alternative gate model, once the number of bound receptor molecules exceeds m, every bound receptor molecule affects the rate of transcription equally. In this case a gene with more than m bound receptor molecules will be transcribed at a rate proportional to the total number of bound receptor molecules, n. The rate equation for mRNA R in the tissue will be
$$ \begin{aligned} dR(t)/dt &= r_0 - k_r R + \alpha \sum_{n=m+1}^{N} X_n(t)\,n \\ &= [r_0 - k_r R + \alpha\,\bar{n}(t)] - \alpha \sum_{n=0}^{m} X_n(t)\,n \end{aligned} \qquad (21a) $$
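The difference between the threshold and gate rules can be made concrete by evaluating the induced part of the transcription rate in eq 20a and 21a for a given single-site occupancy f. The sketch below is illustrative only: the values of N, m, α, and f are assumed, and the function names are hypothetical.

```python
import math

def state_fractions(f, n_sites):
    """Binomial fractions X_n (eq 9) for a single-site occupancy f."""
    return [math.comb(n_sites, n) * f**n * (1.0 - f)**(n_sites - n)
            for n in range(n_sites + 1)]

def induced_rate_threshold(f, n_sites, m, alpha):
    """alpha * sum over n > m of X_n (n - m): only receptors beyond m count."""
    x = state_fractions(f, n_sites)
    return alpha * sum(x[n] * (n - m) for n in range(m + 1, n_sites + 1))

def induced_rate_gate(f, n_sites, m, alpha):
    """alpha * sum over n > m of X_n n: past m, every bound receptor counts."""
    x = state_fractions(f, n_sites)
    return alpha * sum(x[n] * n for n in range(m + 1, n_sites + 1))

# Assumed example: N = 6 sites, minimum m = 2, half of all sites occupied.
print(induced_rate_threshold(0.5, 6, 2, 1.0))
print(induced_rate_gate(0.5, 6, 2, 1.0))
```

With m = 0 both functions reduce to α·N·f, the induced term of the basic model (eq 14). For m > 0 the gate rule always gives the larger rate, because genes that pass the minimum are credited with all of their bound receptors.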
Since binding of receptor molecules to sites is random among the many copies of the gene in the tissue, different copies of the gene will vary in their lag periods according to the time it takes each gene to reach m. The apparent lag period, $t_l$, for the tissue can be defined as the average time it takes for a gene to bind m receptor molecules. Evaluating the lag period defined in this way and obtaining explicit solutions of eq 20a and 21a require a knowledge of N and m, which will rarely be available. Alternatively, we can define $t_l$ by the time required for the tissue to bind an average number of m receptor molecules per gene, i.e.
$$ \bar{n}(t_l) = m $$
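Combining this definition with eq 11 gives an explicit expression for the lag, $t_l = -(1/k^0)\ln[1 - m k^0/(N k^+)]$, which exists only when m is below the saturation occupancy $N k^+/k^0$. This rearrangement is a worked step added here for illustration and is not stated in the text above; the sketch below evaluates it for assumed parameter values, with a hypothetical function name.

```python
import math

def lag_time(m, n_sites, k_plus, k_minus):
    """Time at which the mean occupancy of eq 11 reaches m."""
    k0 = k_plus + k_minus
    saturation = n_sites * k_plus / k0  # mean occupancy at infinite time
    if m >= saturation:
        raise ValueError("m is never reached: it exceeds the saturation occupancy")
    return -math.log(1.0 - m / saturation) / k0

# Assumed example: N = 8 sites, m = 2, k+ = 0.15 and k- = 0.05 per unit time.
print(lag_time(2, 8, 0.15, 0.05))
```

The guard clause reflects a physical constraint: if the required minimum m is at or above the equilibrium occupancy, the tissue-average definition predicts that induction never begins.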
These alternative definitions of $t_l$ become equivalent when N, the total number of binding sites per gene, is large. This latter definition of the lag period can be used to reformulate the minimum binding problem. If induction in the tissue only begins after the mean number of occupied sites per gene has reached a minimum value, m, the transcription rate does not increase before time $t_l$, and the level of mRNA R in the time interval $0 < t < t_l$ is constant and equals the basal (uninduced) level. After the lag period, $t_l$, the rate of transcription of mRNA R in the tissue will be proportional to either $\bar{n}(t) - m$ (in the threshold model) or to $\bar{n}(t)$ (in the gate model). Instead of eq 20a and 21a we then have the following rate equations for mRNA R:
$$ dR(t)/dt = r_0 - k_r R + \alpha(\bar{n}(t) - m) \qquad \text{threshold model} \quad (20b) $$
$$ dR(t)/dt = r_0 - k_r R + \alpha\,\bar{n}(t) \qquad \text{gate model} \quad (21b) $$
The solutions of eq 20b and 21b are formally equivalent to eq 16 with the boundary condition $R(t_l) = R(0)$. The level of mRNA R in the tissue is therefore given by
$$ R(t) = \begin{cases} r_0/k_r & \text{for } 0 < t < t_l \\ R(\infty) - (A e^{-k_r t} - B e^{-k^0 t}) & \text{for } t > t_l \end{cases} \qquad (22) $$
where $t_l$ is the lag period and A, B, and $R(\infty)$ are constants that depend on the kinetic parameters and on m. [The constants can be evaluated explicitly by using the boundary condition $R(t_l) = R(0)$.]
(5.3) Effects of Hormone on mRNA Stability. In several cases receptor-hormone complexes have been shown to act by stabilizing mRNA (Shapiro & Brock, 1985; Karlsen et al., 1986; Brock & Shapiro, 1983a; Berger et al., 1986). If maximum stabilization is achieved rapidly relative to gene activation, then eq 16 still applies, except that $k_r$ is now the (smaller) first-order rate constant for mRNA degradation in the presence of hormone. If mRNA stabilization is progressive, the time course of mRNA accumulation will also be modified by the kinetics of the stabilization process. If rapid stabilization of mRNA is the only effect of hormone administration and there is no change in transcription rate, then eq 12 applies, and $k_r$ in that equation represents the rate of mRNA degradation in the presence of hormone.
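The stabilization-only case can be sketched by solving eq 12 with the hormone-dependent degradation constant, starting from the basal level set by the original constant. The closed form used below, $R(t) = r_0/k_r' + (r_0/k_r - r_0/k_r')e^{-k_r' t}$, is the standard solution of a first-order linear rate equation and is added here as an illustration. The symbol $k_r'$, the function name, and the numbers are assumptions.

```python
import math

def stabilized_mrna(t, r0, k_r_basal, k_r_hormone):
    """mRNA level after an instantaneous switch of the degradation constant.

    Solves dR/dt = r0 - k_r_hormone * R (eq 12 with the new constant),
    starting from the uninduced steady state R(0) = r0 / k_r_basal.
    """
    r_start = r0 / k_r_basal
    r_final = r0 / k_r_hormone
    return r_final + (r_start - r_final) * math.exp(-k_r_hormone * t)

# Assumed example: hormone lowers k_r from 0.5 to 0.1 per unit time.
for t in (0.0, 5.0, 50.0):
    print(t, stabilized_mrna(t, 2.0, 0.5, 0.1))
```

In this case there is no lag: the level rises as a single exponential whose time constant is the new (slower) degradation constant, and the fold increase at steady state equals the ratio of the two degradation constants.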
EXPERIMENTAL DATA
Measurements of in vivo transcription rates and mRNA levels as functions of time after hormone administration have been reported for several inducible gene systems (see below). Unfortunately, in vivo kinetic measurements of receptor binding to chromatin are rare (Palmiter et al., 1981) and so far limited to nonspecific binding; to the best of our knowledge no one has yet reported in vivo kinetic measurements of receptor binding to specific DNA sites in an induction system.
(1) Alcohol Dehydrogenase (ADH) in Mouse Kidney Cells. Data that can be used to test the kinetic equations are available for the androgen induction of ADH. Transcription of the ADH gene increases in mouse kidney cells after androgen administration,¹ and Watson et al.² have measured ADH mRNA levels as a function of time after androgen stimulation and again subsequently after androgen withdrawal. A value of 0.13 h$^{-1}$ for $k_r$, the rate constant of mRNA degradation, can be estimated from the deinduction phase of their data (Figure 3a). The induction data (Figure 3b) are well fitted by the curve generated by eq 16, using the value of 0.13 h$^{-1}$ for $k_r$ and an optimized value of 0.044 h$^{-1}$ for $k^0$. It is interesting that the fit of the theoretical curve to the experimental data includes the early phase of induction (where there is a lag period) as shown by the enlarged part of the curve in Figure 3b.
¹G. Watson and K. Paigen, unpublished data.
²G. Watson, J. D. Ceci, M. P. O'Malley, and M. Felder, unpublished data.
[Figure 3: two panels plotted against TIME (hours). Panel annotations recoverable from the extraction: (a) $k_r$ = 0.13 hour$^{-1}$, $R(\infty)$ = 30.5; (b) $k^0$ = 0.044 hour$^{-1}$, $k_r$ = 0.13 hour$^{-1}$, $R(\infty)$ = 169.5.]
FIGURE 3: Kinetics of ADH mRNA in androgen-induced and withdrawn mouse kidney cells. mRNA levels are expressed in micrograms of ADH mRNA per gram of total RNA. (a) ADH mRNA level (symbols) as a function of time after withdrawal of hormone from fully induced animals. The fitted curve (solid line) corresponds to the indicated value for $k_r$, the first-order rate constant for mRNA degradation, as estimated from data after t = 7 h. (b) ADH mRNA levels (symbols) as a function of time after hormone administration. The curve was generated by using eq 16, with the value for $k_r$ estimated from deinduction data, and the indicated optimized values for the binding relaxation constant, $k^0$, and the mRNA steady-state level, $R(\infty)$.
(2) Conalbumin and Ovalbumin in Chick Oviduct. Palmiter et al. (1981) measured the time course of binding of estrogen-activated receptors to chromatin in chick oviduct, as well as rates of conalbumin and ovalbumin mRNA synthesis and levels of both mRNAs. They found that the transcription rate of the conalbumin gene was proportional to the number of chromatin-bound receptor molecules per nucleus. In the framework of the proposed model (eq 16) this would mean that the number of receptor molecules bound to conalbumin inducible sites is a constant fraction of the total number of chromatin-bound receptor molecules and that $k^0_{con}$, the relaxation constant for accumulation of receptor molecules on conalbumin inducible sites, is equal to the average relaxation constant for the accumulation of receptor molecules to total chromatin. The analysis of conalbumin data from Figure 1 in Palmiter et al. (1981), using eq 8, is shown in Figure 4. Receptor binding (Figure 4a) does appear to follow first-order
kinetics with $k^0$ = 0.07 h$^{-1}$, and the conalbumin transcription rate is proportional to the total number of receptor molecules bound to chromatin (Figure 4b). In contrast to conalbumin, the induced transcription rate of the ovalbumin gene was not proportional to the total number of receptor molecules bound to chromatin. The direction of the discrepancy indicates that the relaxation constant for the accumulation of receptor molecules at the ovalbumin inducible sites, $k^0_{ov}$, is smaller than $k^0_{con}$. However, $k^0_{ov}$ can be inferred from the data of induced rate of transcription of the ovalbumin gene [Figure 1A in Palmiter et al. (1981)] by assuming that transcription is proportional to the number of receptor molecules bound to inducible sites (eq 11). Analysis of the ovalbumin data is shown in Figure 4c. Palmiter et al. (1976) showed that there is a significant lag of about 3 h in the estrogen induction of ovalbumin mRNA, while no lag is seen in the induction of conalbumin mRNA. We therefore used the basic model (eq 16) to describe the data for conalbumin (Figure 4e) and the threshold model (eq 22) to describe the data for ovalbumin (Figure 4f). Using the value of 0.017 h$^{-1}$ for $k^0_{con}$, we optimized a value of 0.033 h$^{-1}$ for the rate of degradation of conalbumin mRNA, $k_r^{con}$. This value is lower than the value of 0.20 h$^{-1}$ estimated from the deinduction data (Figure 4d), suggesting that estrogens also act to stabilize conalbumin mRNA. The data for ovalbumin induction are well accounted for by the threshold model (Figure 4f). We have used the experimental value of 3 h for the lag time ($t_l$), the value of 0.16 h$^{-1}$ for $k_r^{ov}$, as estimated from the deinduction data (Figure 4d), and the value of 0.017 h$^{-1}$ for $k^0_{ov}$, as estimated from the data of ovalbumin rate of transcription (Figure 4c). Palmiter et al. (1976) hypothesized a receptor translocation process as a possible explanation for the lag.
We suggest the possibility of a threshold based on a requirement for a minimum number of bound receptor molecules. A related idea was discussed by Palmiter et al. (1981).

[Figure 4, panels d and f: plot residue omitted. Panel d legend: $k_r^{con}$ = 0.20 hour$^{-1}$; $k_r^{ov}$ = 0.16 hour$^{-1}$.]

(3) β-Glucuronidase in Mouse Kidney Cells. The androgen induction of the β-glucuronidase (Gus) gene in mice has been extensively studied by several workers (Janne & Bardin, 1984; Watson et al., 1981, 1985; Berger et al., 1986; Pfister et al., 1984; Bullock et al., 1985), and kinetic data describing changes of β-glucuronidase mRNA (GUS-mRNA) during induction have been reported for several haplotypes of the Gus gene (Watson & Paigen, 1987). These induction curves strongly resemble the curves predicted by eq 16 except that the lag periods are characteristically longer than those predicted by the basic model (eq 16). However, these data give an excellent fit to the curve predicted by the threshold model (cf. eq 22), which assumes that the first few ($m$) receptor molecules that bind do not increase the rate of transcription and that once this threshold is passed, the transcription rate is proportional to the number of additional bound receptor molecules.

The data for GUS-mRNA levels in androgen-induced kidney cells of strain C57BL/6J mice as a function of time after hormone administration are shown in Figure 5. Also shown are the best-fit curves generated by the basic model and the threshold model. In fitting these curves, the value used for $k_r$ was the experimentally determined first-order rate constant for the disappearance of GUS-mRNA after androgen withdrawal [Table I in Watson and Paigen (1987)]. The values for basal (uninduced) mRNA and saturation (fully induced) mRNA levels were directly estimated from the data. A computer algorithm was then used to optimize the relaxation constant $k^0$ (in both models) and the lag time $t_l$ (in the threshold model). The curve predicted by the gate model (in which the transcription rate after the minimum number of receptor molecules is bound is proportional to the total number of bound receptor molecules) converges very fast to the curve of the basic model and does not describe the early phase of induction appreciably better than the basic model (curve not shown).

By use of eq 20a, the duration of the lag period provides a lower bound estimate for $N$, the number of specific binding sites. Early in induction (during the lag period), most of the receptor binding sites are vacant and receptor molecules are attaching to binding sites. The average number of bound sites,
$\bar{n}(t)$, is still small, and the absolute rate of detachment is low. As a first-order approximation, binding is proceeding at a linear rate. The binding rate per site is $k^+$, and the binding rate per gene is $Nk^+$. Since the lag time, $t_l$, is the time required for binding the threshold number, $m$, per gene, we have

$$m = \bar{n}(t_l) = (Nk^+)\,t_l \tag{23a}$$

and hence

$$t_l = m/(Nk^+) \tag{23b}$$

That is, the duration of the lag period is proportional to the threshold number and inversely proportional to the binding rate constant and the total number of binding sites. Equation 23b can be rearranged to yield

$$N/m = 1/(k^+ t_l) \tag{23c}$$

Since $k^+ = k^0 - k^-$, it must be that $k^+ \le k^0$ and therefore

$$N/m \ge 1/(k^0 t_l) \tag{24}$$

For the data shown in Figure 5 we have

$$N \ge (0.13 \times 1.0)^{-1} \approx 8$$

Since $m$ is at least 1, a lower bound of 8 is estimated for the number of receptor-specific binding sites in the region of the B haplotype of the Gus gene that is present in C57BL/6J strain mice. If $m$ is greater than 1, the estimated value for the number of receptor binding sites will be proportionally higher. The equivalent calculation using the data for the A haplotype present in strain A/J mice led to a lower bound estimate of 20 specific binding sites. These lower bounds are consistent with the estimate of at least seven estrogen receptor binding regions in the hormone response region of mouse mammary tumor virus (Yamamoto, 1985b). We have limited our examples here to mRNA data. Protein data can be analyzed equivalently by using eq 19a.

FIGURE 5: Androgen induction of the β-glucuronidase gene in C57BL/6J mouse kidney cells. mRNA levels (micrograms of β-glucuronidase mRNA per gram of total kidney RNA) are expressed as a function of time (days) after hormone administration (circles). The data were adapted from Figure 2 in Watson and Paigen (1987). We have applied the threshold model (eq 22) to generate a best-fit curve (solid line), using the experimental value $k_r$ = 1.1 day$^{-1}$ for the rate of mRNA degradation [taken from Table I in Watson and Paigen (1987)] and the following optimal values: lag time, $t_l$ = 1.0 day; binding relaxation constant, $k^0$ = 0.13 day$^{-1}$; steady-state mRNA level, $R(\infty)$ = 5.5. Applying the basic model (eq 16) to these data generated a different optimal curve (dotted line) when the experimental value for $k_r$ and the optimal parameters $k^0$ = 0.11 day$^{-1}$ and $R(\infty)$ = 5.4 were used. The basic model fails to account for the early phase of the induction curve.
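The Figure 5 parameters and the bound of eq 24 can be explored numerically with a minimal Python sketch. Equations 16 and 22 are not reproduced in this section, so the sketch assumes that eq 16 has the usual two-exponential form for first-order receptor binding followed by first-order mRNA turnover, and that the threshold model (eq 22) is the same curve delayed by the lag time $t_l$. These forms, the function names `induced_mrna` and `min_sites_per_threshold`, and the choice of zero basal mRNA are assumptions made here for illustration, not the authors' implementation.

```python
import math


def induced_mrna(t, k0, kr, r_inf, t_lag=0.0):
    """Induced mRNA level at time t (assumed two-exponential form).

    k0    : relaxation constant for receptor binding
    kr    : first-order rate constant of mRNA degradation
    r_inf : steady-state (fully induced) mRNA level
    t_lag : lag time; 0 gives the basic model, > 0 the threshold model
    """
    tau = t - t_lag
    if tau <= 0.0:
        return 0.0  # no induced transcription before the threshold is reached
    if math.isclose(k0, kr):
        # limiting form when the two rate constants coincide
        return r_inf * (1.0 - (1.0 + k0 * tau) * math.exp(-k0 * tau))
    return r_inf * (
        1.0 - (kr * math.exp(-k0 * tau) - k0 * math.exp(-kr * tau)) / (kr - k0)
    )


def min_sites_per_threshold(k0, t_lag):
    """Lower bound on N/m from eq 24: N/m >= 1/(k0 * t_lag)."""
    return 1.0 / (k0 * t_lag)


# Parameter values quoted in the Figure 5 caption (time in days).
KR = 1.1
threshold = dict(k0=0.13, kr=KR, r_inf=5.5, t_lag=1.0)
basic = dict(k0=0.11, kr=KR, r_inf=5.4)

# Tabulate both assumed curves at a few time points (days).
for day in (0.5, 1, 3, 6, 12, 24):
    print(day,
          round(induced_mrna(day, **basic), 2),
          round(induced_mrna(day, **threshold), 2))

bound = min_sites_per_threshold(k0=0.13, t_lag=1.0)
print(bound, math.ceil(bound))  # N/m bound and the smallest integer N for m = 1
```

With the Figure 5 values the bound evaluates to about 7.7, which corresponds to the lower bound of 8 sites quoted in the text for $m$ = 1. Under the stated assumptions the threshold curve stays at zero until $t_l$, whereas the basic curve begins to rise immediately with zero initial slope.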
DISCUSSION

We have formulated and solved a chemical kinetic mechanism for the activation of gene transcription by multiple binding of regulatory molecules. Equations were generated for noncooperative binding occurring at equivalent and independent sites and by assuming a linear dependence of the induced rate of transcription on the number of regulatory molecules bound. This is in accord with recent results showing that the presence of multiple glucocorticoid-responsive enhancer sequences can differentially activate the mouse mammary tumor virus promoter in an additive fashion and that this increased hormone-responsive activity is relatively independent of the position of the enhancer sequences with respect to the promoter (Toohey et al., 1986). The equations developed here give a good fit to the experimental data on steroid induction of alcohol dehydrogenase in mice and estrogen induction of conalbumin in chick oviduct. Moreover, several characteristic features of the data, including the sigmoidal kinetics of mRNA accumulation and the different kinetics of transcription activation of the ovalbumin and conalbumin genes, are accounted for by the kinetic parameters.

The published experimental data for testosterone induction of GUS-mRNA in mouse kidney are not readily accounted for by the basic model because of a very long lag period before mRNA levels begin to increase. The Gus data are well predicted by the model if a threshold in receptor binding is introduced. Under the threshold model, several receptor molecules must bind to the gene before induced transcription is activated, and the subsequent rate of transcription is proportional to the number of additional receptor molecules bound to the gene. The threshold model also accounts well for the data on ovalbumin induction by estrogen in chick oviduct when kinetic parameters estimated from the experimental data are used. This is in accord with the observation (Palmiter et al., 1976) of a lag period in ovalbumin, but not conalbumin, induction. It is also in accord with the results of Searle et al. (1985) that more than one copy of the metallothionein metal response element sequence is required to make a gene inducible by metals.

An interesting feature of the threshold model is that it provides a ready explanation for the two classes of genetic variants with altered inducibility of Gus (Watson et al., 1981; Watson & Paigen, 1987; Pfister et al., 1984). One class of variants has a shortened lag period, an increased initial rate of induced mRNA accumulation, and a high steady-state level of induced mRNA but does not show a change in the time required to reach the final induced mRNA level.
The other class of variants is not changed in either the duration of the lag period or the initial rate of induced mRNA accumulation but does show a reduced final (steady-state) level of induced mRNA and reaches the plateau in a shorter time. Variants of the first class have the phenotype expected from mutations that change the total number of receptor binding sites, $N$, or the rate of attachment of receptor molecules to their binding sites, $k^+$. Variants belonging to the second class have the phenotype expected from mutations that change the rate constant for dissociation of receptor molecules from their binding sites, $k^-$, and consequently change the kinetic parameter $k^0$, which equals $k^+ + k^-$.

The threshold model also accounts for the response when animals are induced with the androgen analogue medroxyprogesterone acetate. This results in a longer lag period, a slower rate of mRNA accumulation, and a reduced steady-state level of induced mRNA (Bullock et al., 1985). Medroxyprogesterone acetate has been shown to act as a weak allosteric effector (Bardin et al., 1978; Bullock et al., 1978) and thus to give a reduced concentration of active androgen receptor molecules at hormone saturation. This would result in a slower rate for attachment of receptor molecules to DNA binding sites, which is described by the kinetic parameter $k^+$. The consequences predicted from eq 16 for a reduction in $k^+$ are the same as those seen experimentally.

The kinetic model also accounts for the observation that although testosterone induction of kidney epithelial cells is synchronous and all responsive cells begin to induce at the end of the lag period, induction is progressive within each cell thereafter (Paigen & Jakubowski, 1982).

Previous efforts to model the kinetics of mRNA accumulation for the induction of Gus (Watson et al., 1981; Pfister et al., 1984) used an equation equivalent to the integral of eq 12. As would be expected, that equation provides a good description of the data for the later time period of induction, when one of the two exponential terms in eq 16 has become very small, but not for the earlier phases of induction. Such an equation should not be expected to describe the earlier phases unless the mechanism of induction is entirely by mRNA stabilization (see section 5.3) or $k^0 \gg k_r$, so that the second exponential term in eq 16 can be neglected. This latter case describes activation of the gene as a virtually instantaneous switching from a low uninduced rate of transcription to a high induced rate of transcription.

The kinetic equations were derived by assuming multiple binding of regulatory molecules to the gene. The case of a single binding site is simply a special case of the general rate equation, with $N$ = 1. However, if there is a single binding site, then binding of receptor molecules to the gene is not progressive, and there is no place for a threshold model with a minimum number of sites bound prior to activation of transcription. The kinetic model also describes the time course of mRNA accumulation if the mechanism of induction involves mRNA stabilization as well as the activation of transcription.

Because the final outcomes of many regulatory processes are changes in the concentration of specific proteins, we have extended the kinetic treatment to include the synthesis and turnover of protein in the induced system. The final system of equations (eq 19a-e) describes the behavior of three chained catalytic cycles: activation and deactivation of DNA transcription, synthesis and degradation of mRNA, and translation of mRNA into protein with protein loss by turnover and secretion.

As a necessary first approximation, we have omitted from the kinetic equations some of the complexities that might pertain in specific gene systems.
These possibilities include cooperativity in the binding of regulatory molecules to DNA, a nonlinear relationship between binding of regulatory molecules and activation of transcription, binding of more than one species of regulatory molecules (for example, multiple hormone effects), progressive rather than immediate stabilization of mRNA by regulatory molecules, and intervention of regulatory molecules at other steps in the pathway of gene expression.

ACKNOWLEDGMENTS

We are very grateful to G. Watson, J. D. Ceci, M. P. O'Malley, and M. Felder for making their data available for our use prior to publication.

Registry No. ADH, 9031-72-5; β-glucuronidase, 9001-45-0.
REFERENCES

Bardin, C. W., Brown, T. R., Mills, N. C., Gupta, C., & Bullock, L. P. (1978) Biol. Reprod. 18, 74-83.
Berger, F. G., Loose, D., Meisner, H., & Watson, G. (1986) Biochemistry 25, 1170-1175.
Brock, M. L., & Shapiro, D. J. (1983a) J. Biol. Chem. 258, 5449-5455.
Brock, M. L., & Shapiro, D. J. (1983b) Cell (Cambridge, Mass.) 34, 207-214.
Bullock, L. P., Bardin, C. W., & Sherman, M. R. (1978) Endocrinology (Baltimore) 103, 1768-1782.
Bullock, L. P., Watson, G., & Paigen, K. (1985) Mol. Cell. Biol. 41, 179-185.
Cantor, C. R., & Schimmel, P. R. (1980) Biophys. Chem. 3, 890-892.
Hammes, G. G., & Schimmel, P. R. (1966) J. Phys. Chem. 70, 2319-2324.
Hammes, G. G., & Schimmel, P. R. (1970) Enzymes (3rd Ed.), 67-114.
Janne, O., & Bardin, C. W. (1984) Pharmacol. Rev. 36, 35s-42s.
Janne, O., Bullock, L. P., Bardin, C. W., & Jacob, S. T. (1976) Biochim. Biophys. Acta 418, 330-343.
Jaswinski, S. M. (1983) J. Biol. Chem. 258, 2754-2757.
Karlsen, K., Vallerga, J. H., & Firestone, G. L. (1986) Mol. Cell. Biol. 6, 574-585.
McKnight, G. S., & Palmiter, R. D. (1979) J. Biol. Chem. 254, 9050-9058.
Nash, H. A. (1981) Annu. Rev. Genet. 15, 143-167.
Paigen, K., & Jakubowski, A. F. (1982) Biochem. Genet. 20, 875-881.
Palmiter, R. D., Moore, P. B., & Mulvihill, E. R. (1976) Cell (Cambridge, Mass.) 8, 557-572.
Palmiter, R. D., Mulvihill, E. R., Shepherd, J. H., & McKnight, G. S. (1981) J. Biol. Chem. 256, 7910-7916.
Payvar, F., Firestone, G. L., Ross, S. R., Chandler, V. L., Wrange, O., Carlstedt-Duke, J., Gustafsson, J.-Ake, & Yamamoto, K. R. (1982) J. Cell. Biochem. 19, 241-247.
Perry, S. T., Viskochil, D. H., Ho, K.-C., Wilson, E. M., & French, F. S. (1984) in Regulation of Androgen Action (Bruchovsky, N., Chapdelaine, A., & Neumann, F., Eds.) pp 167-173, Congressdruck R. Bruckner, West Berlin.
Pfister, K., Watson, G., Chapman, V., & Paigen, K. (1984) J. Biol. Chem. 259, 5816-5820.
Price, V. E., Sterling, W. H., Tarentola, V. A., Hartley, R. W., Jr., & Rechcigl, M., Jr. (1962) J. Biol. Chem. 237, 3468-3475.
Ringold, G. M. (1983) Curr. Top. Microbiol. Immunol. 106, 79-103.
Schimke, R. T., Sweeney, E. W., & Berlin, C. M. (1964) Biochem. Biophys. Res. Commun. 15, 214-219.
Searle, P. F., Stuart, G. W., & Palmiter, R. D. (1985) Mol. Cell. Biol. 5, 1480-1489.
Segal, H. L., & Kim, Y. S. (1963) Proc. Natl. Acad. Sci. U.S.A. 50, 912-918.
Shapiro, D. J., & Brock, M. L. (1985) Biochem. Actions Horm. 12, 139-172.
Swaneck, G. E., Nordstrom, J. L., Kreuzaler, F., Tsai, M.-J., & O'Malley, B. W. (1979) Proc. Natl. Acad. Sci. U.S.A. 76, 1049-1053.
Toohey, M. G., Morley, K. L., & Peterson, D. O. (1986) Mol. Cell. Biol. 6, 4526-4538.
Watson, G., & Paigen, K. (1987) Mol. Cell. Biol. 7, 1085-1090.
Watson, G., Davey, R. A., Labarca, C., & Paigen, K. (1981) J. Biol. Chem. 256, 3005-3011.
Watson, G., Felder, M., Rabinow, L., Moore, K., Labarca, C., Tietze, C., Vander Molen, G., Bracey, L., Brabant, M., Cai, J., & Paigen, K. (1985) Gene 36, 15-25.
Yamamoto, K. R. (1985a) in Molecular Developmental Biology (Bogorad, L., Ed.) pp 131-148, Alan R. Liss, New York.
Yamamoto, K. R. (1985b) in Transfer and Expression of Eukaryotic Genes (Ginsberg, H. S., Harold, S., & Vogel, H. J., Eds.) pp 79-92, Academic, Orlando, FL.