Human ethanol-inducible P450IIE1: complete gene sequence

Biochemistry 1988, 27, 9006-9013


Biggin, M. D., Gibson, T. J., & Hong, G. F. (1983) Proc. Natl. Acad. Sci. U.S.A. 80, 3963-3965.
Birnstiel, M. L., Busslinger, M., & Strub, K. (1985) Cell (Cambridge, Mass.) 41, 349-359.
Chatterjee, B., & Roy, A. K. (1980) J. Biol. Chem. 255, 11607-11613.
Chatterjee, B., Demyan, W. F., Lalwani, N. D., Reddy, J. K., & Roy, A. K. (1983) Biochem. J. 214, 879-883.
Chatterjee, B., Majumdar, D., Ozbilen, O., Murty, C. V. R., & Roy, A. K. (1987a) J. Biol. Chem. 262, 822-825.
Chatterjee, B., Murty, C. V. R., Olson, M. J., & Roy, A. K. (1987b) Eur. J. Biochem. 166, 273-278.
Deininger, P. L. (1983) Anal. Biochem. 129, 216-223.
Farrell, S. O., & Bieber, L. L. (1983) Arch. Biochem. Biophys. 222, 123-132.
Farrell, S. O., Fiol, C. J., Reddy, J. K., & Bieber, L. L. (1984) J. Biol. Chem. 259, 13089-13095.
Feinberg, A. P., & Vogelstein, B. (1983) Anal. Biochem. 132, 6-13.
Fujiki, Y., Rachubinski, R. A., Dehesa, A. Z., & Lazarow, P. B. (1986) J. Biol. Chem. 261, 15787-15793.
Furuta, S., Hayashi, H., Hijikata, M., Miyazawa, S., Osumi, T., & Hashimoto, T. (1986) Proc. Natl. Acad. Sci. U.S.A. 83, 313-317.
Hashimoto, T. (1987) in Peroxisomes in Biology and Medicine (Fahimi, H. D., & Sies, H., Eds.) pp 97-104, Springer-Verlag, New York.
Kozak, M. (1984) Nucleic Acids Res. 12, 857-872.
Lazarow, P. B., & Fujiki, Y. (1985) Annu. Rev. Cell Biol. 1, 489-530.
Maniatis, T., Fritsch, E. F., & Sambrook, J. (1982) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY.
Markwell, M. A. K., McGroarty, E. J., Bieber, L. L., & Tolbert, N. E. (1973) J. Biol. Chem. 248, 3426-3432.
Miyazawa, S., Ozasa, H., Osumi, T., & Hashimoto, T. (1983) J. Biochem. (Tokyo) 94, 529-542.
Osumi, T., Ozasa, H., & Hashimoto, T. (1984) J. Biol. Chem. 259, 2031-2034.
Osumi, T., Ishii, N., Hijikata, M., Kamijo, K., Ozasa, H., Furuta, S., Miyazawa, S., Kondo, K., Inoue, K., Kagamiyama, H., & Hashimoto, T. (1985) J. Biol. Chem. 260, 8905-8910.
Proudfoot, N. J., & Brownlee, G. G. (1976) Nature (London) 263, 211-214.
Rachubinski, R. A., Fujiki, Y., & Lazarow, P. B. (1985) Proc. Natl. Acad. Sci. U.S.A. 82, 3973-3977.
Reddy, J. K., Goel, S. K., Nemali, M. R., Carrino, J. J., Laffler, T. G., Reddy, M. K., Sperbeck, S. J., Osumi, T., Hashimoto, T., Lalwani, N. D., & Rao, M. S. (1986) Proc. Natl. Acad. Sci. U.S.A. 83, 1747-1751.
Rosenfeld, G. C., Comstock, J. P., Means, A. R., & O'Malley, B. W. (1972) Biochem. Biophys. Res. Commun. 46, 1695-1703.
Sanger, F., Nicklen, S., & Coulson, A. R. (1977) Proc. Natl. Acad. Sci. U.S.A. 74, 5463-5467.
Solberg, H. E. (1972) Biochim. Biophys. Acta 280, 422-433.
Staden, R. (1982) Nucleic Acids Res. 10, 4731-4751.
Young, R. A., & Davis, R. W. (1983) Proc. Natl. Acad. Sci. U.S.A. 80, 1194-1198.

Human Ethanol-Inducible P450IIE1: Complete Gene Sequence, Promoter Characterization, Chromosome Mapping, and cDNA-Directed Expression

Morio Umeno, O. Wesley McBride, Chung S. Yang, Harry V. Gelboin, and Frank J. Gonzalez

Laboratory of Molecular Carcinogenesis and Laboratory of Biochemistry, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, and Department of Biochemistry, UMDNJ, New Jersey Medical School, Newark, New Jersey 07103

Received May 25, 1988; Revised Manuscript Received July 28, 1988

ABSTRACT: The human P450IIE1 gene, coding for an ethanol-inducible nitrosamine-metabolizing P-450, was isolated from a λEMBL3 genomic library and completely sequenced. The human gene spanned 11 413 base pairs and contained nine exons and a typical TATA box. Upstream and downstream DNAs of 2788 and 559 base pairs were also sequenced and compared to the rat gene. Significant areas of sequence similarity were observed within 140 base pairs upstream of the transcription start site in the rat and human genes. Human DNA 539 base pairs upstream of the transcription start site was inserted into the expression vector pSVOALΔ5', and luciferase activity was detected when the constructs were introduced into a rat hepatoma cell line. The activity was over 100-fold lower than that of pRSVL, a Rous sarcoma virus LTR-driven luciferase gene. By use of panels of rodent-human cell hybrids, the gene was mapped to chromosome 10 (CYP2E locus). A full-length cDNA, constructed with the first exon of the genomic clone and a partial cDNA clone, was expressed in COS cells and found to code for N-nitrosodimethylamine demethylase activity.

Supported in part by NIH Grant ES 03938.
The nucleic acid sequence in this paper has been submitted to GenBank under Accession Number J02843.
Laboratory of Molecular Carcinogenesis, NIH. Laboratory of Biochemistry, NIH. Department of Biochemistry, New Jersey Medical School.
Biochemistry, Vol. 27, No. 25, 1988. 0006-2960/88/0427-9006$01.50/0 © 1988 American Chemical Society

The cytochrome P-450 gene superfamily consists of at least ten families of enzymes that share several common features, among which is the presence of a noncovalently bound heme associated with a conserved cysteine-containing peptide near the carboxy terminus of the enzyme. The P-450s all contain about 500 amino acid residues and a noncleaved signal sequence and are intrinsically bound to endoplasmic reticulum membranes. These enzymes carry out a myriad of diverse biotransformations including both anabolic and catabolic reactions. The enzymology (Waxman, 1986; Black & Coon,

1986) and molecular biology (Nebert & Gonzalez, 1987) of P-450s have been reviewed.

P450IIE1 is an unusual enzyme since it is involved in the metabolism of both endogenous substances and foreign compounds such as drugs and carcinogens. In rodents, IIE1 was shown to be both a substrate-inducible acetone and acetol monooxygenase that may be involved in gluconeogenesis (Casazza et al., 1984; Koop & Casazza, 1985) and an efficient N-nitrosodimethylamine demethylase (Tu & Yang, 1983; Yang et al., 1985a,b; Patten et al., 1986; Thomas et al., 1987).

Rat IIE1 is under both transcriptional and posttranscriptional control. The 5-fold increase in IIE1 protein levels after administration of ethanol, acetone, and other substances is not associated with an increase in IIE1 mRNA levels (Song et al., 1986; Hong et al., 1987). In contrast, when rats are starved or made diabetic, the level of IIE1 mRNA is increased severalfold (Song et al., 1987; Hong et al., 1987). However, this increase is the result of posttranscriptional mRNA stabilization (Song et al., 1987). During development, the IIE1 gene becomes transcriptionally activated immediately after birth and is maximally expressed within 1 week post partum (Song et al., 1986, 1987; Hong et al., 1987; Umeno et al., 1988).

In the present paper we describe the complete sequence of the human IIE1 gene and an analysis and comparison of the rat and human IIE1 promoters. We also determined the chromosome location of the human IIE1 gene and expressed the IIE1 cDNA in COS cells.

MATERIALS AND METHODS

Materials. The luciferase expression vectors pSVOALΔ5' and pRSVL were generously provided by Dr. Suresh Subramani, Department of Biology, University of California, San Diego. The expression vector p91023(B) was provided by Dr. Ronald J. Kaufman of Genetics Institute, Boston, MA.

Cloning and Sequence of the IIE1 Gene. A λEMBL3 library was prepared from a single adult human liver.
DNA was isolated, partially digested with Sau3A, and size fractionated on a 5-20% NaCl gradient, and the 15-20-kbp fragments were ligated to BamHI-digested λEMBL3 arms. The ligated DNA was packaged (Gigapak Gold, Vector Cloning Systems), and the recombinant phage were plated on Escherichia coli K802. Phage were screened with [α-32P]IIE1 cDNA labeled by nick translation (BRL, Inc.) with [α-32P]dCTP (3000 Ci/mmol, Amersham, Inc.). Phage clones were amplified on E. coli LE392, and DNA was isolated from CsCl-banded phage particles. EcoRI fragments from purified phage clones were subcloned into pUC9 and sequenced by shotgun cloning into M13 (Deininger, 1983) in conjunction with the dideoxy chain termination method (Sanger et al., 1977). Each base was sequenced an average of five times and at least once in both directions. Sequence data were assembled with the Beckman Microgenie program.

Primer Extension and S1 Nuclease Mapping. The transcription start site was determined by both S1 mapping and primer extension analyses. A 386-bp fragment encompassing the first exon and upstream DNA was isolated from a genomic subclone with the enzymes NheI and XmnI, which cleave at positions -217 and +169, respectively (Figure 2). The fragment was 5' end labeled with T4 polynucleotide kinase and [γ-32P]ATP (5000 Ci/mmol, Amersham Corp.). S1 nuclease mapping was then performed as described (Sharp et al., 1980) with human liver poly(A) RNA. A sequencing reaction was also performed on the 386-bp fragment according to the chemical cleavage procedure of Maxam and Gilbert (1980). An EcoRI (position -539, Figure 2) to XmnI fragment was first 5' end labeled prior to gel purification of the NheI-XmnI

FIGURE 1: Restriction map of the human IIE1 gene. The map was generated by single and double restriction enzyme digestions with EcoRI (E), BamHI (B), and SalI (S). [Map not reproduced: it shows the clone λhP450IIE1 and its subclones against a base-pair scale.]

fragment. Primer extension analysis was carried out with a 25-mer oligonucleotide synthesized on an Applied Biosystems Model 380B synthesizer. The oligonucleotide corresponded to positions +144 to +169 (Figure 2). The primer was annealed to 10 μg of human poly(A) RNA and extended with AMV reverse transcriptase. The sequencing reaction, S1-protected fragments, and primer-extended material were subjected to electrophoresis on a 15% polyacrylamide-8.3 M urea sequencing gel.

IIE1 Promoter Luciferase Expression. The human IIE1 promoter region was excised from a genomic subclone with EcoRI (position -539, Figure 2) and SmaI (position +8, Figure 2), treated with Klenow fragment, ligated with HindIII linkers, and introduced into pUC9. The human IIE1 promoter fragment was then excised from pUC9 and ligated to pSVOALΔ5', the promoter-lacking luciferase expression vector (de Wet et al., 1987). Plasmids were purified by two CsCl gradients and transfected into the rat hepatoma cell line McA-RH7777 (ATCC CRL 1601) by use of the calcium phosphate procedure (Graham & van der Eb, 1973) in conjunction with a glycerol shock (Parker & Stark, 1979). Luciferase activities were measured by use of an LKB luminometer.

Chromosome Mapping. Characterization of the hybrid cell lines used in this paper has been detailed elsewhere (McBride et al., 1982a,b,c, 1986). DNA from panels of hybrid cell lines was digested with EcoRI, electrophoretically resolved on agarose gels, and blotted to nylon membranes. Hybridization and washing were performed as described (McBride et al., 1986).

Expression of the IIE1 cDNA in COS Cells. A full-length cDNA for human IIE1 was constructed by joining the portion of the first exon of the IIE1 gene to a partial cDNA clone that lacked the coding region for four amino acids at its 5' end (Song et al., 1986).
This was accomplished by use of a unique XmnI site in the cDNA (position +169, Figure 2) and an EcoRI site upstream of the gene's start site (position -539, Figure 2). The EcoRI/XmnI fragment was ligated to the partial IIE1 cDNA clone at the cDNA's XmnI site, and the upstream region of this construct was removed by digestion with BglI (position +29, Figure 2). After treatment with T4 DNA polymerase and addition of EcoRI linkers, the final cDNA containing the complete coding information for IIE1 was inserted into the EcoRI site of p91023(B) (Wong et al., 1985). The construct, designated p91023(B)-IIE1, was transfected into COS-1 cells as described (Luthman & Magnusson, 1983). Expression was quantitated by immunoblot analysis (Towbin et al., 1979) using anti-rat IIE1 (Song et al., 1987), and N-nitrosodimethylamine demethylase assays were performed as previously outlined (Yoo et al., 1987).

Other Methods. Southern blotting (Southern, 1975) was performed by the alkaline transfer method (Reed & Mann, 1985). DNA was digested, electrophoretically resolved on 1% agarose gels, depurinated, and transferred to nylon membranes

FIGURE 2: Sequence of the human IIE1 gene. The complete sequence of the structural gene and flanking regions is displayed. The mRNA cap site, determined by S1 mapping and primer extension, is designated +1. DNA upstream of the cap site and downstream of the mRNA poly(A) addition site is numbered beginning with -1 and +1, respectively. The TATA box, initiator methionine, heme binding cysteine, termination codon, and consensus poly(A) addition signal are outlined by black boxes. [The nucleotide sequence listing and the deduced amino acid sequence are not reproduced here.]
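The numbering described in this legend has no position 0: the cap site is +1 and the base immediately upstream is -1. As a minimal sketch (the helper name `span_bp` is hypothetical and is not part of the original work), the fragment sizes quoted in Materials and Methods can be checked from the positions alone, assuming each quoted position marks the first or last base of the fragment:

```python
def span_bp(start: int, end: int) -> int:
    """Count the bases from `start` to `end` inclusive in a numbering
    scheme that skips 0 (..., -2, -1, +1, +2, ...)."""
    if start == 0 or end == 0:
        raise ValueError("position 0 does not exist in this numbering")
    if start > end:
        start, end = end, start
    length = end - start + 1
    # A span that crosses the -1/+1 boundary must not count position 0.
    if start < 0 < end:
        length -= 1
    return length


# NheI (-217) to XmnI (+169): the fragment used for S1 mapping.
nhe_xmn = span_bp(-217, 169)

# EcoRI (-539) to SmaI (+8): the promoter fragment placed in the luciferase vector.
eco_sma = span_bp(-539, 8)

# Total DNA sequenced: 2788 bp upstream, the 11 413-bp gene, and 559 bp downstream.
total_sequenced = span_bp(-2788, 11413) + 559

print(nhe_xmn, eco_sma, total_sequenced)
```

Under that assumption the NheI-XmnI span comes out at 386 bp, in agreement with the fragment size stated in the text.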

Table I: Comparison of the Exons and Introns of the Rat and Human IIE1 Genes

                         percent          intron size (bp)
region                   similarity(a)    human      rat
5' flanking
  -1500 to -1000         37
  -1000 to -1            50
exon I                   75
intron 1                 34               904        761
exon II                  75
intron 2                 36               2938       3405
exon III                 78
intron 3                 30               388        305
exon IV                  82
intron 4                 33               406        325
exon V                   75
intron 5                 40               881        788
exon VI                  84
intron 6                 32               2833       1873
exon VII                 83
intron 7                 48               499        496
exon VIII                84
intron 8                 25               882        776
exon IX                  76 (coding)
3' flanking
  +1 to +500             25

(a) Percent similarity was determined with the NUCALN program of Wilbur and Lipman (1983). Exon IX similarity was computed through the coding region only. The 3' noncoding region showed no significant similarity.
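The intron sizes in Table I can be compared directly. The following is a small sketch, assuming the two size columns list introns 1 to 8 in order, human first and then rat; the variable names are illustrative only and the published similarity values are not recomputed here:

```python
# Intron sizes in base pairs as listed in Table I (introns 1-8).
human_bp = [904, 2938, 388, 406, 881, 2833, 499, 882]
rat_bp = [761, 3405, 305, 325, 788, 1873, 496, 776]

# Human/rat size ratio for each intron; values near 1 mean similar size.
ratios = [h / r for h, r in zip(human_bp, rat_bp)]

for number, (h, r, q) in enumerate(zip(human_bp, rat_bp, ratios), start=1):
    print(f"intron {number}: human {h:>4} bp, rat {r:>4} bp, human/rat = {q:.2f}")

# The intron with the largest human/rat ratio (1-based numbering).
most_expanded = max(range(len(ratios)), key=ratios.__getitem__) + 1

# Share of the 11 413-bp human gene that is intron sequence.
human_intron_total = sum(human_bp)
intron_fraction = human_intron_total / 11413

print(most_expanded, human_intron_total, round(intron_fraction, 2))
```

With these values the largest ratio falls on intron 6, and most other introns stay within roughly 30% of the rat size, with intron 2 being the one case where the rat intron is the longer.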

(Biotrace RP, Gelman Sciences). Membranes were prehybridized and hybridized at 50 °C in Hybrisol (Oncor) with nick-translated probes. Washing was performed at 65 °C in 0.3 M NaCl, 30 mM sodium citrate, and 0.2% NaDodSO4. Under these conditions only sequences with less than 20% divergence are detected. Protein electrophoresis was performed as described by Laemmli (1970). Western blotting was carried out by the method of Towbin et al. (1979), as previously outlined (Song et al., 1987).

RESULTS AND DISCUSSION

Isolation and Sequence of the IIE1 Genomic Clone. A human liver DNA λEMBL3 library was screened with the IIE1 cDNA probe, and several clones were isolated. A map of the clone containing the complete IIE1 gene and the restriction enzyme sites used for subcloning and sequencing are displayed in Figure 1. The clone, designated λhP450IIE1, contained an insert of approximately 17 kbp. The complete sequence of the human IIE1 gene was determined by shotgun DNA sequencing (Figure 2). The gene spanned 11 413 bp from the start site to the poly(A) addition site (determined from the cDNA sequence). DNA of 2788 and 559 bp, upstream and downstream, respectively, was also sequenced.

Table II: Expression of the Human IIE1 Gene Promoter Linked to the Luciferase cDNA(a)

                    mV/mg
construct           expt 1    expt 2
phIIE1L (-539)      1.25      2.38
pRSVL               333       250

(a) Supercoiled plasmid DNAs were transfected onto McA-RH7777 cells, and sonicated cell extracts were assayed after 72 h of incubation. Results are the average of two transfection experiments. Values ranged about 20% between duplicate transfections in the same experiment. In each experiment a different stock of cells and a different plasmid preparation was used. Results are expressed as millivolts per milligram of cellular protein. The vector pSVOALA5' produced no detectable luciferase activity, similar to that reported by de Wet et al. (1987).
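The relative promoter strengths in Table II can be expressed as fold differences. The following is a minimal sketch using only the four values in the table; the dictionary and function names are hypothetical, introduced for illustration.

```python
# Luciferase activity in mV per mg of cellular protein (expt 1, expt 2), from Table II.
LUCIFERASE_MV_PER_MG = {
    "phIIE1L (-539)": (1.25, 2.38),
    "pRSVL": (333.0, 250.0),
}


def fold_lower(test_values, reference_values):
    """Return reference/test for each experiment (how many fold lower the test is)."""
    return [ref / test for test, ref in zip(test_values, reference_values)]


if __name__ == "__main__":
    folds = fold_lower(LUCIFERASE_MV_PER_MG["phIIE1L (-539)"],
                       LUCIFERASE_MV_PER_MG["pRSVL"])
    for expt, fold in enumerate(folds, start=1):
        print(f"expt {expt}: IIE1 promoter activity is {fold:.0f}-fold lower than pRSVL")
```

The two ratios come out at roughly 266 and 105, which fall within the 100-300-fold range quoted in the text.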

Comparison of the human and rat introns revealed that several are of similar size but share much less nucleotide similarity than the exons (Table I). Only intron 6 was substantially larger in the human gene. Intron 7 had the highest level of similarity at 48%, whereas intron 3 was only 30% similar. In general, the 3' half of the IIE1 gene is more similar between rat and human. All intron-exon boundaries were precisely conserved between the two species.

Determination of the Transcription Start Site. The transcription start site was determined by both S1 mapping and primer extension (Figure 3). Both procedures produced fragments of identical sizes that corresponded to a polymerase start site at a G residue when compared with a sequencing ladder of the fragment used for S1 protection. This start site was located 23 bp downstream from a TATA box (Figure 3). A CCATT box was also detected at -144 bp from the start site. The 5' untranslated region of the mRNA was only 36 nucleotides.

Comparison of Rat and Human IIE1 Upstream DNA Sequences. Alignment of rat and human upstream DNA revealed 50% overall nucleotide similarity within 1000 bp of the transcription start site (Table I). This amount of similarity is significantly more than the 34% and 37% similarity detected in the first intron and the -1500 to -1000-bp region, respectively. More significant similarities were observed within 150 bp of the polymerase start site. Two regions, arbitrarily designated region I and region II, shared 80% and 94% nucleotide similarity, respectively (Figure 4). The significance of these highly similar segments, however, is still unclear.

Analysis of the Human IIE1 Upstream DNA by a Promoter Expression Vector. To determine if the DNA immediately upstream of the start site in the human gene was competent

Table III: Segregation of the IIE1 Gene with Individual Human Chromosomes(a)

FIGURE 3: Determination of the IIE1 mRNA cap site. S1 nuclease mapping and primer extension were carried out on human liver poly(A) RNA. The resultant fragments were resolved on a 15% polyacrylamide-8.3 M urea gel. A DNA sequence ladder generated by chemical cleavage of the S1 mapping probe was electrophoresed on the same gel. The 5' end of this fragment is coincident with the 5' end of the primer used for primer extension (illustrated at the bottom of the figure). The precise fragment lengths were identified on a separate lower percentage gel and correspond to the G residue demarcated at the left of the figure.

to drive a promoter-lacking expression vector, it was inserted into pSVOALA5' (de Wet et al., 1987). The human promoter construct, designated phIIE1L (-539), expressed luciferase activity when transfected into the rat hepatoma cell line McA-RH7777 (ATCC CRL 1601) (Table II); however, the activity was 100-300-fold less than that of pRSVL (de Wet et al., 1987). The low activity of the IIE1 promoter reflects the fact that the endogenous IIE1 gene is expressed at a level considerably less than that observed in rat liver.¹ Further experimentation will be required to determine the DNA elements necessary for efficient promoter activity.

Chromosome Mapping of the IIE1 Gene. The chromosome location of the IIE1 gene was determined according to the somatic cell hybrid strategy. Human-rodent hybrid cells

¹ T. Aoyama, S. Kimura, H. V. Gelboin, and F. J. Gonzalez, unpublished results.

[Table III column headings: human chromosome; gene/chromosome +/+, +/-, -/+, -/-; % discordancy. The per-chromosome counts are not legible in this copy.]

(a) The human IIE1 gene was detected as 3.0-, 6.0-, and 7.0-kbp hybridizing bands in EcoRI digests of somatic cell hybrid DNAs after Southern hybridization with a 1.6-kbp cDNA probe. Detection of the human IIE1 gene is correlated with the presence or absence of each human chromosome in the group of somatic cell hybrids. Discordancy represents the presence of the gene in the absence of the chromosome or the absence of the gene despite the presence of the chromosome, and the sum of these numbers divided by total hybrids examined (×100) represents percent discordancy. The human-hamster hybrids consisted of 18 primary hybrids and 15 subclones (7 positives of 33 total), and the human-mouse hybrids included 13 primary clones and 31 subclones (5 positives of 44 total).
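The percent discordancy defined in the Table III footnote can be written out as a small function. This is a sketch under stated assumptions: the function name is hypothetical, the chromosome 10 counts are inferred from the footnote totals (12 positives among 77 hybrids) together with the complete concordance reported in the text, and the second set of counts is invented purely to show a nonzero result.

```python
def percent_discordancy(both_present, gene_only, chromosome_only, both_absent):
    """Percent of hybrids in which gene and chromosome do not segregate together.

    Arguments correspond to the gene/chromosome classes +/+, +/-, -/+ and -/-.
    """
    total = both_present + gene_only + chromosome_only + both_absent
    if total == 0:
        raise ValueError("no hybrids were scored")
    return 100.0 * (gene_only + chromosome_only) / total


if __name__ == "__main__":
    # Inferred for chromosome 10: 7 + 5 positives, 33 + 44 hybrids, no discordant clones.
    print(percent_discordancy(12, 0, 0, 65))
    # Hypothetical counts for a non-syntenic chromosome, for illustration only.
    print(round(percent_discordancy(5, 7, 20, 45), 1))
```

A chromosome that carries the gene should give a value at or near zero, while unlinked chromosomes give substantially higher values.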


containing one or more human chromosomes in a background of rodent chromosomes were analyzed by Southern blotting with the IIE1 cDNA probe. Human-specific fragments of 7, 6, and 3 kbp were detected in several of the hybrid cell lines (data not shown). Under our hybridization and washing conditions, endogenous hamster IIE1 gene sequences were not detected, while 10- and 3.2-kbp bands were observed in mouse DNA. The human IIE1 sequences were found only in those hybrids containing human chromosome 10, while in hybrids lacking this chromosome the human-specific fragments were not seen. The results, summarized in Table III, demonstrate a complete concordance of the presence of the IIE1 gene with chromosome 10. Recently, a member of the P450IIC gene subfamily was mapped to chromosome 10 (Okino et al., 1987). In addition, the gene coding for a P-450 involved in steroid biosynthesis (P450XVII) is located on chromosome 10 (Matteson et al., 1986). It appears, therefore, that several P-450 genes exist on this chromosome. The genes coding for drug- and carcinogen-metabolizing P-450s might be involved with or linked to the recently described multiple endocrine neoplasia type 2A (MEN2A) locus on chromosome 10 (Mathew et al., 1987; Simpson et al., 1987). Other genes in the P450II family, however, are on different chromosomes. For instance, the human IIA gene subfamily is on chromosome 19 (Phillips et al., 1985), and the IID subfamily is located on chromosome 22 (Gonzalez et al., 1988).

cDNA-Directed Expression of IIE1. Complementary DNA expression was performed to determine whether the human IIE1 had similar enzymatic characteristics to its extensively studied rodent counterparts. A portion of the first exon of the IIE1 gene was joined to a partial IIE1 cDNA (Song et al.,

FIGURE 4: Alignment of rat and human IIE1 gene upstream DNA. Nucleotide alignment was performed with the program described by Wilbur and Lipman (1983). The rat sequence on the lower line is continuous and numbered as described in Umeno et al. (1988). Only those bases that differ between human and rat are displayed in the human sequence in the top line. [Alignment panel: rat sequence from about -350 to the transcription start site, with region II and region I marked; the base-by-base alignment is not legible in this copy.]
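The display convention in Figure 4 (a continuous reference line, with only the differing bases shown on the other line) and the notion of percent similarity can be illustrated with a short sketch. It assumes the two sequences are already aligned; it is not the NUCALN procedure of Wilbur and Lipman, and the sequences below are toy strings rather than data from the figure. Function names are hypothetical.

```python
def expand_differences(reference, differences):
    """Rebuild a full sequence from a reference line and a differences-only line.

    A space in `differences` means "same base as the reference at this position".
    """
    padded = differences.ljust(len(reference))
    return "".join(r if d == " " else d for r, d in zip(reference, padded))


def percent_identity(seq_a, seq_b, gap="-"):
    """Percent of aligned columns with identical bases; gaps count as mismatches."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap and b == gap:
            continue  # a column that is a gap in both sequences carries no information
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


if __name__ == "__main__":
    rat_toy = "ACGTACGTAC"       # toy reference line
    human_diff = "    T     "    # only position 5 differs
    human_toy = expand_differences(rat_toy, human_diff)
    print(human_toy, percent_identity(rat_toy, human_toy))
```

Counting gaps as mismatches is one possible convention; a scoring scheme that treats gaps differently would give somewhat different percentages for the same alignment.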

Table IV: N-Nitrosodimethylamine Demethylase Activities in COS Cells Transfected with p91023(B)-IIE1(a)

experiment   transfected construct   specific activity [pmol min^-1 (mg of protein)^-1](b)
1            p91023(B)               ND
1            p91023(B)-IIE1          8.0
2            p91023(B)               ND
2            p91023(B)-IIE1          8.4
3            p91023(B)               ND
3            p91023(B)-IIE1          8.3 (12.6)

(a) COS cell microsomes were isolated from cultures transfected with p91023(B) vector alone or p91023(B)-IIE1 containing the coding region of the IIE1 cDNA.
(b) ND, no activity was detected above background. The number in parentheses indicates the specific activity after a 15-min preincubation of microsomes with 1.0 kilounit of rat NADPH-P-450 oxidoreductase at 37 °C prior to addition of substrate. The substrate and protein concentrations were 40 µM and 480-640 µg/mL, respectively. Incubations were carried out for 20 min.
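The effect of added oxidoreductase in Table IV can be summarized as a fold change. The sketch below uses only the values in the table; the variable and function names are hypothetical, and the mean across experiments is computed here for convenience rather than reported by the authors.

```python
# Specific activity, pmol/min per mg of protein, for p91023(B)-IIE1 (Table IV).
IIE1_ACTIVITY = {1: 8.0, 2: 8.4, 3: 8.3}
# Experiment 3 after preincubation with rat NADPH-P-450 oxidoreductase.
EXPT3_WITH_REDUCTASE = 12.6


def fold_enhancement(baseline, stimulated):
    """Return stimulated/baseline activity."""
    if baseline <= 0:
        raise ValueError("baseline activity must be positive")
    return stimulated / baseline


def mean_activity(activities):
    """Arithmetic mean of the per-experiment activities."""
    return sum(activities.values()) / len(activities)


if __name__ == "__main__":
    fold = fold_enhancement(IIE1_ACTIVITY[3], EXPT3_WITH_REDUCTASE)
    print(f"reductase enhancement: {fold:.1f}-fold")
    print(f"mean activity without added reductase: {mean_activity(IIE1_ACTIVITY):.2f}")
```

The ratio of 12.6 to 8.3 is about 1.5, matching the enhancement described for the IIE1-transfected cells.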

FIGURE 5: Expression of the IIE1 cDNA in COS cells. The IIE1 cDNA was inserted into p91023(B) and transfected into COS cells. Production of IIE1 protein was monitored by Western blot analysis. Total COS cell protein (100 µg) and human liver microsomal protein (10 µg) were analyzed for IIE1 by Western blotting using polyclonal antibody to rat IIE1.

1986), and the cDNA containing the complete amino acid coding sequence was inserted into the expression vector p91023(B) (Wong et al., 1985). Western immunoblot analysis confirmed that the IIE1 protein was expressed in COS cells transfected with the IIE1 cDNA, and its mobility was identical with that of the IIE1 protein detected in human liver microsomes (Figure 5). IIE1 was not detected in COS cells transfected with vector alone. Only those cells that were transfected with the IIE1 cDNA expressed N-nitrosodimethylamine demethylase activity (Table IV). Addition of purified rat NADPH-P-450 oxidoreductase further enhanced the activity of the IIE1-transfected cells 1.5-fold. These results conclusively demonstrated that human IIE1 possesses nitrosamine-metabolizing activity, similar to its rodent counterpart.

Registry No. P-450IIE1, 117039-30-2; human P-450IIE1 gene, 117039-31-3; human P-450IIE1 mRNA, 117039-32-4; cytochrome P-450, 9035-51-2; N-nitrosodimethylamine demethylase, 69772-95-8; ethanol, 64-17-5.

REFERENCES

Black, S., & Coon, M. (1986) in Cytochrome P450: Structure, Mechanism, and Biochemistry (Ortiz de Montellano, P. R., Ed.) pp 161-216, Plenum, New York.
Casazza, J. P., Felver, M. E., & Veech, R. L. (1984) J. Biol. Chem. 259, 231-236.
Deininger, P. L. (1983) Anal. Biochem. 129, 216-223.
de Wet, J. R., Wood, K. V., De Luca, M., Helinski, D. R., & Subramani, S. (1987) Mol. Cell. Biol. 7, 725-737.
Gonzalez, F. J., Vilbois, F., Hardwick, J. P., McBride, O. W., Nebert, D. W., Gelboin, H. V., & Meyer, U. A. (1988) Genomics 2, 174-179.
Graham, F. L., & van der Eb, A. J. (1973) J. Virol. 52, 456-467.
Hong, J., Pan, J., Gonzalez, F. J., Gelboin, H. V., & Yang, C. S. (1987) Biochem. Biophys. Res. Commun. 142, 1077-1083.


Koop, D. R., & Casazza, J. P. (1985) J. Biol. Chem. 260, 13607-13612.
Laemmli, U. K. (1970) Nature (London) 227, 680-685.
Luthman, H., & Magnusson, G. (1983) Nucleic Acids Res. 11, 1295-1305.
Mathew, C. G. P., Chin, K. S., Easton, D. F., Thorpe, K., Caster, C., Liou, G. I., Fong, S.-L., Bridges, C. D. B., Haak, H., Nieuwenhuijzen Kruseman, A. C., Schifter, S., Hansen, H. H., Telenius, H., Telenius-Berg, M., & Ponder, B. A. J. (1987) Nature (London) 328, 527-528.
Matteson, K. Y., Picado-Leonard, J., Chung, Bo-C., Mohandas, T. K., & Miller, W. L. (1986) J. Clin. Endocrinol. Metab. 63, 789-791.
Maxam, A. M., & Gilbert, W. (1980) Methods Enzymol. 65, 499-560.
McBride, O. W., Battey, J., Hollis, G. F., Swan, D. C., Siebenlist, U., & Leder, P. (1982a) Nucleic Acids Res. 10, 8155-8170.
McBride, O. W., Hieter, P. A., Hollis, G. F., Swan, D. C., Otey, M. C., & Leder, P. (1982b) J. Exp. Med. 155, 1480-1490.
McBride, O. W., Swan, D. C., Santos, E., Barbacid, M., Tronick, S. R., & Aaronson, S. A. (1982c) Nature (London) 300, 773-774.
McBride, O. W., Merry, D., & Givol, D. (1986) Proc. Natl. Acad. Sci. U.S.A. 83, 130-134.
Nebert, D. W., & Gonzalez, F. J. (1987) Annu. Rev. Biochem. 56, 945-993.
Okino, S. T., Quattrochi, L. C., Pendurthi, U. R., McBride, O. W., & Tukey, R. H. (1987) J. Biol. Chem. 262, 16072-16079.
Parker, B., & Stark, G. (1979) J. Virol. 31, 360-369.
Patten, C. J., Ning, S. M., Lu, A. Y. A., & Yang, C. S. (1986) Arch. Biochem. Biophys. 251, 629-638.
Reed, K. C., & Mann, D. A. (1985) Nucleic Acids Res. 13, 7207-7221.
Sanger, F., Nicklen, S., & Coulson, A. R. (1977) Proc. Natl. Acad. Sci. U.S.A. 74, 5463-5467.


Sharp, P. A., Berk, A. J., & Berget, S. M. (1980) Methods Enzymol. 65, 750-768.
Simpson, N. E., Kidd, K. K., Goodfellow, P. J., McDermid, H., Myers, S., Kidd, J. R., Jackson, C. E., Duncan, A. M. V., Farrer, L. A., Brasch, K., Castiglione, C., Genel, M., Gertner, J., Greenberg, C. R., Gusella, J. F., Holden, J. J. A., & White, B. N. (1987) Nature (London) 328, 528-530.
Sinclair, J., Cornell, N. W., Zaitlin, L., & Hansch, C. (1986) Biochem. Pharmacol. 35, 707-710.
Song, B.-J., Gelboin, H. V., Park, S.-S., Yang, C. S., & Gonzalez, F. J. (1986) J. Biol. Chem. 261, 16689-16697.
Song, B. J., Matsunaga, T., Hardwick, J. P., Park, S.-S., Veech, R. L., Yang, C. S., Gelboin, H. V., & Gonzalez, F. J. (1987) Mol. Endocrinol. 1, 542-547.
Southern, E. M. (1975) J. Mol. Biol. 98, 503-517.
Thomas, P. E., Bandiera, S., Maines, S. L., Ryan, D. E., & Levin, W. (1987) Biochemistry 26, 2280-2289.
Towbin, H., Staehelin, T., & Gordon, J. (1979) Proc. Natl. Acad. Sci. U.S.A. 76, 4350-4355.
Tu, Y. Y., & Yang, C. S. (1983) Cancer Res. 42, 623-629.
Umeno, M., Song, B. J., Kozak, C., Gelboin, H. V., & Gonzalez, F. J. (1988) J. Biol. Chem. 263, 4956-4962.
Waxman, D. J. (1986) in Cytochrome P450: Structure, Mechanism, and Biochemistry (Ortiz de Montellano, P. R., Ed.) pp 525-539, Plenum, New York.
Wilbur, W. J., & Lipman, D. J. (1983) Proc. Natl. Acad. Sci. U.S.A. 80, 726-730.
Wong, G. G., Witek, J. S., Temple, P. A., Wilkens, K. M., Leary, A. C., Luxenberg, D. P., Jones, S. S., Brown, E. L., Kay, R. M., Orr, E. C., Shoemaker, C., Golde, D. W., Kaufman, R. J., Hewick, R. M., Wang, E. A., & Clark, S. C. (1985) Science (Washington, D.C.) 228, 810-815.
Yang, C. S., Koop, D. R., Wang, T., & Coon, M. J. (1985a) Biochem. Biophys. Res. Commun. 128, 1007-1013.
Yang, C. S., Tu, Y. Y., Koop, D. R., & Coon, M. J. (1985b) Cancer Res. 45, 1140-1145.
Yoo, J. S. H., Chung, R., Patten, C., Wade, D., & Yang, C. S. (1987) Cancer Res. 47, 3378-3383.