Structure and expression of the bovine amelogenin gene

Biochemistry 1991, 30, 1075-1079


Carolyn Gibson, Ellis Golub, Richard Herold, Marjorie Riser, Wendi Ding, Hitoyata Shimokawa, Marian Young, John Termine, and Joel Rosenbloom*

Department of Anatomy and Histology and Research Center in Oral Biology, School of Dental Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, and Bone Research Branch, National Institute of Dental Research, Bethesda, Maryland 20892

Received June 4, 1990; Revised Manuscript Received August 23, 1990

ABSTRACT: In order to define further the mechanisms responsible for tooth amelogenin heterogeneity, seven bovine amelogenin cDNAs were sequenced. On the basis of these sequences, five of the cDNAs could be grouped into one class which differed appreciably in sequence from the second group of two cDNAs. Two overlapping bovine genomic clones were then isolated and shown by sequencing to contain six exons encoding the entire consensus sequence of the class I cDNA. Southern blot analysis of DNA from male and female animals using class I or class II specific oligonucleotide probes suggested that the class I gene sequence was located on the X chromosome while the class II sequence was located on the Y chromosome. Therefore, these results also suggest that the genes on the X and Y chromosomes are both transcribed. Furthermore, the results are consistent with alternative splicing of the class I primary transcript as a potential mechanism for generating amelogenin heterogeneity.

During enamel formation by ameloblasts, the protein content of the extracellular matrix that is first secreted is relatively high (30%) but declines during the maturation phase to less than 2% (Frank, 1979; Burgess & McClaren, 1965). Although these proteins appear to be a complex mixture containing up to 20 components, it has been hypothesized that they are derived from relatively few high molecular weight precursors by programmed degradation or artifactual degradation during sample preparation (Robinson et al., 1982; Sasaki et al., 1982). The proteins obtained from mammalian enamel have been divided operationally into two classes, amelogenins and enamelins, based upon their solubilization with powerful chaotropic solvents such as guanidine hydrochloride (Gu·HCl).

Supported by National Institutes of Health Grants DE-08239, DE-08063, DE-07126, and DE-09164.
The nucleic acid sequence in this paper has been submitted to GenBank under Accession Number 505307.
*To whom correspondence should be addressed at the Research Center in Oral Biology, School of Dental Medicine, University of Pennsylvania, Philadelphia, PA 19104.

0006-2960/91/0430-1075$02.50/0 © 1991 American Chemical Society

Amelogenins are solubilized by extraction with 4 M Gu·HCl alone, while 4 M Gu·HCl containing 0.5 M EDTA to dissolve the mineral phase is required to solubilize the enamelins (Termine et al., 1980). The majority of the protein in the amelogenin fraction is found in the molecular weight range of 10K-30K, although small amounts of higher molecular weight components (~40K) have been observed (Belcourt et al., 1983; Christner et al., 1985). In contrast, the majority of the protein in the enamelin fraction has an apparent molecular weight of approximately 70K (Termine et al., 1980; Christner et al., 1985). These two classes of proteins also differ appreciably in amino acid composition (Termine et al., 1980). Furthermore, monoclonal antibodies have been obtained which appear to be specific either for amelogenin or for enamelin, but do not cross-react (Christner et al., 1985; Rosenbloom et al., 1986). Taken together, these lines of evidence support the concept that amelogenins and enamelins are distinct classes of proteins.

Mouse (Snead et al., 1985) and bovine (Shimokawa et al., 1987a,b) amelogenin cDNAs have been cloned, and the deduced amino acid sequences agree well, but not precisely, with those previously obtained by extensive protein sequencing of bovine (Takagi et al., 1984) and porcine amelogenin (Fukae et al., 1983; Yamakoshi et al., 1989) and with amino-terminal sequences of amelogenin components of several species including man (Fincham et al., 1983). Recently, Southern analyses of DNA from somatic cell hybrids have localized the human amelogenin gene to both the X and Y chromosomes and the mouse amelogenin gene solely to the X chromosome (Lau et al., 1989). Analysis of bovine DNA has also indicated that amelogenin sequences are located on both the X and Y chromosomes (Nakahori & Nakagome, 1989). Southern analyses of human, mouse, and bovine genomic DNA are consistent with the presence of a single gene per haploid genome (Shimokawa et al., 1987; Lau et al., 1989). It remains to be determined whether both genes on the X and Y chromosomes are functional.

This apparent limited copy number of the amelogenin gene restricts the mechanisms whereby the tissue protein heterogeneity may be generated. While physiological processing or artifactual degradation of the primary translation product may certainly occur, another mechanism which should be considered is alternative splicing of the primary transcript. Such alternative splicing has been shown to occur in more than 50 genes (Breitbart et al., 1987) including the extracellular matrix proteins fibronectin (Ruoslahti, 1988) and elastin (Yeh et al., 1989), and there is suggestive evidence from protein sequencing data that the amelogenin transcript may be alternatively spliced (Fincham et al., 1983). Furthermore, at least four distinct amelogenin species were synthesized when hybrid-selected mRNA was translated in an in vitro system, and two major mRNA species were observed by Northern hybridization (Shimokawa et al., 1987).
In order to define further the mechanisms responsible for amelogenin heterogeneity, we have sequenced seven bovine amelogenin cDNAs and isolated and characterized a bovine amelogenin gene which appears to be actively transcribed and located on the X chromosome. However, the results also indicate that there is an actively transcribed amelogenin gene on the Y chromosome and that this gene has diverged appreciably from that on the X chromosome. In addition, the structure of the analyzed gene is consistent with a pattern of alternative splicing suggested by the protein sequencing data.

MATERIALS AND METHODS

Materials. Materials and their suppliers were as follows: HA nitrocellulose for library screening, Millipore, Bedford, MA; BA85 nitrocellulose for blots, Schleicher & Schuell, Keene, NH; [γ-32P]ATP (111 TBq/mmol), [α-35S]dATP (37 TBq/mmol), and [α-32P]TTP (111 TBq/mmol), Amersham Corp., Arlington Hts, IL; Sequenase DNA sequencing kit, U.S. Biochemicals Corp., Cleveland, OH.

Construction and Screening of an Ameloblast cDNA Library. Total RNA was extracted from frozen ameloblast-rich tissue prepared from fetal bovine teeth as previously described (Shimokawa et al., 1987a). Poly(A+) RNA was isolated by affinity chromatography using oligo(dT)-cellulose (type III, Collaborative Research) and used to synthesize cDNA by the RNase H method (Gubler & Hoffman, 1983). The cDNA was inserted into λgt11 (Vector Cloning Systems, San Diego, CA) using EcoRI linkers, and desired recombinant phage were identified by screening with affinity-purified rabbit antiamelogenin antibodies as previously described in detail (Shimokawa et al., 1987a). Positive plaques were amplified and rescreened to purity as described by Benton and Davis (1977).

Isolation and Characterization of the Amelogenin Gene. A bovine library (Cicila et al., 1985) constructed by the insertion of genomic DNA partially digested with Sau3A into the BamHI site of Charon 30 was screened with a restriction fragment containing nucleotides 1-641 of the class I cDNA (see Figure 1A). Two overlapping clones, which were subsequently shown to encompass the entire gene, were identified and isolated by successive rounds of plaque purification. These clones were characterized by restriction endonuclease mapping, and partially sequenced in order to determine the exon/intron structure of the gene and to compare the genomic sequence with that of the cDNAs.

DNA Sequencing and Synthesis of Oligonucleotides. Restriction fragments of the cDNA and genomic clones were isolated after electrophoresis on 1% SeaPlaque agarose gels and subcloned into pUC18 or pUC19. The cDNA fragments were sequenced in their entirety while the genomic inserts were partially sequenced by the Sanger dideoxynucleotide chain-termination method (Sanger et al., 1977). Both strands were sequenced to resolve any ambiguities. Oligonucleotide primers (17-26 bases) were synthesized by a modification of the phosphite method on a Milligen Model 7500 DNA synthesizer (Burlington, MA) and purified by high-pressure liquid chromatography.

Southern Hybridization. Genomic DNA was purified from male and female bovine white blood cells, digested with restriction endonucleases, size-separated by electrophoresis, and blotted onto nitrocellulose membranes. Oligonucleotides identical in sequence to 25 bases in the 3' ends of class I (5'TAAAAGATCAGAAAATGAGAAGAGA3') and class II (5'TAAAAGAAAATGAGAGAACAAAACC3') cDNAs were synthesized, radiolabeled with [γ-32P]ATP, and used as probes under relatively stringent conditions. Hybridization was in 6X standard saline citrate/5X Denhardt's solution/0.1% sodium dodecyl sulfate/50 μg/mL salmon sperm DNA at 55 °C. The final wash at 60 °C contained 6X standard saline citrate/0.1% sodium dodecyl sulfate. The blots were exposed to X-ray film for 5-7 days at -70 °C.
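As a rough cross-check on the stringency described above, the two 25-base probes can be compared directly. The sketch below is not part of the original methods. It counts base composition, applies the Wallace rule of thumb (about 2 °C per A/T and 4 °C per G/C, usually quoted for short oligonucleotides at roughly 1 M salt such as 6X standard saline citrate), and counts position-by-position differences between the probes. The function names and the choice of this rule are assumptions made for illustration.

```python
# Probe sequences as given in the text (5' to 3').
CLASS_I_PROBE = "TAAAAGATCAGAAAATGAGAAGAGA"
CLASS_II_PROBE = "TAAAAGAAAATGAGAGAACAAAACC"
FINAL_WASH_C = 60  # final wash temperature stated in the text


def gc_count(seq: str) -> int:
    """Number of G and C bases in the sequence."""
    return sum(base in "GC" for base in seq.upper())


def wallace_tm(seq: str) -> int:
    """Wallace rule of thumb (an assumption here): 2 degC per A/T, 4 degC per G/C."""
    gc = gc_count(seq)
    at = len(seq) - gc
    return 2 * at + 4 * gc


def ungapped_mismatches(a: str, b: str) -> int:
    """Position-by-position mismatches between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    for name, probe in (("class I", CLASS_I_PROBE), ("class II", CLASS_II_PROBE)):
        tm = wallace_tm(probe)
        print(f"{name}: {len(probe)} nt, GC={gc_count(probe)}, "
              f"estimated Tm ~{tm} degC, wash is {tm - FINAL_WASH_C} degC below")
    print("ungapped mismatches:",
          ungapped_mismatches(CLASS_I_PROBE, CLASS_II_PROBE))
```

Under this rule both probes give the same estimate of 64 °C, so the 60 °C final wash sits about 4 °C below it, and 11 of the 25 positions differ in a simple ungapped comparison. Because the rule was formulated for shorter oligonucleotides, these figures are only indicative.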
Following autoradiography, the blot was incubated in H2O at 70 °C for 30 min to elute the probe, prior to rehybridization.

RESULTS AND DISCUSSION

cDNA Analyses. In order to determine whether any heterogeneity existed in the bovine amelogenin cDNAs, seven clones which differed on the basis of initial size determination were selected for analysis. We wished to sample both full-length (~850 bp) and shorter cDNAs (~350-450 bp) which might represent the products of alternative splicing. On the basis of DNA sequence determination, the cDNAs fell into two classes. The first class (Figure 1A), containing five of the clones ranging in size from 340 to 825 bp, corresponded to the bovine sequence previously reported (Shimokawa et al., 1987a,b) except for several discrepancies. A group of these was located at nucleotide positions 13, 35, 36, 37, and 50. The effect of these differences was to change the starting sequence of the signal peptide from MPKWGP to MGT. Nucleotide 488 was found to be A instead of C, resulting in an amino acid change of Leu to Met. Finally, there were several changes in the 3'-untranslated region which resulted in the apparent loss of two consensus polyadenylation signals, leaving only one remaining. These differences may be due to allelic variation, cloning artifacts, or previous sequencing errors. It is unlikely that sequencing errors occurred in the present work since both strands of the clones were sequenced and the consensus cDNA sequence agreed with the presumptive exon sequences found in the cloned gene (see below). The deduced amino acid sequence agreed well with that determined by protein sequencing of bovine amelogenin except for several previously
noted discrepancies (Shimokawa et al., 1987; Takagi et al., 1984).

FIGURE 1: Consensus sequences of the two classes of cDNA. (A) Class I contained five clones whose sizes are diagrammed. (B) Class II contained two clones. Partial restriction endonuclease maps are illustrated (K, KpnI; N, NcoI; P, PstI; H, HindIII). In the class I sequence, the demarcation into exons is indicated (v). The single consensus polyadenylation signal sequence, aataaa, is boxed. The 13 individual specific nucleotide differences in the coding region between the sequences of the 2 classes and any resultant amino acid changes are designated in the class II sequence. The nucleotide and amino acid numbering in the class II sequence as well as the position of the apparent deletion have been chosen to maximize homology between the class I and class II sequences. There was comparatively little homology between the two sequences in the 3'-untranslated regions. The 25-base oligonucleotides used for the Southern blotting experiments illustrated in Figure 4 are underlined.

The second class of cDNAs was represented by two clones 324 and 384 bp in length (Figure 1B), and although they undoubtedly do not include the entire cDNA sequence, they nevertheless exhibit differences from the first class in several important respects: (1) There were 13 individual nucleotide differences in the coding region as indicated in Figure 1B. Significantly, four of these changes resulted in amino acid differences between the two classes, while the other nine were in the third or wobble position of the codon and conserved the amino acid. (2) There was a deletion in the second class of 63 bases, corresponding to the deletion of 21 amino acids in the first class (within residues 123-147; the precise position cannot be unequivocally defined because of the repetitive nature of the sequence). (3) The nucleotide sequences of the 3'-untranslated segment of the two classes
differed significantly. As discussed further below, these significant differences are too extensive to be accounted for by the existence of a single primary transcript, and they imply that two different alleles are being transcribed and that the transcripts were represented in the analyzed cDNAs.

FIGURE 2: Diagram of the bovine amelogenin gene. The entire gene is contained in two overlapping clones, as illustrated. Clone BAM1 extends 10.3 kbp 5' of the region illustrated, and clone BAM2 extends 4.7 kbp 3' of the region. Restriction fragments, cloned into the plasmid pUC19, were sequenced as indicated (-) by the dideoxy method using either universal primers or amelogenin-specific oligonucleotides (Sanger et al., 1977). Map distances have been estimated by the size of restriction fragments, and the positions of exons 1 and 3 are only approximate since the contiguous introns have not been sequenced in their entirety.

FIGURE 3: Intron sequences flanking exons of the bovine amelogenin gene. All borders conform to the consensus sequence AG-exon-GT for eukaryotic genes (Breathnach & Chambon, 1981).

Structure of the Gene. In order to clarify the origin of the cDNAs, a bovine genomic library contained in λ phage Charon 30 was screened with a 641 bp restriction fragment isolated from clone cBAM1 (class I) described above. Two overlapping genomic clones, encompassing approximately 25 kbp, were identified and characterized by restriction endonuclease mapping and DNA sequencing (Figure 2). These analyses demonstrated that the entire consensus sequence for the class I cDNA was contained within the cloned genomic DNA, and permitted the definition of the exon/intron structure of that portion of the gene encoding the cDNA. This portion of the gene is composed of six exons, and exon/intron borders conform to the consensus sequences deduced from analysis of a large number of eukaryotic genes (Figure 3) (Breathnach & Chambon, 1981). There was precise agreement between the sequence contained in the designated exons and the corresponding cDNA. The exon designated 1 consists solely of a 5'-untranslated segment, while exon 2 includes the remainder of the 5'-untranslated sequence, the entire signal sequence, and the first two amino acids of the mature protein. Exons 3 and 4 are relatively small exons, collectively encoding 31 amino acid residues. It should be noted that exon 4 terminates with amino acid 33 (numbered from the first residue of the protein minus the signal sequence), the precise amino acid beyond which the sequences of two amelogenin components have been shown to diverge (see below). The bulk of the amelogenin protein is encoded in exon 5 which contains 489 nucleotides encoding amino acid residues 34-196.
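The exon bookkeeping quoted above can be checked with simple arithmetic. The following sketch is an illustration rather than anything taken from the original analysis. It assumes that the stated exon boundaries fall between codons, and the constant and function names are hypothetical.

```python
# Figures quoted in the text for the class I gene.
EXON2_MATURE_RESIDUES = 2      # first two residues of the mature protein
EXONS_3_AND_4_RESIDUES = 31    # exons 3 and 4 together
EXON4_LAST_RESIDUE = 33        # exon 4 terminates with amino acid 33
EXON5_NUCLEOTIDES = 489
EXON5_RESIDUES = (34, 196)     # first and last residue encoded by exon 5


def span(first: int, last: int) -> int:
    """Inclusive length of a numbered range."""
    return last - first + 1


def whole_codons(nucleotides: int) -> int:
    """Number of codons in a stretch, requiring a multiple of three."""
    codons, remainder = divmod(nucleotides, 3)
    if remainder:
        raise ValueError("length is not a whole number of codons")
    return codons


def check_exon_bookkeeping() -> bool:
    """True if the residue counts and nucleotide counts quoted in the text agree."""
    boundary_ok = (EXON2_MATURE_RESIDUES + EXONS_3_AND_4_RESIDUES
                   == EXON4_LAST_RESIDUE)
    exon5_ok = whole_codons(EXON5_NUCLEOTIDES) == span(*EXON5_RESIDUES)
    return boundary_ok and exon5_ok


if __name__ == "__main__":
    print("exon 5 codons:", whole_codons(EXON5_NUCLEOTIDES))
    print("exon 5 residues:", span(*EXON5_RESIDUES))
    print("consistent:", check_exon_bookkeeping())
```

With the quoted values, the 489 nucleotides of exon 5 correspond to 163 codons, matching residues 34-196, and the two mature-protein residues from exon 2 plus the 31 residues from exons 3 and 4 give the stated boundary at residue 33.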
Exon 6 encodes the final amino acid residue and the 3'-untranslated segment. One potential, canonical polyadenylation signal is located within this region. Because of the differences in sequence, the class II cDNA cannot result from the transcription of this gene. In a preliminary report, exons very similar to bovine exons 3-5 were found in a partially characterized human amelogenin genomic clone (Shimokawa et al., 1989).

We have designated the 5'-untranslated leader sequence as being included in exon 1, and it should be noted that the cloned DNA includes approximately 11.5 kbp upstream of this 5'-untranslated sequence. Preliminary analyses (data not shown) of several hundred base pairs adjacent to this untranslated segment indicate that this segment includes features characteristic of a promoter, including the presence of an appropriately placed TATAA box. However, further analyses, including nuclease protection, primer extension, and functional assays, are necessary to demonstrate that this region functions as a promoter, and that another intervening segment is not present.

Possibility of Alternative Splicing. The above results offer a possible explanation for the existence of a leucine-rich amelogenin peptide (LRAP), isolated from developing bovine enamel and shown to have the sequence MPLPPHPGHPGYINFSYEVLTPLKWYQSMIRHPPLPPMLPDLPLEA (Fincham et al., 1983). This sequence begins with the first 33 residues encoded in the consensus sequence for the class I cDNAs and encoded in exons 2-4 of the characterized gene. However, after residue 33, this sequence diverges from that immediately following in the cDNA or found in a large, sequenced amelogenin protein component (Takagi et al., 1984). Significantly, the following 13 residues are identical with residues 172-184 encoded in nucleotides 602-640 of the class I cDNA. The results suggest that the CAG codon for glutamine-171 acts as an alternative splice acceptor site. The DNA sequence immediately 5' of codon 171 is pyrimidine-rich, consistent with that of an alternative acceptor site (Ruskin et al., 1984). Although no sequence conforming exactly to the branch-point consensus sequence, PyNPyTPuAPy (Ruskin et al., 1984), could be identified near the pyrimidine-rich sequence, one potential branch point, deviating at one position from the consensus sequence, was identified (i.e., nucleotides 582-588). Since there are a number of glutamine CAG codons in the vicinity with similar surrounding features, it is possible that they are also used as acceptor sites. However, further experiments, such as the isolation of appropriate cDNAs, are required to prove that any of these sites, including the CAG of codon 171, do in fact act as acceptors. Consideration of the sequence of the class II cDNAs also suggests that the deletion which appears to have occurred relative to the class I sequence might also be explained by alternative splicing. However, for this to be possible, a second gene which contains a functional splice donor site within the present exon 5 would be necessary.

Southern Analyses. The observation of two distinct classes of cDNA sequences is consistent with the possibility that amelogenin sequences are located on both the X and Y bovine chromosomes (Nakahori & Nakagome, 1989).
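Looking back at the LRAP argument in the alternative-splicing discussion above, the quoted numbers can be checked against the peptide sequence itself. The sketch below is an illustration only and was not part of the original analysis. It splits the peptide at residue 33, compares the length of the remaining tail with the residue and nucleotide ranges quoted in the text, and scores a 7-nucleotide candidate against the branch-point consensus PyNPyTPuAPy. The helper names are hypothetical, and the example heptamers are not taken from the bovine gene.

```python
# Peptide sequence and coordinates quoted in the text.
LRAP = "MPLPPHPGHPGYINFSYEVLTPLKWYQSMIRHPPLPPMLPDLPLEA"
SHARED_RESIDUES = 33           # residues shared with the class I cDNA (exons 2-4)
TAIL_RESIDUES = (172, 184)     # class I residues matching the LRAP tail
TAIL_NUCLEOTIDES = (602, 640)  # class I cDNA nucleotides encoding that tail

# Branch-point consensus PyNPyTPuAPy in IUPAC codes (Y = C/T, R = A/G, N = any).
BRANCH_CONSENSUS = "YNYTRAY"
IUPAC = {"Y": "CT", "R": "AG", "N": "ACGT", "A": "A", "T": "T"}


def span(first: int, last: int) -> int:
    """Inclusive length of a numbered range."""
    return last - first + 1


def split_lrap(peptide: str, shared: int) -> tuple:
    """Split the peptide into the shared N-terminal part and the remaining tail."""
    return peptide[:shared], peptide[shared:]


def branch_point_mismatches(heptamer: str) -> int:
    """Count positions at which a 7-mer departs from the branch-point consensus."""
    heptamer = heptamer.upper().replace("U", "T")
    if len(heptamer) != len(BRANCH_CONSENSUS):
        raise ValueError("expected a 7-nucleotide sequence")
    return sum(base not in IUPAC[code]
               for base, code in zip(heptamer, BRANCH_CONSENSUS))


if __name__ == "__main__":
    head, tail = split_lrap(LRAP, SHARED_RESIDUES)
    print("peptide, head, tail lengths:", len(LRAP), len(head), len(tail))
    print("tail residues:", span(*TAIL_RESIDUES))
    print("tail codons from nucleotides:", span(*TAIL_NUCLEOTIDES) // 3)
    # Illustrative heptamers only (hypothetical, not the bovine sequence).
    for candidate in ("TACTAAC", "TACTAAG"):
        print(candidate, branch_point_mismatches(candidate))
```

The 46-residue peptide splits into 33 shared residues plus a 13-residue tail, and both quoted ranges (residues 172-184 and nucleotides 602-640) span 13 codons. A candidate that "deviates at one position", as described for nucleotides 582-588, would score 1 in the mismatch function.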
In order to substantiate the existence of two such genes which may have diverged, oligonucleotides specific for the 3'-untranslated segments of each of the two cDNA classes were synthesized and used to probe genomic blots containing male and female bovine DNA. Relatively stringent hybridization and washing conditions were employed to eliminate potential cross-hybridization (Figure 4). The results of these analyses clearly demonstrated that the DNA from the steer hybridized to both probes while the DNA from the cow
hybridized only to the class I probe. The sizes of the restriction fragments observed with the class I probe are as predicted by the cloned genomic DNA, while the class II probe hybridized to a different size band in EcoRI-digested male DNA. Therefore, these results tentatively localize the cloned gene described in this paper, which corresponds to the class I cDNA, to the X chromosome and the class II cDNA to a transcribed gene located on the Y chromosome. It appears, therefore, that the genes on both the X and Y chromosomes are actively transcribed.

FIGURE 4: Southern blot analysis of male and female bovine genomic DNA, using oligonucleotide probes specific for class I or class II cDNA. (A) Blots were hybridized first with the oligonucleotide 5'TAAAAGAAAATGAGAGAACAAAACC3' contained in the class II cDNA 3'-untranslated region. Lane 1, male DNA restricted with EcoRI; lane 2, male DNA restricted with HindIII; lane 3, blank; lane 4, female DNA restricted with EcoRI; lane 5, female DNA restricted with HindIII. (B) After being washed and autoradiographed, the blots were stripped and then hybridized to the oligonucleotide 5'TAAAAGATCAGAAAATGAGAAGAGA3' contained in the 3'-untranslated region of the class I cDNA. Lane designations as in (A). The slight difference in the mobilities of the fragments in lanes 2 and 5 is probably due to a variation in salt content in the samples, since no difference was seen in other similar blots. Size markers (kbp): 23.1, 9.4, 6.6, 4.4, 2.3, 2.0, and 0.6.

It will be of interest to confirm that the presently described gene is in fact on the X chromosome and to isolate and characterize the gene on the Y chromosome in order to compare its organization. It will also be of interest to determine whether amelogenin protein components having an amino acid sequence corresponding to the class II transcripts exist and whether there is developmental control of the alternative splicing and/or differential expression of the X and Y genes. Since amelogenin sequences have been localized to the human X and Y chromosomes (Lau et al., 1989), a similar situation may exist in man, and this possibility should be explored in the context of hereditary disease affecting enamel, such as amelogenesis imperfecta (Witkop & Rao, 1971).

ACKNOWLEDGMENTS

We thank Dr. Dan Capon of Genentech Corp. for the generous gift of the bovine genomic library, Ms. Mary Dixon for her excellent technical assistance, and Dr. Edward Lally for his helpful advice.

REFERENCES

Belcourt, A. B., Fincham, A. G., & Termine, J. D. (1983) Calcif. Tissue Int. 35, 111.

Benton, W. D., & Davis, R. W. (1977) Science 196, 180.
Breathnach, R., & Chambon, P. (1981) Annu. Rev. Biochem. 50, 349.
Breitbart, R. E., Andreadis, A., & Nadal-Ginard, B. (1987) Annu. Rev. Biochem. 56, 467.
Burgess, R. C., & McClaren, C. M. (1965) in Tooth enamel, its composition, properties, and fundamental structure (Stack, M. V., & Fearnhead, R. W., Eds.) Vol. I, p 74, Wright, Bristol, U.K.
Christner, P. J., Lally, E. T., Miller, R. D., Leontzwich, P., Rosenbloom, J., & Herold, R. C. (1985) Arch. Oral Biol. 30, 849.
Cicila, G., May, M., Ornstein-Goldstein, N., Indik, Z., Morrow, S., Yeh, H. S., Rosenbloom, J. C., Boyd, C., Rosenbloom, J., & Yoon, K. (1985) Biochemistry 24, 3075.
Fincham, A. G., Belcourt, A. B., Termine, J. D., Butler, W. T., & Cothran, W. C. (1983) Biochem. J. 211, 149.
Frank, R. M. (1979) J. Dent. Res. 58, Special Issue B, 684.
Fukae, M., Tanabe, T., & Shimizu, M. (1983) Jpn. J. Oral Biol. 25 (Suppl.), 29.
Gubler, U., & Hoffman, B. J. (1983) Gene 25, 263.
Lau, E. C., Mohandas, T. D., Shapiro, L. J., Slavkin, H. C., & Snead, M. L. (1989) Genomics 4, 162.
Nakahori, Y., & Nakagome, Y. (1989) Cytogenet. Cell Genet. 51, 1050.
Robinson, C., Kirkham, J., Briggs, H. D., & Atkinson, P. J. (1982) J. Dent. Res. 61, 1490.
Rosenbloom, J., Lally, E. T., Dixon, M., Spencer, A., & Herold, R. (1986) Calcif. Tissue Int. 39, 412.
Ruoslahti, E. (1988) Annu. Rev. Biochem. 57, 375.
Ruskin, B., Krainer, A. R., Maniatis, T., & Green, M. R. (1984) Cell 38, 317.
Sanger, F., Nicklen, S., & Coulson, A. R. (1977) Proc. Natl. Acad. Sci. U.S.A. 74, 5463.
Sasaki, S., Shimokawa, H., & Tanaka, K. (1982) J. Dent. Res. 61, 1479.
Shimokawa, H., Sobel, M. E., Sasaki, M., Termine, J. D., & Young, M. F. (1987a) J. Biol. Chem. 262, 4042.
Shimokawa, H., Ogata, Y., Sasaki, S., Sobel, M. E., Mcquillan, C. E., Termine, J. D., & Young, M. F. (1987b) Adv. Dent. Res. 1, 293.
Shimokawa, H., Tamura, M., Ibaraki, K., Ogata, Y., & Sasaki, S. (1989) in Tooth Enamel V (Fearnhead, R. W., Ed.) p 301, Florence Publishers, Yokohama, Japan.
Snead, M. L., Lau, E. C., Zeichner-David, M., & Fincham, A. G. (1985) Biochem. Biophys. Res. Commun. 129, 812.
Takagi, T., Suzuki, M., Baba, T., Minegishi, K., & Sasaki, S. (1984) Biochem. Biophys. Res. Commun. 121, 592.
Termine, J. D., Belcourt, A. B., Christner, P. J., Conn, K. M., & Nylen, M. V. (1980) J. Biol. Chem. 255, 9760.
Witkop, C. J., & Rao, S. (1971) Birth Defects 7, 153.
Yamakoshi, Y., Tanabe, T., Fukae, M., & Shimizu, M. (1989) in Tooth Enamel V (Fearnhead, R. W., Ed.) p 314, Florence Publishers, Yokohama, Japan.
Yeh, H., Anderson, N., Ornstein-Goldstein, N., Bashir, M. M., Rosenbloom, J. C., Abrams, W., Indik, Z., Yoon, K., Parks, W., Mecham, R., & Rosenbloom, J. (1989) Biochemistry 28, 2365.