
Cell, Vol. 21, 179-187, August 1980, Copyright © 1980 by MIT

Tissue-Specific Expression of Mouse α-Amylase Genes: Nucleotide Sequence of Isoenzyme mRNAs from Pancreas and Salivary Gland

Otto Hagenbüchle, Raymonde Bovey and Richard A. Young
Swiss Institute for Experimental Cancer Research, 1066 Epalinges, Lausanne, Switzerland

Summary

We have determined the nucleotide sequence of two different mouse α-amylase mRNAs, one found in the pancreas and the other in the salivary gland. The 1577 and 1659 nucleotide mRNAs from pancreas and salivary gland, respectively, are the major α-amylase species which accumulate in each tissue. Differences in mRNA length are primarily in the 5' noncoding regions. Comparable portions of the mRNAs are 89% homologous. The mRNA sequences predict α-amylase precursor proteins of 508 and 511 amino acid residues, accounting for nearly the entire coding capacity of the mRNAs; differences in protein length occur as a result of a nine nucleotide segment present within the translated portion of salivary gland, but not pancreas, mRNA. The lengths and amino acid compositions of the predicted proteins concur with those determined empirically by others. These proteins differ by 12% in amino acid sequence, explaining previously observed differences in net charge and antigenic properties. Finally, translation of the salivary gland α-amylase mRNA is not initiated at the AUG codon nearest the 5' terminus since that codon is almost immediately followed by the termination triplet UAA. This observation may have implications for the mechanism of translation initiation in eucaryotes.

Introduction

α-Amylase isoenzymes are found in several rodent tissues, where they are responsible for the hydrolysis of α-1,4 glycosidic bonds. As isolated from rat pancreas and salivary gland, the 55,000 dalton α-amylase polypeptides differ tissue-specifically with respect to antigenic properties and net charge (Sick and Nielsen, 1964; Hammerton and Messer, 1971; Sanders and Rutter, 1972; Kaplan, Chapman and Ruddle, 1973; Takeuchi, Matsushima and Sugimura, 1975). In the mouse, α-amylase mRNAs accumulate to different levels in pancreas and salivary glands (Schibler et al., 1980), indicating that α-amylase biosynthesis is differentially regulated in the two tissues.

Recently, our colleagues reported the isolation of cDNA clones made against α-amylase mRNA from mouse pancreas and salivary gland (Schibler et al., 1980). They found that pancreas cDNA-mRNA hybrids exhibit a different melting temperature than the heterologous hybrid of pancreas cDNA with salivary gland mRNA, suggesting that different α-amylase genes are expressed in these tissues. To investigate further the tissue-specific expression of α-amylase genes, our initial step was to determine the complete primary structure of their mRNAs from the pancreas and salivary gland. In this paper we present the first comparison of isoenzyme mRNA and protein sequences.

Results and Discussion

Sequence Analysis of α-Amylase mRNAs

To determine the sequence of α-amylase mRNAs from mouse pancreas and salivary gland tissues, we used the following strategy. First, cloned α-amylase cDNA was sequenced using the Maxam and Gilbert (1977) technique. The 5' terminal mRNA residues were then elucidated by sequencing cDNA generated by reverse transcriptase-catalyzed primer extension, and by direct RNA sequencing of chemically decapped mRNA.

Recombinant DNA clones pMPa21 and pMSa104 contain DNA copies of α-amylase mRNA from mouse pancreas and salivary gland, respectively, inserted into the Pst I site of pBR322 (Schibler et al., 1980). Restriction endonuclease cleavage sites in the cDNA inserts of pMPa21 and pMSa104 were mapped for the enzymes Alu I, Eco RI, Hae III, Hha I, Hinf I and Pst I (Figure 1). DNA restriction fragments were sequenced according to the plan outlined in Figure 1, generating complete data for each of the two strands in both cDNA clones.

To examine the 5' terminal portions of the mRNAs, one approach was to synthesize a DNA copy of the 5' end using a 5' terminally labeled cDNA primer. The origin of the 67 nucleotide primer we isolated is shown in Figure 1. After hybridization of primer and mRNA from pancreas or salivary gland, cDNA was synthesized using AMV reverse transcriptase, and the products were run on a polyacrylamide urea gel (Figure 2A). In addition to a band of primer DNA, a single band of relatively low mobility appeared in the gel (lengthy exposure of this gel revealed no additional bands of lower mobility). The 119 and 195 nucleotide cDNAs generated against mRNA from pancreas and salivary gland, respectively, were subjected to sequence analysis (Figure 2B). A portion of the DNA sequence deduced in this manner overlaps that found in the cDNA insert of the recombinant plasmids (Figure 5: residues 96-136 for pancreas, 68-85 for salivary gland). Although the sequence of primed cDNA made from salivary gland α-amylase mRNA was unambiguous using the Maxam and Gilbert technique, that obtained from pancreas mRNA was not entirely interpretable since bands appeared in every lane in some positions of the sequencing gel in several independent experiments (Figure 2B). Experiments described below indicated that pancreas α-amylase mRNA heterogeneity was not responsible for the apparent ambiguity in sequence; rather, we attribute the ambiguity to technical problems.

Figure 1. Restriction Map of Cloned cDNA and Sequencing Strategy

pMPa21 (P: pancreas) and pMSa104 (S: salivary) inserts were mapped for restriction endonuclease cleavage sites as described by Smith and Birnstiel (1976). A suitable fragment was isolated by labeling the Hind III cleavage site in the pBR322 vector of the recombinant plasmid DNA, digesting with Bgl I and separating the two DNA products on a 1.4% agarose gel. The arrows indicate the origin, direction and extent of sequence determination. The thick arrow designates the 67 nucleotide primer used for reverse transcriptase-catalyzed cDNA synthesis. All endonuclease cleavage sites were confirmed by the sequence we elucidated. The open boxes represent the GC homopolymer tracts introduced during cloning. Distances are in kilobases (kb) from the 5' end of salivary gland α-amylase mRNA.

We also examined the 5' terminal nucleotides by labeling the mRNAs directly. Since most eucaryotic mRNAs contain a 5' terminal cap of the structure m7GpppXmpY(m)pZ, they cannot be labeled directly using T4 polynucleotide kinase. We assumed that this was also the case for the α-amylase mRNAs. After chemical removal of the cap, the mRNA was 5' end-labeled and purified on hydroxymethylmercury agarose gels and oligo(dT) columns (see Experimental Procedures). The 5' terminal nucleotide of pancreas α-amylase mRNA was determined to be pGm by two-dimensional chromatography of nuclease P1 limit digestion products (Figure 3A). Partial P1 digestion of the pancreas α-amylase mRNA and subsequent analysis of the "wandering spot" patterns revealed that the sequence of the initial four nucleotides is pGmACA (Figure 3B). The 25 5' terminal nucleotides of this mRNA were also determined by rapid RNA sequencing (Figure 4); the sequences we elucidated with this technique indicate that the ambiguity encountered in Maxam and Gilbert sequence analysis of the primer-extended pancreatic α-amylase cDNA is not due to sequence heterogeneity. Furthermore, sequence analysis of cloned pancreatic type α-amylase genomic DNA confirmed the pancreas mRNA 5' end sequence (R. A. Young and O. Hagenbüchle, unpublished data). Since the initial 30 nucleotides of pancreas α-amylase mRNA are potentially involved in hairpin loop secondary structure, the formation of such a structure might cause anomalous mobility in the sequencing gels.

Due to its lower abundance relative to the pancreatic species (2% versus 20% of polyadenylated RNA; Schibler et al., 1980), size-fractionated salivary gland α-amylase mRNA was insufficiently pure for direct sequence analysis. Therefore, after purification on hydroxymethylmercury agarose gels and oligo(dT) columns, we enriched for α-amylase mRNA sequences by digesting mRNA-DNA hybrids with RNAase H, using an internal Eco RI restriction fragment generated from pMSa104 DNA (residues 371-902, Figure 5). The digestion products were subjected to electrophoresis on a methylmercury-hydroxide agarose gel to reveal two bands, one composed of full-length α-amylase-sized RNA, the other containing a smaller species of approximately 350 nucleotides. Since the mRNA sequence determined by primer extension predicts an eight nucleotide 5' RNAase T1 product, the 5' terminus was purified further by digesting the 350 nucleotide RNA with RNAase T1 and subjecting the products to electrophoresis on DEAE paper at pH 3.5 (Barrell, 1971). More than 90% of the labeled material remained at the origin, as expected for an octameric oligonucleotide, and was analyzed as described above for pancreatic α-amylase mRNA. Figures 3C and 3D show that the 5' terminal nucleotide of salivary gland α-amylase mRNA is pAm and that the initial five nucleotides are pAmCUCA. These data demonstrate that primed cDNA synthesis extended to the 5' terminal nucleotide in both α-amylase mRNAs. The entire sequence of mouse salivary gland and pancreatic α-amylase mRNAs is presented in Figure 5.
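The prediction of an eight nucleotide 5' RNAase T1 product follows from the specificity of RNAase T1, which cleaves on the 3' side of G residues. A minimal Python sketch of that reasoning is given here for illustration. The function name is hypothetical, and only the first five nucleotides of the test sequence (ACUCA) are taken from the text; the remaining nucleotides are placeholders chosen so that the first G falls at position 8.

```python
def rnase_t1_fragments(rna: str) -> list[str]:
    """Split an RNA string after every G, mimicking a complete RNAase T1 digest."""
    fragments = []
    current = ""
    for base in rna.upper():
        current += base
        if base == "G":
            fragments.append(current)
            current = ""
    if current:  # trailing piece without a terminal G
        fragments.append(current)
    return fragments


# ACUCA is from the text; "ACG" + "UAUCAG" is a hypothetical placeholder
# chosen only so that the first G occurs at position 8.
example_5prime = "ACUCA" + "ACG" + "UAUCAG"
five_prime_product = rnase_t1_fragments(example_5prime)[0]
print(five_prime_product, len(five_prime_product))
```

Only the first fragment carries the 5' end label, which is why a single labeled octamer is expected after digestion.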

General Features of the α-Amylase mRNAs

Using a combination of cDNA and direct mRNA sequence analysis, we have determined the 1577 and 1659 nucleotide residues of α-amylase mRNA from mouse pancreas and salivary gland, respectively (Figure 5). These two mRNAs consist of 43% G and C residues, exhibit 11% sequence heterogeneity and show the greatest length differences in their 5' termini. Translation of the α-amylase precursor can occur from only one coding frame in both mRNAs (residues 94-1626, Figure 5); this frame predicts a precursor protein whose characteristics mirror those determined for α-amylases produced in vitro and in vivo (see below). To facilitate a more detailed discussion of various features of these mRNAs and their translation products, we shall consider the 5' noncoding, coding and 3' noncoding regions of the mRNAs separately.

The 5' Noncoding Regions

While the length of these regions is very different in the two mRNAs we studied (17 and 93 nucleotides in pancreas and salivary gland, respectively), a considerable degree of sequence homogeneity (82%) exists between comparable segments of the 5' noncoding region (Figure 5). Examination of the available 5' nontranslated sequences of other eucaryotic cellular mRNAs reveals strict conservation of only the cap moiety and the AUG codon (Kozak, 1978). The degree of sequence conservation which occurs in this region in the two α-amylase mRNAs probably reflects a common evolutionary origin; however, the lack of such conservation in the 3' noncoding regions (discussed below) suggests that the 5' noncoding α-amylase mRNA sequences are homologous for some still unknown functional reason. Since sequence complementarity to the 3' end of 18S rRNA (Hagenbüchle et al., 1978) is not compelling, the 5' noncoding nucleotides are apparently not conserved as a result of pressure to maintain a strong base pairing interaction with that rRNA.

Figure 2. Synthesis and Sequence Analysis of cDNA Made against α-Amylase mRNA 5' Termini

(A) cDNA synthesized in the presence of pancreas (P) or salivary (S) polyadenylated RNA was subjected to electrophoresis on 8% polyacrylamide-7 M urea gels (see Experimental Procedures). The sizes of the primer (67) and the cDNAs (119, 195) are indicated in nucleotides. The origin of the primer fragment is described in the legend to Figure 1. The electrophoretic origin is marked O. (B) Maxam and Gilbert sequence analysis of cDNAs made against pancreas (P) and salivary (S) α-amylase mRNA (shown in A). X indicates the last one or two 3' terminal cDNA nucleotides whose elucidation was not possible by this technique. The four nucleotides in parentheses could not be unambiguously determined from this gel. The first residues we determined are numbered according to their position in Figure 5.

Figure 3. 5' End Analysis of Terminally Labeled α-Amylase mRNAs

Determination of the 5' terminal nucleotide in pancreas (A) and salivary gland (C) α-amylase mRNAs: total nuclease P1 digestion products were separated two-dimensionally with marker nucleotides as described by Marcu et al. (1978). (1) and (2) indicate the first and second dimension, respectively. "Wandering spot" analysis of 5' end-labeled pancreas (B) and salivary gland (D) α-amylase mRNAs: partial digestion products were generated in the presence of nuclease P1. First dimension (1): cellulose acetate electrophoresis in pyridinium-acetate buffer (pH 3.5), 7 M urea. Second dimension (2): homochromatography on PEI-cellulose thin layers using "homomix" III (Jay et al., 1974) for pancreas or "homomix" C (Platt and Yanofsky, 1975) for salivary gland.

Figure 4. Rapid RNA Sequence Analysis of Pancreas α-Amylase mRNA

Rapid RNA sequencing was performed according to the method of Donis-Keller et al. (1977). (OH) Partial alkaline hydrolysis; (T1) RNAase T1; (U2) RNAase U2 digestion. The first residue determined is numbered according to its position in Figure 5. (Py) indicates a pyrimidine nucleotide; (X) and (B) denote the position of xylene cyanol FF and bromophenol blue, respectively.

A particularly interesting feature of the 5' noncoding region of salivary gland α-amylase mRNA is that the AUG codon closest to the 5' terminus (residues 46-48, Figure 5) cannot be used to initiate translation of the α-amylase precursor protein. This AUG is followed by a lysine codon, then the termination triplet UAA. Fourteen codons further down the mRNA lies another AUG codon, in phase. Because both predicted proteins terminate at the same position in the RNAs (residue 1627, Figure 5) and the in vitro translation products of both pancreas and salivary gland mRNAs co-migrate with 57,500 dalton marker proteins on 10% SDS-polyacrylamide gels (M. Tosi and L. Fabiani, personal communication), we conclude that synthesis of the α-amylase precursor polypeptide begins at the second AUG codon relative to the 5' terminus of salivary gland α-amylase mRNA (Figure 5, residues 104-106). We could not determine whether the first AUG codon in this mRNA initiates translation of the dipeptide L-methionyl-L-lysine in an in vitro translation system (Paterson, Roberts and Kuff, 1977) because this dipeptide, added to the cocktail as a marker, was completely degraded after a short incubation (data not shown). Since all other eucaryotic cellular mRNAs for which complete terminal sequence information is available initiate translation at the AUG codon located nearest the 5' end (Kozak, 1978), the mouse salivary gland α-amylase mRNA is an apparent exception. These data suggest that the eucaryotic ribosome can initiate protein synthesis at an AUG codon translationally downstream from the AUG closest to the 5' terminus, at least when the two are separated by a termination triplet.
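The initiation argument above amounts to asking how long an open reading frame follows each AUG. The Python sketch below illustrates that check on a toy sequence. It is not the salivary gland mRNA sequence: the toy string, the choice of AAA as the lysine codon and the function names are all assumptions made for illustration.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def orf_from(rna: str, start: int) -> list[str]:
    """Codons read in frame from `start` up to, but excluding, the first stop codon."""
    codons = []
    for i in range(start, len(rna) - 2, 3):
        codon = rna[i:i + 3]
        if codon in STOP_CODONS:
            return codons
        codons.append(codon)
    return codons  # reached the end of the sequence without a stop codon


def aug_orfs(rna: str) -> list[tuple[int, int]]:
    """(0-based position, codon count) for every AUG, in 5'-to-3' order."""
    result = []
    pos = rna.find("AUG")
    while pos != -1:
        result.append((pos, len(orf_from(rna, pos))))
        pos = rna.find("AUG", pos + 1)
    return result


# Hypothetical toy mRNA: a 5' proximal AUG followed by one Lys codon (AAA,
# assumed) and UAA, then a second AUG opening a longer reading frame.
toy_mrna = "CC" + "AUG" + "AAA" + "UAA" + "CCC" + "AUG" + "GCU" * 5 + "UAA"
print(aug_orfs(toy_mrna))
```

In the toy case the first AUG can encode only a dipeptide, whereas the second opens the longer frame, which mirrors the situation described for the salivary gland mRNA.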

M. Goldberg and D. Hogness (personal communication) have noted that the sequence TATAAATA or some variant thereof precedes the postulated transcription initiation site (the cap site) by approximately 23 residues. Since the pancreas and salivary gland α-amylase genes may share a common ancestor, it is interesting that a slight variation of the octanucleotide sequence (AAUAAAUU) occurs in the salivary gland mRNA 20 residues preceding the position where the pancreas mRNA 5' end aligns with it (see Figure 5). If the canonical sequence has a role in transcription initiation, it must specify only a portion of the necessary information since considerable variation is apparently permitted within the octanucleotide, and such sequences occur in the 5' end region of salivary gland α-amylase mRNA and have been noted in intervening sequences (Van Ooyen et al., 1979).

Figure 5. Nucleotide Sequence of α-Amylase mRNAs from Mouse Salivary Gland and Pancreas

Differences in the pancreas α-amylase mRNA and protein are presented below the salivary gland α-amylase mRNA and protein sequences. These were deduced from cDNA sequence analysis (positions 68-1660 for salivary gland, 96-1663 for pancreas), primer extension experiments (positions 2-85 for salivary gland, 79-136 for pancreas) and direct RNA sequencing (positions 1-4 for salivary gland, 77-80 for pancreas), as described in the text. Residues are numbered with respect to the 5' terminal nucleotide of salivary gland mRNA (the pancreas mRNA sequence begins at position 77). The nine nucleotide segment absent in the pancreas species is denoted (---). The two mRNAs were assumed to be capped by m7G. (A)n denotes poly(A). The translation initiation and termination signals for α-amylase are boxed; the 5' proximal AUG in the salivary gland mRNA is underlined.

The Coding Regions

The mRNA sequence data presented here predict that 508 and 511 amino acid α-amylase precursor polypeptides are translated in the mouse pancreas and salivary glands, respectively. A segment of nine nucleotides (residues 563-571, Figure 5), present in the salivary gland but not in the pancreas mRNA, accounts for the additional three amino acids predicted for salivary gland α-amylase precursor. Several observations indicate that the predicted proteins are actually translated from the pancreas and salivary gland mRNAs.

- Only one frame is sufficiently devoid of termination triplets to encode a protein the size of α-amylase.
- The predicted proteins have lengths identical to those observed for pancreas and salivary gland α-amylase mRNA translation products in vitro (M. Tosi and L. Fabiani, personal communication).
- The amino acid composition predicted for precursor α-amylase is essentially identical to that determined empirically for mouse pancreas α-amylase (Danielsson, Marklund and Stigbrand, 1975; Table 1).
- Finally, we predict that both proteins possess a putative signal sequence, a characteristic stretch of hydrophobic amino acid residues at the N terminus found in similar form in polypeptide precursors destined to be secreted (Blobel and Dobberstein, 1975; Habener et al., 1978). The predicted signal sequences are quite similar to those determined for rat salivary gland precursor α-amylase (Figure 6; Gorecki and Zeelon, 1979).

We observe 10% nucleotide sequence heterogeneity in the coding portion of the two α-amylase mRNAs. 28, 17 and 55% of these nucleotide differences occur in the first, second and third positions of the codon, respectively. These differences account for changes in 12% of the amino acid residues in the two α-amylases. Nucleotide and amino acid changes occur rather uniformly throughout the coding region, with the exception of the portion between residues 450 and 600 (Figure 5). This region corresponds exactly to the 150 nucleotide "bubble" observed in electron micrographs of hybrids between pancreas α-amylase mRNA and the cloned cDNA of its salivary gland counterpart (Schibler et al., 1980). The concentrated variation in amino acid residues translated from the nucleotide 450-600 region (Figure 5) might influence the efficiency of the enzyme, or may simply reflect the presence of a portion of the protein which can be altered significantly without affecting the active site.
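The distribution of nucleotide differences over the three codon positions can be tallied mechanically once two coding sequences are aligned. The Python sketch below shows one way to do this, assuming gap-free, in-frame sequences of equal length; the function names and the two short test sequences are hypothetical and are not taken from Figure 5.

```python
def differences_by_codon_position(cds_a: str, cds_b: str) -> list[int]:
    """Count mismatches at codon positions 1, 2 and 3 for two aligned coding sequences."""
    if len(cds_a) != len(cds_b) or len(cds_a) % 3 != 0:
        raise ValueError("sequences must be aligned, gap-free and a multiple of 3 long")
    counts = [0, 0, 0]
    for index, (a, b) in enumerate(zip(cds_a, cds_b)):
        if a != b:
            counts[index % 3] += 1  # index % 3 is the position within the codon
    return counts


def as_percentages(counts: list[int]) -> list[int]:
    """Express per-position mismatch counts as rounded percentages of all mismatches."""
    total = sum(counts)
    if total == 0:
        return [0, 0, 0]
    return [round(100 * c / total) for c in counts]


# Hypothetical aligned fragments (four codons each), for illustration only.
toy_a = "AUGGCUAAAGGU"
toy_b = "AUGGCCAGAGGA"
toy_counts = differences_by_codon_position(toy_a, toy_b)
print(toy_counts, as_percentages(toy_counts))
```

A real comparison of the two mRNAs would first have to account for the nine nucleotide segment present only in the salivary gland sequence, which this sketch does not handle.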

Table 1. Amino Acid Composition of Mouse α-Amylases

              Predicted from mRNA:           Determined:
Amino Acid    Pancreas    Salivary Gland     Pancreas
Ala           33          33                 33.3
Arg           31          28                 32.8
Asn           47          41                 86.9 (Asn + Asp)
Asp           35          36
Cys           12          12                 16.1
Gln           13          16                 34.7 (Gln + Glu)
Glu           19          20
Gly           47          51                 50.1
His           13          13                 12.8
Ilu           24          27                 21.3
Leu           29          28                 25.3
Lys           23          25                 22.7
Met           11          12                 10.2
Phe           24          27                 24.6
Pro           20          21                 21.6
Ser           30          32                 29.4
Thr           21          15                 21.7
Trp           19          18                 15.8
Tyr           18          20                 18.4
Val           39          36                 39.0

The predicted number of each amino acid in the pancreatic and salivary gland α-amylases is compared to that determined by Danielsson et al. (1975) for mouse pancreatic α-amylase.

Nonrandom codon usage has been observed for most eucaryotic mRNAs sequenced to date, and the mouse α-amylase mRNAs we have examined are no exception. Table 2 shows that pancreas and salivary gland α-amylase mRNAs, like other eucaryotic mRNAs, exhibit a striking deficiency in codons containing CG doublets. This phenomenon is not restricted to translated sequences in β-globin genes (Konkel, Maizel and Leder, 1979; Van Ooyen et al., 1979) and immunoglobulin light chain genes (Bernard, Hozumi and Tonegawa, 1978), suggesting that the basis for this codon bias is not necessarily translational. Because some mRNAs are actually quite rich in CG doublets (Salser, 1977; Jones et al., 1979; Nakanishi et al., 1979), the functional basis for such codon bias remains speculative.
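The deficiency in CG-containing codons can be quantified directly from the Table 2 counts. The Python sketch below is an illustration only: the counts are as read from Table 2, the codon totals are taken as the precursor lengths given in the text (511 and 508), and the function name is hypothetical. The eight CG-containing codons make up 8 of the 61 sense codons, so perfectly uniform codon usage (an assumption used here only as a crude baseline) would give roughly 13%.

```python
# (salivary gland, pancreas) counts for the eight codons containing a CG
# doublet, as read from Table 2.
CG_CODON_COUNTS = {
    "UCG": (0, 0), "CCG": (0, 0), "ACG": (0, 0), "GCG": (0, 0),
    "CGU": (4, 6), "CGC": (2, 2), "CGA": (4, 2), "CGG": (1, 0),
}

# Precursor lengths in amino acid residues, from the text.
TOTAL_CODONS = {"salivary gland": 511, "pancreas": 508}


def cg_codon_summary() -> dict[str, tuple[int, float]]:
    """Return {tissue: (number of CG-containing codons, percent of all codons)}."""
    summary = {}
    for column, tissue in enumerate(("salivary gland", "pancreas")):
        cg_total = sum(counts[column] for counts in CG_CODON_COUNTS.values())
        summary[tissue] = (cg_total, 100 * cg_total / TOTAL_CODONS[tissue])
    return summary


uniform_baseline = 100 * len(CG_CODON_COUNTS) / 61  # crude expectation, in percent
for tissue, (count, percent) in cg_codon_summary().items():
    print(f"{tissue}: {count} CG codons ({percent:.1f}%), baseline {uniform_baseline:.1f}%")
```

All of the CG-containing codons that do occur are arginine codons of the CGN family; the four NCG codons are not used at all in either mRNA.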

Figure 6. α-Amylase Precursor Signal Sequences

Mouse S    Met Lys Phe Phe Leu Leu Leu Ser Leu Ilu Gly Phe ...
Mouse P    Met Lys Phe Val Leu Leu Leu Ser Leu Ilu Gly Phe ...

The mouse α-amylase precursor signal peptides predicted by the pancreas and salivary gland mRNA sequences are shown below the rat salivary α-amylase precursor amino acid residues determined by Gorecki and Zeelon (1979). Amino acids common to all three signal peptides are boxed. (- - -) Undetermined amino acids; (S) salivary; (P) pancreas.

Table 2. Codon Usage in Mouse Pancreatic and Salivary Gland α-Amylase mRNAs

1     2: U             2: C             2: A             2: G             3
             S    P           S    P           S    P           S    P
U     Phe   13   10    Ser    9    7    Tyr   14   10    Cys    9    8    U
      Phe   14   14    Ser    5    5    Tyr    6    8    Cys    3    4    C
      Leu    3    3    Ser    5    7    UAA    1    1    UGA    0    0    A
      Leu    6    5    Ser    0    0    UAG    0    0    Trp   18   19    G
C     Leu    7    8    Pro    9    9    His    8    8    Arg    4    6    U
      Leu    5    4    Pro    2    2    His    5    5    Arg    2    2    C
      Leu    0    1    Pro   10    9    Gln   10    6    Arg    4    2    A
      Leu    7    8    Pro    0    0    Gln    6    7    Arg    1    0    G
A     Ilu   16   16    Thr    7    8    Asn   23   29    Ser    6    6    U
      Ilu    6    7    Thr    2    3    Asn   18   18    Ser    7    5    C
      Ilu    5    1    Thr    6   10    Lys   18   12    Arg   12   15    A
      Met   12   11    Thr    0    0    Lys    7   11    Arg    5    6    G
G     Val    9   11    Ala   18   19    Asp   19   20    Gly   12   10    U
      Val   12   12    Ala    4    4    Asp   17   15    Gly   11   11    C
      Val    3    5    Ala   11   10    Glu   12   11    Gly   25   22    A
      Val   12   11    Ala    0    0    Glu    8    8    Gly    3    4    G

The frequency of codon usage is given for pancreas (P) and salivary gland (S) α-amylases. The numbers 1, 2 and 3 designate the first, second and third residues of the codon, respectively.

The 3' Noncoding Regions

The 36 and 33 nucleotide segments which comprise the 3' nontranslated portions of pancreas and salivary gland α-amylase mRNAs, respectively, are uncharacteristically short when compared to those of other eucaryotic cellular mRNAs (Proudfoot and Brownlee, 1976; Seeburg et al., 1977; Shine et al., 1977; Ullrich et al., 1977; McReynolds et al., 1978; Nakanishi et al., 1979). Although this portion of the two α-amylase mRNAs is AU-rich (70%), there is very little apparent sequence homology. While present in the salivary gland α-amylase mRNA, the AAUAAA hexanucleotide found approximately 20 nucleotides from the 3' poly(A) tract in all eucaryotic cellular mRNAs sequenced thus far (Proudfoot and Brownlee, 1976) is absent in its pancreatic counterpart; it is replaced by AUUAAA. The functional significance of this observation is not clear. Finally, it is interesting that within the heterogeneous 3' end region, the hexanucleotide sequence AGCAUC appears adjacent to the poly(A) in both mRNAs. This hexanucleotide shares no striking homology with comparable regions of other eucaryotic mRNAs.

For clarity, we have summarized the salient features of the pancreatic and salivary gland mRNAs in Figure 7.
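As an arithmetic check on the figures quoted in the text, the three segment lengths of each mRNA can be summed and compared with the reported total. The Python sketch below is illustrative; the dictionary keys and the function name are hypothetical labels. It assumes that the termination codon is counted within the 3' segment, which is consistent with a coding frame spanning residues 94-1626 and termination at residue 1627.

```python
# Segment lengths and totals as quoted in the text (nucleotides, except
# precursor_residues, which is in amino acids).
FEATURES = {
    "pancreas": {
        "five_prime": 17, "precursor_residues": 508,
        "three_prime": 36, "reported_length": 1577,
    },
    "salivary gland": {
        "five_prime": 93, "precursor_residues": 511,
        "three_prime": 33, "reported_length": 1659,
    },
}


def mrna_length(feature: dict[str, int]) -> int:
    """5' segment + 3 nucleotides per precursor residue + 3' segment (stop codon included)."""
    return feature["five_prime"] + 3 * feature["precursor_residues"] + feature["three_prime"]


for tissue, feature in FEATURES.items():
    total = mrna_length(feature)
    print(tissue, total, total == feature["reported_length"])

# The overall length difference decomposes into the three segment differences.
length_difference = (FEATURES["salivary gland"]["reported_length"]
                     - FEATURES["pancreas"]["reported_length"])
print(length_difference, (93 - 17) + 3 * (511 - 508) + (33 - 36))
```

Of the 82 nucleotide difference between the two mRNAs, 76 nucleotides lie in the 5' noncoding region, in line with the statement that the length differences are primarily at the 5' end.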

Are Multiple α-Amylase Genes Expressed in Mouse Pancreas or Salivary Gland Tissues?

Several reports have postulated that multiple α-amylase genes are expressed in both pancreatic and salivary gland tissues (Sick and Nielsen, 1964; Kaplan et al., 1973; Nielsen, 1974, 1977). However, these postulates were based on the assumption that isoenzyme gel mobility differences reflect differences in the mRNA from which they were translated. Several observations argue that not more than one major α-amylase mRNA species accumulates in either of these two tissues. First, hybrids between α-amylase mRNA and their cloned cDNA counterparts melt cooperatively and are not digested by S1 nuclease (Schibler et al., 1980). Second, a single low mobility cDNA is generated by primer extension of total polyadenylated mRNA from either tissue (Figure 2A); Maxam and Gilbert sequence analysis of the cDNAs produces a unique sequence (Figure 2B). Lengthy X-ray film exposures of the preparative cDNA gel reveal several discrete cDNA species (all of which account for less than 10% of the total incorporated label) which are smaller than the major one we analyzed. Sequence analysis of each of the smaller species generated in the presence of pancreatic mRNA produces a unique, truncated version of the sequence determined for the major cDNA species, suggesting that the smaller cDNAs represent premature transcriptase termination products. Finally, direct sequence analysis of the pancreatic α-amylase mRNA produces a unique sequence for the 5' terminal 25 nucleotides (Figure 4). Together, these data strongly suggest that only one major species of α-amylase mRNA accumulates in the pancreas or salivary gland.

Experimental Procedures

Materials

The restriction enzyme Pst I was supplied by G. Fey (ISREC, Lausanne); other DNA endonucleases were purchased from New England Biolabs; T4 polynucleotide kinase was obtained from PL Biochemicals; and bacterial alkaline phosphatase (BAP), purified according to the method of Malamy and Horecker (1964), was a gift from J. Chlebowski (Virginia Commonwealth University). Ribonucleases T1 and U2 and nuclease P1 were purchased from Calbiochem, and RNAase H was from Enzo Biochem. γ-32P-ATP was from Amersham (spec. act. < 2000 Ci/mmole). The recombinant plasmids pMPa21 and pMSa104 were gifts from U. Schibler and M. Tosi (ISREC, Lausanne), respectively, and were handled under P3/EK1 containment. AMV reverse transcriptase was provided by J. Beard (Office of Program Resources and Logistics, Viral Cancer Program, National Cancer Institute). DNA and RNA were prepared according to the procedures of Schibler et al. (1980).

DNA Sequencing

DNA fragments were sequenced using the chemical modification procedures of Maxam and Gilbert (1977). Cleavage products were analyzed by electrophoresis on 40 cm long, 0.5 mm thick, 20, 8 and 6% polyacrylamide-7 M urea gels.

cDNA Synthesis

The primer was produced by isolating the Pst I insert of pMSa104 DNA, digesting this fragment with the restriction enzyme Hha I, 5' end-labeling with T4 polynucleotide kinase and γ-³²P-ATP, and cleaving with Hinf I. 7 pmole of primer and 5 µg of poly(A)+ pancreas mRNA [or 20 µg of poly(A)+ salivary gland mRNA] were heated for 5 min at 50°C in 45 µl of 90% formamide. 5 µl of a solution containing 4 M NaCl, 10 mM EDTA and 0.1 M PIPES (pH 6.4) were added and the sample was incubated for 1 hr at 48°C (1 hr at 54°C for salivary gland mRNA). The sample was ethanol-precipitated and resuspended in 20 µl of 5.5 mM Tris-HCl (pH 8.4), 7.5 mM KCl, 7.5 mM MgCl₂ and 10 mM dithiothreitol. dXTPs were added to 20 mM. 5 µl of AMV reverse transcriptase (6 U/µl) were added and the mixture was incubated at 42°C for 1 hr. The products were phenol-extracted, ethanol-precipitated and resuspended in 50 µl of 100 mM NaOH, 1 mM EDTA. The sample was heated to 42°C for 3 min and quick-chilled, and 50 µl of 8 M urea were added. Electrophoresis was on an 8% acrylamide TBE-urea gel, and the samples were run so that DNA species differing in length by a single nucleotide were resolved.

Figure 7. Features of the mRNAs

Diagram of the salivary gland (S) and pancreas (P) mRNAs showing the two nucleotides following the m⁷G cap; the boxed AUG and UAA codons from which initiation and termination of α-amylase synthesis occur; a salivary gland mRNA sequence containing the 5' proximal AUG from which a dipeptide may be synthesized; the nine nucleotide segment GAACUGCAA present in salivary gland but not pancreas α-amylase mRNA; and two essentially conserved hexanucleotides in the 3' noncoding region. The thick line represents translated sequences, the thin line represents the nontranslated ones. Numbers are in nucleotides.
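The length relationships summarized in Figure 7 can be checked with a little arithmetic. The sketch below is illustrative only and is not part of the original analysis. It uses the mRNA and precursor lengths quoted in the Summary, and it assumes that the quoted mRNA lengths exclude the poly(A) tail and that each reading frame ends in a single stop codon. The leader sequence `HYPOTHETICAL_LEADER` is made up to show how an AUG followed closely by UAA can yield only a dipeptide; it is not the salivary gland sequence, and all function names are hypothetical.

```python
# Lengths quoted in the Summary (nucleotides and amino acid residues).
MRNA_NT = {"pancreas": 1577, "salivary": 1659}
PRECURSOR_AA = {"pancreas": 508, "salivary": 511}
EXTRA_SEGMENT = "GAACUGCAA"  # present in salivary gland mRNA only
STOP_CODONS = {"UAA", "UAG", "UGA"}

# Made-up sequence for illustration only: AUG, one sense codon, then UAA.
HYPOTHETICAL_LEADER = "GGAUGCAAUAAGG"


def coding_nt(residues: int) -> int:
    """Nucleotides in the reading frame, counting one stop codon (assumption)."""
    return 3 * (residues + 1)


def noncoding_nt(tissue: str) -> int:
    """Nucleotides outside the reading frame (5' plus 3' noncoding regions)."""
    return MRNA_NT[tissue] - coding_nt(PRECURSOR_AA[tissue])


def residues_from_first_aug(seq: str) -> int:
    """Count codons translated from the first AUG up to the first in-frame stop."""
    start = seq.find("AUG")
    if start < 0:
        return 0
    count = 0
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            break
        count += 1
    return count


# A nine-nucleotide segment is a whole number of codons, so the frame is kept.
extra_codons, remainder = divmod(len(EXTRA_SEGMENT), 3)

if __name__ == "__main__":
    print("extra codons:", extra_codons, "frame shift:", remainder)
    for tissue in MRNA_NT:
        print(tissue, "noncoding nucleotides:", noncoding_nt(tissue))
    print("residues from hypothetical leader AUG:",
          residues_from_first_aug(HYPOTHETICAL_LEADER))
```

Under these assumptions the nine nucleotide segment adds exactly three codons without shifting the frame, which matches the difference between the 508 and 511 residue precursors, and about 50 and 123 nucleotides remain outside the reading frames of the pancreas and salivary gland mRNAs, respectively.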

mRNA Labeling

Chemical removal of mRNA caps was accomplished through a modification (R. Kamen, personal communication) of the method of Fraenkel-Conrat and Steinschneider (1968). 20 µg of pancreatic or 60 µg of salivary poly(A)-containing RNA were incubated in 45 µl of a solution containing 0.1 M sodium acetate (pH 5.0), 0.01 M EDTA and 0.01 M NaIO₄, in the dark at 0°C for 2 hr. After ethanol precipitation and a 70% ethanol wash, the oxidized RNA was resuspended in 10 µl of H₂O and mixed with 90 µl of 0.33 M aniline HCl (pH 5.0); β-elimination was allowed to proceed for 2 hr at 24°C; and the reaction was terminated by ethanol precipitation. The RNA pellet was washed with 70% ethanol to remove residual aniline HCl. The decapped mRNA was dephosphorylated by incubation in 20 µl of 20 mM Tris-HCl (pH 8.0), 0.1% SDS with 0.5 units bacterial alkaline phosphatase for 30 min at 37°C. The sample was phenol-extracted, ethanol-precipitated and washed with 70% ethanol, and its 5' terminus was labeled with γ-³²P-ATP and T4 polynucleotide kinase according to the method of Maxam and Gilbert (1977), except that the pH of the reaction was 8.0 and the incubation time was 15 min. The 5' end-labeled RNA was phenol-extracted and ethanol-precipitated twice before being subjected to electrophoresis on a 2% agarose methylmercury-hydroxide slab gel (Bailey and Davidson, 1976; Schibler, Marcu and Perry, 1978). The α-amylase mRNA band, visualized by ethidium bromide staining, was electrophoretically eluted. The eluted mRNA was purified further by oligo(dT) chromatography (Schibler et al., 1980). RNAase H digestions were performed as described by Schibler et al. (1980).
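Two of the mixing steps in these procedures give stock concentrations and volumes but not the final concentrations: the hybridization mix under cDNA Synthesis (45 µl of 90% formamide plus 5 µl of salt and buffer stock) and the aniline step above (10 µl of RNA in water plus 90 µl of 0.33 M aniline HCl). The following sketch works out the implied final values. It assumes that volumes are simply additive and that the dissolved nucleic acid contributes no volume; the helper name `final_conc` and the dictionary keys are hypothetical.

```python
def final_conc(stock_conc: float, stock_vol_ul: float, total_vol_ul: float) -> float:
    """Concentration after dilution, in the same unit as stock_conc."""
    return stock_conc * stock_vol_ul / total_vol_ul


# Hybridization mix (cDNA Synthesis): 45 ul of 90% formamide
# plus 5 ul of 4 M NaCl, 10 mM EDTA, 0.1 M PIPES (pH 6.4).
hyb_total_ul = 45 + 5
hybridization = {
    "formamide_percent": final_conc(90.0, 45, hyb_total_ul),
    "NaCl_M": final_conc(4.0, 5, hyb_total_ul),
    "EDTA_mM": final_conc(10.0, 5, hyb_total_ul),
    "PIPES_mM": final_conc(100.0, 5, hyb_total_ul),
}

# Beta-elimination (mRNA Labeling): 10 ul of oxidized RNA in water
# plus 90 ul of 0.33 M aniline HCl (pH 5.0).
aniline_M = final_conc(0.33, 90, 10 + 90)

if __name__ == "__main__":
    for name, value in hybridization.items():
        print(f"{name}: {value:.3g}")
    print(f"aniline_M: {aniline_M:.3g}")
```

With additive volumes, hybridization takes place in about 81% formamide, 0.4 M NaCl, 1 mM EDTA and 10 mM PIPES, and β-elimination in roughly 0.3 M aniline HCl.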

mRNA Sequence Analysis

5' terminal nucleotide determinations and "wandering spot" analysis of 5' end-labeled mRNA were carried out as described by Lockard and RajBhandary (1976). Analysis of P1 nuclease digestion products was accomplished with the two-dimensional chromatography system of Marcu, Schibler and Perry (1978). Rapid RNA sequence analysis was as described by Donis-Keller, Maxam and Gilbert (1977).

Acknowledgments

We are grateful to Drs. Peter Wellauer, Ueli Schibler, Mario Tosi and Bernhard Hirt for helpful discussions and support. We also thank Anne-Cécile Pittet for plasmid pMPa21 mapping data, and Sophie Cherpillod for help with the manuscript. This research was supported by a grant from the Swiss National Science Foundation.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

Received March 24, 1980


References

Bailey, J. M. and Davidson, N. (1976). Methylmercury as a reversible denaturing agent for agarose gel electrophoresis. Anal. Biochem. 70, 75-85.

Barrell, B. G. (1971). Fractionation and sequence analysis of radioactive nucleotides. Prog. Nucl. Acid Res. 2, 751-779.

Bernard, O., Hozumi, N. and Tonegawa, S. (1978). Sequences of mouse immunoglobulin light chain genes before and after somatic changes. Cell 15, 1133-1144.

Blobel, G. and Dobberstein, B. (1975). Transfer of proteins across membranes. I. Presence of proteolytically processed and unprocessed nascent immunoglobulin light chains on membrane-bound ribosomes of murine myeloma. J. Cell Biol. 67, 835-851.

Danielsson, A., Marklund, S. and Stigbrand, T. (1975). Purification and characterization of mouse pancreatic α-amylase. Int. J. Biochem. 6, 585-589.

Donis-Keller, H., Maxam, A. M. and Gilbert, W. (1977). Mapping adenines, guanines and pyrimidines in RNA. Nucl. Acids Res. 4, 2527-2538.

Fraenkel-Conrat, H. and Steinschneider, A. (1968). In Methods in Enzymology, 12, L. Grossman and K. Moldave, eds. (New York: Academic Press), pp. 243-246.

Gorecki, M. and Zeelon, E. P. (1979). Cell-free synthesis of rat parotid preamylase. J. Biol. Chem. 254, 525-529.

Habener, J. F., Rosenblatt, M., Kemper, B., Kronenberg, H. M., Rich, A. and Potts, J. T. (1978). Pre-proparathyroid hormone: amino acid sequence, chemical synthesis, and some biological studies of the precursor region. Proc. Nat. Acad. Sci. USA 75, 2616-2620.

Hagenbüchle, O., Santer, M., Steitz, J. A. and Mans, R. J. (1978). Conservation of the primary structure at the 3' end of 18S rRNA from eucaryotic cells. Cell 13, 551-563.

Hammerton, K. and Messer, M. (1971). The origin of serum amylase. Electrophoretic studies of isoamylases of the serum, liver and other tissues of adult and infant rats. Biochim. Biophys. Acta 244, 441-451.

Jay, E., Bambara, R., Padmanabhan, R. and Wu, R. (1974). DNA sequence analysis: a general, simple and rapid method for sequencing large oligodeoxyribonucleotide fragments by mapping. Nucl. Acids Res. 1, 331-353.

Jones, C. W., Rosenthal, N., Rodakis, G. C. and Kafatos, F. C. (1979). Evolution of two major chorion multigene families as inferred from cloned cDNA and protein sequences. Cell 18, 1317-1332.

Kaplan, R. D., Chapman, V. and Ruddle, F. H. (1973). Electrophoretic variation of α-amylase in two inbred strains of Mus musculus. J. Hered. 64, 155-157.

Konkel, D. A., Maizel, J. V., Jr. and Leder, P. (1979). The evolution and sequence comparison of two recently diverged mouse chromosomal β-globin genes. Cell 18, 865-873.

Kozak, M. (1978). How do eucaryotic ribosomes select initiation regions in messenger RNA? Cell 15, 1109-1123.

Lockard, R. E. and RajBhandary, U. L. (1976). Nucleotide sequences at the 5' termini of rabbit α- and β-globin mRNA. Cell 9, 747-760.

McReynolds, L., O'Malley, B. W., Nisbet, A. D., Fothergill, J. E., Givol, D., Fields, S., Robertson, M. and Brownlee, G. G. (1978). Sequence of chicken ovalbumin mRNA. Nature 273, 723-728.

Malamy, M. H. and Horecker, B. L. (1964). Release of alkaline phosphatase from cells of E. coli upon lysozyme spheroplast formation. Biochemistry 3, 1889-1893.

Marcu, K. B., Schibler, U. and Perry, R. P. (1978). The 5'-terminal sequences of immunoglobulin mRNA of a mouse myeloma. J. Mol. Biol. 120, 381-400.

Maxam, A. M. and Gilbert, W. (1977). A new method for sequencing DNA. Proc. Nat. Acad. Sci. USA 74, 560-564.

Nakanishi, S., Inoue, A., Kita, T., Nakamura, M., Chang, A. C. Y., Cohen, S. N. and Numa, S. (1979). Nucleotide sequence of cloned cDNA for bovine corticotropin-β-lipotropin precursor. Nature 278, 423-427.

Nielsen, J. T. (1974). Pancreatic amylase polymorphism in the house mouse. A model involving several loci. Hereditas 78, 322.

Nielsen, J. T. (1977). Variation in the number of genes coding for salivary amylase in the bank vole, Clethrionomys glareola. Genetics 85, 155-189.

Paterson, B. M., Roberts, B. E. and Kuff, E. L. (1977). Structural gene identification and mapping by DNA-mRNA hybrid-arrested cell-free translation. Proc. Nat. Acad. Sci. USA 74, 4370-4374.

Platt, T. and Yanofsky, C. (1974). An intercistronic region and ribosome-binding site in bacterial mRNA. Proc. Nat. Acad. Sci. USA 72, 2399-2403.

Proudfoot, N. and Brownlee, G. G. (1976). 3' non-coding region sequences in eukaryotic messenger RNA. Nature 263, 211-214.

Salser, W. (1977). Globin mRNA sequences: analysis of base pairing and evolutionary implications. Cold Spring Harbor Symp. Quant. Biol. 42, 985-1002.

Sanders, T. G. and Rutter, W. J. (1972). Molecular properties of rat pancreatic and parotid α-amylase. Biochemistry 11, 130-138.

Schibler, U., Marcu, K. B. and Perry, R. P. (1978). The synthesis and processing of the messenger RNAs specifying heavy and light chain immunoglobulins in MPC-11 cells. Cell 15, 1495-1509.

Schibler, U., Tosi, M., Pittet, A.-C. and Wellauer, P. K. (1980). Tissue specific expression of mouse α-amylase genes. J. Mol. Biol., in press.

Seeburg, P. H., Shine, J., Martial, J. A., Baxter, J. D. and Goodman, H. M. (1977). Nucleotide sequence and amplification in bacteria of structural gene for rat growth hormone. Nature 270, 488-494.

Shine, J., Seeburg, P. H., Martial, J. A., Baxter, J. D. and Goodman, H. M. (1977). Construction and analysis of recombinant DNA for human chorionic somatomammotropin. Nature 270, 494-499.

Sick, K. and Nielsen, J. T. (1964). Genetics of amylase isozymes in the mouse. Hereditas 51, 291-296.

Smith, H. O. and Birnstiel, M. L. (1976). Simple method for DNA restriction site mapping. Nucl. Acids Res. 3, 2387-2398.

Takeuchi, T., Matsushima, T. and Sugimura, T. (1975). Electrophoretic and immunological properties of liver α-amylase of well-fed and fasted rats. Biochim. Biophys. Acta 403, 122-130.

Ullrich, A., Shine, J., Chirgwin, J., Pictet, R., Tischer, E., Rutter, W. J. and Goodman, H. M. (1977). Rat insulin genes: construction of plasmids containing the coding sequences. Science 196, 1313-1319.

Van Ooyen, A., van den Berg, J., Mantei, N. and Weissmann, C. (1979). Comparison of total sequence of a cloned rabbit β-globin gene and its flanking regions with a homologous mouse sequence. Science 206, 337-344.