SEQUENCE COMPARISON OF HUMAN PLASMA α1-ACID GLYCOPROTEIN AND THE IMMUNOGLOBULINS



Int. J. Peptide Protein Res. 11, 1978, 42-48. Published by Munksgaard, Copenhagen, Denmark. No part may be reproduced by any process without written permission from the author(s).


K. SCHMID, J. EMURA, M.F. SCHMID, R.L. STEVENS and R.B. NIMBERG

Department of Biochemistry, Boston University School of Medicine, Boston University Medical Center, Boston, U.S.A.

Received 10 May, accepted for publication 8 July 1977

The amino acid sequence of human plasma α1-acid glycoprotein, upon comparison with the sequences of other blood proteins, was shown to possess significant similarity with the immunoglobulins. Employing direct and corrected sequence identity, the average mutation value and two different computer comparisons for the evaluation of sequence similarity, the following two regions of this α1-globulin, which account for approximately half of the total amino acid sequence of the protein, were found to possess sequence similarity with the immunoglobulins: a) the region from residues 77 through 125 proved to be related to the variable region of several human H and L chains, and b) the region from residues 136 through 166 was found to be related not only to the constant region of a human and a mouse L chain but also to the third and fourth constant region of a rabbit and a human H chain, respectively. These results suggest that α1-acid glycoprotein is probably related to the immunoglobulins and further suggest that it possibly diverged from the immunoglobulin evolutionary tree prior to the formation of the primitive L chain.

Key words: α1-acid glycoprotein; computer methods; immunoglobulins; sequence comparison

α1-Acid glycoprotein (AGP*), a human plasma globulin (for review see Schmid, 1975) whose linear amino acid sequence has recently been established (Schmid et al., 1973), is one of the most extensively characterized blood proteins. Earlier, it was demonstrated that this glycoprotein occurs as two major, genetically transmitted variants (Schmid et al., 1965; Johnson et al., 1969) whose electrophoretic differences can be accounted for by a single amino acid substitution (Nimberg et al., 1971). In addition, a large number of other amino acid replacements have been elucidated (Schmid et al., 1973). The presence of multiple amino acid substitutions suggested a possible relationship between this glycoprotein and the immunoglobulins (Schmid et al., 1973).

* The abbreviation used was: AGP, α1-acid glycoprotein.

In the present paper the results of an analysis for sequence similarity between AGP and the immunoglobulins are described. A hypothesis regarding the origin of this α1-globulin is also presented.

MATERIAL AND METHODS

AGP was isolated from pooled normal human plasma according to a procedure described earlier (Bürgi & Schmid, 1961). The apparent homogeneity of the resulting protein preparation was established by physicochemical, chemical and immunochemical criteria (Schmid, 1975).


TABLE 1
Sequence similarity between human plasma α1-acid glycoprotein (AGP) and the variable and constant regions of the L and H chains of various mammalian immunoglobulins

Alignment  Group  AGP        Immunoglobulin(a)  Segment     Direct    Corrected  Average   Gap events(b) required
#                 (residue)  type               (residue)   identity  identity   mutation  for alignment
                                                            (%)       (%)        value     AGP   Immunoglobulin

1          A      77-99      Roy κVLI           84-105      30        60         1.00      1     1
2          A      77-99      He γ1 VHII         95-118(c)   44        74         0.83      1     0
3          A      101-125    Ha λVLI            48-72       36        64         1.00      0     0

4          B      136-163    Rabbit γCH         288-316(c)  29        72         1.04      1     0
5          B      136-163    Mouse 104E λCL     126-157     39        68         0.93      1     0
6          B      136-166    Kern λCL           122-153     32        71         0.97      1     0
7          B      136-166    Eu γ1 CH           356-388     26        68         1.06      2     0

(a) These sequences were taken from Dayhoff (1972) (see also Fig. 1).
(b) A gap event is defined as the deletion or insertion of one or more amino acid residues (Dayhoff, 1972) (see also Fig. 1).
(c) Evaluated by the computer procedures A (Table 2) and B (Fig. 2 and Table 2). For further information see text.

The sequence comparison between AGP and the immunoglobulins was carried out as follows:

1. The linear amino acid sequence of this α1-globulin was aligned by visual comparison with those of the immunoglobulins so that segments with the maximum number of identical residues and the smallest number of deletions were obtained. Twenty-two immunoglobulins were compared visually with AGP, and the seven immunoglobulins that exhibited the highest degree of sequence similarity with this α1-globulin were chosen for this presentation. The degree of sequence similarity between each pair of peptide segments was then expressed in terms of percentage of identical residues (direct sequence identity, Table 1), percentage of identical residues resulting after allowing one mutation per amino acid codon (corrected sequence identity, Table 1) and the average number of mutations per amino acid residue required to obtain identical sequences (average mutation value (Dayhoff, 1969), Table 1).

2. The two regions of AGP that exhibited a high degree of direct and corrected sequence similarity and a low average mutation value (Table 1) were further analyzed by two computer methods. In procedure A (Jukes & Cantor, 1969), for which a "comparison length" of 6 residues was used, the criterion of sequence similarity is the minimum number of base changes in the amino acid codons required to obtain sequence identity. The degree of sequence similarity is evaluated by a probability ratio, which is the quotient of the observed frequency (of a specified number of base changes per "6 residue-comparison length") to the frequency expected if the two sequences were completely unrelated. Thus, unrelated sequences are characterized by values of the probability ratio close to 1.0. Values above unity indicate similarity, and the degree of relatedness is indicated by the numerical size of the probability ratio. To further substantiate the sequence relatedness between AGP and certain immunoglobulins, a second, independent computer technique (Fitch, 1966) (procedure B) for evaluating the degree of sequence similarity between two proteins was employed**.

** Our computer program is based on the equation published by Fitch (1966) and is written in Basic (unpublished data).
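The three measures obtained from the visual alignment in step 1 can be illustrated with a short sketch. This is not the authors' procedure: it assumes the standard genetic code, takes the number of mutations between two residues to be the fewest base changes between any of their codons, and ignores positions where either sequence carries a gap character "-". The function names and the peptides in the usage note are hypothetical.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G ('*' marks stop codons).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {}
for codon, aa in zip(("".join(c) for c in product(BASES, repeat=3)), AMINO):
    CODONS.setdefault(aa, []).append(codon)


def min_base_changes(aa1, aa2):
    """Fewest single-base changes that turn a codon for aa1 into one for aa2."""
    return min(
        sum(x != y for x, y in zip(c1, c2))
        for c1 in CODONS[aa1]
        for c2 in CODONS[aa2]
    )


def similarity_metrics(seq_a, seq_b):
    """Return (direct identity %, corrected identity %, average mutation value).

    seq_a and seq_b are equal-length aligned strings in one-letter code.
    Positions holding a gap '-' in either sequence are skipped (an
    assumption of this sketch, not a rule stated in the paper).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    changes = [
        min_base_changes(a, b)
        for a, b in zip(seq_a, seq_b)
        if a != "-" and b != "-"
    ]
    if not changes:
        raise ValueError("no comparable positions")
    n = len(changes)
    direct = 100.0 * sum(c == 0 for c in changes) / n      # identical residues
    corrected = 100.0 * sum(c <= 1 for c in changes) / n   # allow one base change
    return direct, corrected, sum(changes) / n
```

For the hypothetical pair MKWA and MRWA, `similarity_metrics` gives 75% direct identity, 100% corrected identity and an average mutation value of 0.25, because K and R differ by a single base change (AAA to AGA).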


TABLE 2
Sequence relatedness between human plasma α1-acid glycoprotein and the immunoglobulins as compared with that of human hemoglobin(a)

Sequences compared (residue numbers)                           Number of base changes   Probability
                                                               per 6 residue length     ratio

α1-Acid glycoprotein (77-99) vs. He γ1 VHIII (95-118)          3                        7.9
α1-Acid glycoprotein (136-163) vs. rabbit γCH (288-316)        2                        60.0
Human hemoglobin α chain (1-29) vs. β chain (1-29)             2                        6.7

(a) For this study, procedure A was employed (see Methods).
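The probability ratio of procedure A can be made concrete with a small sketch. It is not the program of Jukes & Cantor (1969), whose details are not given here. It rests on three assumptions: the two segments are ungapped and of equal length, the observed frequency is the fraction of aligned 6-residue windows needing at most a chosen number of base changes, and the frequency expected for unrelated sequences comes from pairing residues of the two segments at random. All function names are hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code, bases ordered T, C, A, G ('*' marks stop codons).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {}
for codon, aa in zip(("".join(c) for c in product(BASES, repeat=3)), AMINO):
    CODONS.setdefault(aa, []).append(codon)


def min_base_changes(aa1, aa2):
    """Fewest single-base changes between any codon of aa1 and any of aa2."""
    return min(
        sum(x != y for x, y in zip(c1, c2))
        for c1 in CODONS[aa1]
        for c2 in CODONS[aa2]
    )


def window_changes(seq_a, seq_b, length=6):
    """Total minimum base changes in each aligned window of `length` residues."""
    per_site = [min_base_changes(a, b) for a, b in zip(seq_a, seq_b)]
    return [sum(per_site[i:i + length]) for i in range(len(per_site) - length + 1)]


def null_distribution(seq_a, seq_b, length=6):
    """Distribution of window totals when residues are paired at random.

    Assumed null model: each site is an independent draw of one residue
    from seq_a and one from seq_b.
    """
    pairs = Counter(min_base_changes(a, b) for a in seq_a for b in seq_b)
    n_pairs = sum(pairs.values())
    site = {k: v / n_pairs for k, v in pairs.items()}
    dist = {0: 1.0}
    for _ in range(length):  # convolve the single-site distribution `length` times
        nxt = Counter()
        for (total, p), (k, q) in product(dist.items(), site.items()):
            nxt[total + k] += p * q
        dist = dict(nxt)
    return dist


def probability_ratio(seq_a, seq_b, max_changes, length=6):
    """Observed / expected frequency of windows with <= max_changes base changes."""
    totals = window_changes(seq_a, seq_b, length)
    observed = sum(t <= max_changes for t in totals) / len(totals)
    null = null_distribution(seq_a, seq_b, length)
    expected = sum(p for t, p in null.items() if t <= max_changes)
    if expected == 0:
        raise ValueError("expected frequency is zero for this threshold")
    return observed / expected
```

Under this null model a ratio near 1.0 means the aligned windows are no more alike than randomly paired residues, which matches the interpretation given in the Methods; the absolute values need not match those of Table 2.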

This method is based on the measurement of the minimum number of nucleotides that must be changed in order to convert one amino acid sequence into another, in other words, the mutations required (MR) to convert one sequence into another. For this computer analysis the total amino acid sequence of AGP was used. The results, when plotted as mutations required versus cumulative frequency of MR (%), yielded a departure from linearity in the form of a shoulder to the left in the lower part of the curve (Fig. 2). If, however, no relationship exists between two proteins, a straight line is obtained.
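The tabulation behind such a plot can be sketched as follows. This is an illustration of the idea only, not the authors' Basic program nor the exact equation of Fitch (1966): it assumes the standard genetic code, compares every segment of a fixed length from one sequence with every segment of the same length from the other, and reports the cumulative percentage of segment pairs at each MR value. Plotting on a probability scale is left out, and the function names are hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code, bases ordered T, C, A, G ('*' marks stop codons).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {}
for codon, aa in zip(("".join(c) for c in product(BASES, repeat=3)), AMINO):
    CODONS.setdefault(aa, []).append(codon)


def min_base_changes(aa1, aa2):
    """Fewest single-base changes between any codon of aa1 and any of aa2."""
    return min(
        sum(x != y for x, y in zip(c1, c2))
        for c1 in CODONS[aa1]
        for c2 in CODONS[aa2]
    )


def mr_distribution(seq_a, seq_b, span=15):
    """Count MR values over all pairings of span-long segments of the two sequences."""
    # Cache the per-residue-pair cost so each pair is evaluated only once.
    site = {(a, b): min_base_changes(a, b) for a in set(seq_a) for b in set(seq_b)}
    counts = Counter()
    for i in range(len(seq_a) - span + 1):
        for j in range(len(seq_b) - span + 1):
            mr = sum(site[seq_a[i + k], seq_b[j + k]] for k in range(span))
            counts[mr] += 1
    return counts


def cumulative_percent(counts):
    """Return [(MR, cumulative % of segment pairs with MR at or below it)]."""
    total = sum(counts.values())
    running, out = 0, []
    for mr in sorted(counts):
        running += counts[mr]
        out.append((mr, 100.0 * running / total))
    return out
```

With the hypothetical peptides MKW and MRW and a span of 2, the four segment pairs need 1, 3, 2 and 1 base changes, so the cumulative percentages are 50% at MR 1, 75% at MR 2 and 100% at MR 3. An excess of pairs at low MR is what appears as the shoulder described above.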

RESULTS

Two polypeptide regions of AGP, which together account for 44% of the total amino acid sequence of this globulin, revealed significant degrees of sequence similarity with the variable and constant regions of several immunoglobulins.

As seen in Table 1 and Fig. 1, the first region of AGP (residues 77 through 125) exhibited a high degree of relatedness with certain sections of the variable regions of both L and H chains (Group A, Fig. 1). Regarding the direct sequence similarity, it should be noted that for one alignment (residues 77 through 99, alignment #2, Table 1) almost half of the residues were identical and the resulting average mutation value was 0.83†. The second portion of this first region (residues 101 through 125, alignment #3, Table 1) also distinguishes itself by a relatively high percentage of identical residues, a high degree of corrected similarity and a low average mutation value.

† The average mutation value of unrelated proteins is 1.45 and that of closely related proteins is 0.77 (Fitch, 1966).

The results of the computer analysis (procedure A) of the sequence comparison of alignment #2 (Fig. 1) confirmed that this segment of AGP exhibited a significant degree of relatedness with the variable region of the heavy chain of human immunoglobulin He (Table 2). The probability ratio of this comparison was calculated to be 7.9. This value is extremely significant since a comparison by the same procedure of the first 30 residues of the α- and β-chains of human hemoglobin yielded a value of 6.7.

Evaluation of the above-mentioned sequence alignment using computer method B substantiated the degree of similarity between the two protein segments (Fig. 2A) and confirmed that the relatedness indeed starts at residue 77 of AGP (Table 3). The curve indicated by the points in Fig. 2A deviated significantly from the straight line which would have been expected if no similarity existed. For comparison, the α- and β-chains of human hemoglobin analyzed by this same technique yielded a curve that also deviated from the straight line. It is important to note, as Edelman (1970) has shown, that if one compares the variable or the constant regions of certain immunoglobulins with each other, patterns very similar to those shown in Figs. 2A and 2B are obtained.

The second region of AGP (residues 136 through 166) was found to possess a sequence similarity with the constant regions of a λ-type human and a mouse L chain and with the third constant region of a rabbit and the fourth constant region of a human H chain (Group B, Table 1). A significant degree of relatedness in terms of direct and corrected sequence identity and of average mutation value was established in these four comparisons (alignments 4, 5, 6 and 7, Table 1). Using computer procedure A, an inexplicably high probability ratio of 60 for AGP versus a specific rabbit immunoglobulin was obtained (Table 2). The sequence similarity of this latter alignment was further substantiated by computer method B, which again confirmed the degree of sequence similarity between these two protein segments (Fig. 2B) and demonstrated further that this region of AGP begins at residue 136.

FIGURE 1
Alignments of the amino acid sequence of α1-acid glycoprotein with the amino acid sequences of the variable and constant regions of the L and H chains of selected immunoglobulins(a).

[Aligned sequences not recoverable from the scan. Group A: α1-acid glycoprotein with alignment 1, Roy VLI; alignment 2, He VHII(b); alignment 3, Ha VLI. Group B: α1-acid glycoprotein with alignment 4, rabbit CH(b); alignment 5, mouse 104E CL; alignment 6, Kern CL; alignment 7, Eu CH.]

(a) The amino acid sequences of the immunoglobulins indicated in alignments 1 to 7 were established by Hilschmann et al. (1969), Cunningham et al. (1969), Shinoda et al. (1970), Hill et al. (1967), Appella (1971), Ponstingl et al. (1968) and Rutishauser et al. (1970), respectively. For sequence similarity see Table 1.
(b) Evaluated by the computer procedures A and B for sequence similarity (Fig. 2 and Table 2). For further information see text.

TABLE 3
Frequency of occurrence of register shifts for α1-acid glycoprotein versus a segment of the H chain of a certain human immunoglobulin obtained by computer procedure B(a)

Register shift   Frequency of occurrence   Range of coordinates of α1-acid glycoprotein

   7              1                        3
  -1              5                        2, 4, 5, 6, 8
  -3              2                        10, 12
 -13              6                        14, 17 to 20, 22
 -23              6                        28 to 33
 -66              3                        74 to 76
 -68              2                        69, 70
 -77(b)          10(b)                     78 to 87
-100              1                        108
-115              6                        116, 117, 122 to 125
-126              1                        128
-144              1                        154
-145              2                        153, 154
-148              5                        150 to 154

(a) The length of segment examined was 15 for this comparison. The H chain refers to alignment #2 in Fig. 1 and Table 1. For further explanation see text.
(b) The high frequency of occurrence of register shifts indicates that the sequence relatedness starts at residue 77.
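A tally of the kind shown in Table 3 can be sketched as follows. How the authors selected the segment pairs that enter the tally is not stated here, so this sketch makes its own assumptions: a segment pair counts when its MR is at or below a chosen cutoff, and the register shift is the start position in the second sequence minus the start position in the first. The standard genetic code is assumed and the function names are hypothetical.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code, bases ordered T, C, A, G ('*' marks stop codons).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {}
for codon, aa in zip(("".join(c) for c in product(BASES, repeat=3)), AMINO):
    CODONS.setdefault(aa, []).append(codon)


def min_base_changes(aa1, aa2):
    """Fewest single-base changes between any codon of aa1 and any of aa2."""
    return min(
        sum(x != y for x, y in zip(c1, c2))
        for c1 in CODONS[aa1]
        for c2 in CODONS[aa2]
    )


def register_shifts(seq_a, seq_b, span=15, max_mr=0):
    """Group low-MR segment pairs by register shift.

    Returns {shift: [1-based start positions in seq_a]}, where shift is the
    start in seq_b minus the start in seq_a. The max_mr cutoff is an
    assumption of this sketch.
    """
    hits = defaultdict(list)
    for i in range(len(seq_a) - span + 1):
        for j in range(len(seq_b) - span + 1):
            mr = sum(min_base_changes(seq_a[i + k], seq_b[j + k]) for k in range(span))
            if mr <= max_mr:
                hits[j - i].append(i + 1)
    return dict(hits)
```

The length of each list is the frequency of occurrence of that shift. A shift that recurs over a run of consecutive start positions, as in the row marked (b) of Table 3, points to a genuinely aligned stretch rather than a chance match.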

FIGURE 2
Probability plots of cumulative MR frequencies of α1-acid glycoprotein versus a segment of an H chain of IgG (alignment #3, Table 1) (Fig. 2A) and versus the third constant region of an H chain of IgG (alignment #6, Table 1) (Fig. 2B). The length of segment examined was 15 residues. [Plots not recoverable from the scan; one axis is labelled "Mutations required (MR)".]

DISCUSSION

The present study demonstrates that approximately half of the amino acid sequence of AGP possesses a significant degree of similarity with the immunoglobulins. Pertinent to this finding are two observations of other investigators made in similar comparative sequence studies between various immunoglobulins (Gottlieb et al., 1968; Edelman, 1970; Putnam et al., 1972; Hood, 1972; Todd, 1972). First, a large variation in the degree of relatedness does exist between the immunoglobulins of a given species. Therefore, one cannot expect that all human immunoglobulins exhibit a close structural relationship with AGP. The sequences of the 22 immunoglobulins examined indeed revealed a wide range of degrees of relatedness with AGP. Secondly, certain immunoglobulins of different species may possess close structural relatedness. Hence, our observations that AGP is related to the immunoglobulins of different species (human, mouse and rabbit) and that there is a large variation in the degree of sequence similarity of this α1-globulin with the human immunoglobulins are in agreement with the two mentioned observations (Hill et al., 1966; Edelman, 1970; Shinoda et al., 1970; Dayhoff, 1972; Peterson et al., 1972).

The multiple amino acid substitutions of AGP (Schmid et al., 1973), a property shared as far as is known only by the immunoglobulins, also support the concept of an evolutionary relationship between this protein and the immunoglobulins. Of the 21 amino acid replacements of AGP (Schmid et al., 1973), 12 are located in the segment that has a sequence similarity to the variable region of the immunoglobulins and three are located in the segment that has a sequence similarity to the constant regions of the immunoglobulins.

Regarding the origin of this α1-globulin, our finding that AGP possesses two peptide segments, one of which is related to the variable and the other to the constant regions of the L and H chains, strongly suggests that the divergence of this α1-globulin from the immunoglobulins most likely took place before the triplication of the constant region of the primitive L chain (Ponstingl et al., 1968; Dayhoff, 1969; Cunningham et al., 1969; Appella, 1971). Thus, the present data suggest that this glycoprotein probably evolved from the immunoglobulin evolutionary tree by diverging from a precursor L chain before the formation of the primitive L chain (Ponstingl et al., 1968; Dayhoff, 1969; Cunningham et al., 1969; Appella, 1971). Moreover, it appears that the evolution from a primitive immunoglobulin to AGP involved such changes in its structure and, hence, in its function, that the present day AGP does not exhibit, as far as known, any of the biological properties of the immunoglobulins.

In this study three independent techniques to establish the sequence relatedness between AGP and the immunoglobulins were employed. The direct and corrected sequence similarity and the average mutation value were found to be in excellent agreement with the results obtained from two computer techniques based on different principles. Based on the present comparison, it would appear, however, that the results derived from the visual comparative study are as useful as those obtained from the more complex computer analyses.

ACKNOWLEDGMENTS

The authors are very grateful to Dr. Howard Steinman, Department of Biochemistry, Duke University Medical Center, Durham, NC, for computer analyses with procedure A. This study was supported by a grant from the National Institute of General Medical Sciences (GM-10374), United States Public Health Service.

REFERENCES

Appella, E. (1971) Proc. Natl. Acad. Sci. U.S.A. 68, 590-594

Bürgi, W. & Schmid, K. (1961) J. Biol. Chem. 236, 1066-1074

Cunningham, B.A., Pflumm, M.N., Rutishauser, U. & Edelman, G.M. (1969) Proc. Natl. Acad. Sci. U.S.A. 64, 997-1003

Dayhoff, M. (1969) Atlas of Protein Sequence and Structure, vol. 4, National Biomedical Research Foundation, Silver Spring, Maryland

Dayhoff, M.O. (1972) Atlas of Protein Sequence and Structure, vol. 5, National Biomedical Research Foundation, Silver Spring, Maryland

Edelman, G.M. (1970) Ann. Rev. Genetics 6, 1-46

Edelman, G.M. (1970) Biochemistry 16, 3197-3205

Fitch, W. (1966) J. Mol. Biol. 16, 9-16, 17-27

Gottlieb, P.D., Cunningham, B.A., Waxdahl, M.J., Konigsberg, W.H. & Edelman, G.M. (1968) Proc. Natl. Acad. Sci. U.S.A. 61, 168-175

Hill, R.L., Delaney, R., Fellows, R.E. & Lebowitz, H.E. (1966) Proc. Natl. Acad. Sci. U.S.A. 56, 1762-1769

Hill, R.L., Lebovitz, H.E., Fellows, R.E., Jr. & Delaney, R. (1967) in Gamma Globulins, Nobel Symposium 3 (Killander, J., ed.), pp. 109-127, Almquist & Wiksell, Stockholm

Hilschmann, N., Barnikol, H.U., Hess, M., Langer, B., Ponstingl, H., Steinmetz-Kayne, M., Suter, L. & Watanabe, S. (1969) in Proc. 5th FEBS Meeting (Franek, F. & Shugar, D., eds.), vol. 15, pp. 57-74, Academic Press, N.Y.

Hood, L.E. (1972) Fed. Proc. 31, 177-187

Johnson, M.A., Schmid, K. & Alper, C.A. (1969) J. Clin. Invest. 48, 2293-2299

Jukes, T.H. & Cantor, R.C. (1969) in Mammalian Protein Metabolism (Munro, H.N., ed.), vol. 3, pp. 121-132, Academic Press, New York

Nimberg, R.B., Motoyama, T. & Schmid, K. (1971) J. Biol. Chem. 246, 5817-5821

Peterson, P.A., Cunningham, B.A., Berggard, I. & Edelman, G.M. (1972) Proc. Natl. Acad. Sci. U.S.A. 69, 1697-1701

Ponstingl, H., Hess, M. & Hilschmann, N. (1968) Hoppe-Seyler's Z. Physiol. Chem. 349, 867-871

Putnam, F.W., Shimizu, K., Paul, C. & Shinoda, T. (1972) Fed. Proc. 31, 193-205

Rutishauser, U., Cunningham, B.A., Bennett, C., Konigsberg, W.H. & Edelman, G.M. (1970) Biochemistry 9, 3171-3181

Schmid, K. (1975) in The Plasma Proteins (Putnam, F.W., ed.), pp. 183-228, Academic Press, New York

Schmid, K., Kaufmann, H., Isemura, S. & Bauer, F. (1973) Biochemistry 12, 2711-2724

Schmid, K., Tokita, K. & Yoshizaki, H. (1965) J. Clin. Invest. 44, 1394-1401

Shinoda, T., Titani, K. & Putnam, F.W. (1970) J. Biol. Chem. 245, 4475-4481

Todd, C.W. (1972) Fed. Proc. 31, 188-192

Address:
Dr. Karl Schmid
Department of Biochemistry
Boston University School of Medicine
80 East Concord Street
Boston, MA 02118
U.S.A.