Protein evolution The role of domains Alice Skoumalová.

Post on 12-Jan-2016

Protein evolution

The role of domains

Alice Skoumalová

Definition of a domain

an independent structural, functional and evolutionary unit

1. Structural unit

Self-stabilizing, locally folded region of tertiary structure

Combination of motifs (α-helix and β-sheet)

Most proteins have two or more domains

2. Functional unit

Various functions: ligand binding, membrane transit, catalytic activity, DNA binding, protein-protein interaction, etc.

An independent function, or cooperation with other domains

3. Evolutionary unit

The relationship of proteins (formation of superfamilies)

The "family tree"

SCOP (Structural Classification of Proteins) lists 1200 protein superfamilies

The creation of new proteins

duplication, divergence and recombination of domains

new function (sequence divergence, combining with other domains)

This mechanism creates new proteins by combining existing protein domains (no new genes are needed to form new proteins)

The recombination of domains

Two generic principles:

Syntactical shift: a domain can perform the same function in different protein contexts (with different partner domains)

Semantic shift: a domain can diverge and acquire a novel or modified function

Transcription factor FadR

WHD(Winged helix domain)

Oligomerisation/CoA-binding domain

Restriction endonuclease FokI

WHD

Catalytic domain

Human methionine aminopeptidase

WHD

Creatinase/aminopeptidase domain
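The three example proteins above share the winged helix domain but pair it with different partner domains. A minimal sketch of that comparison follows, assuming each protein is represented as a set of the domain names listed on the slide. The dictionary layout and the function names are illustrative and not part of the original slides.

```python
# Domain composition of the three example proteins, as listed on the slide.
# Sets are used because the slide does not state the order of the domains.
PROTEINS = {
    "Transcription factor FadR": {"WHD", "Oligomerisation/CoA-binding domain"},
    "Restriction endonuclease FokI": {"WHD", "Catalytic domain"},
    "Human methionine aminopeptidase": {"WHD", "Creatinase/aminopeptidase domain"},
}


def shared_domains(proteins):
    """Return the domains present in every protein."""
    return set.intersection(*proteins.values())


def partner_domains(proteins, domain):
    """Map each protein containing `domain` to its other domains."""
    return {
        name: domains - {domain}
        for name, domains in proteins.items()
        if domain in domains
    }


if __name__ == "__main__":
    print("Shared:", shared_domains(PROTEINS))
    for name, partners in partner_domains(PROTEINS, "WHD").items():
        print(name, "->", sorted(partners))
```

The shared set holds the recombined domain, and the per-protein remainder shows the different contexts in which it appears.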

The creation of new proteins by domain recombination (an example)

Proteins that participate in haemostasis

form a superfamily of related proteins (duplication, acquisition or deletion of specific domains)

contain a domain that is homologous to trypsin (they share a common ancestor with trypsin)

the family tree can be generated from 7 gene modules

Figure: family tree of the haemostatic proteins, built from gene modules

Modules: P = trypsin-like serine protease (the ancestral protein, parent of all proteins in the tree); K = kringle (a module that codes for the structure called a kringle); E = EGF domain; F1 = fibronectin domain 1; F2 = fibronectin domain 2

Steps shown: kringle addition, EGF domain addition, fibronectin domain 2 addition, fibronectin domain 1 addition, EGF domain duplication, kringle duplication

Proteins shown: urokinase, t-PA, factor XII

Figure (continued): the family tree extended with further modules

Additional modules: Pr = propeptide; C = calcium binding domain

Steps shown: propeptide addition, calcium binding domain addition, kringle duplication, kringle deletion, addition of 2 EGF domains

Proteins shown: prothrombin, factors VII, IX and X, protein C, urokinase, t-PA, factor XII

Figure (continued): the family tree extended by kringle duplications

Steps shown: kringle duplication, repeated kringle duplication

Proteins shown: hepatocyte growth factor, plasminogen, apolipoprotein (a) (consists of 40 kringles), urokinase, t-PA, factor XII, prothrombin, factors VII, IX and X, protein C

From the example above we can deduce:

The relationship of the haemostatic proteins illustrates a universal principle of new protein creation

Simple arithmetic operations on gene modules (addition, duplication, deletion) create new proteins with different properties
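The "arithmetic" on gene modules can be sketched as operations on a list of module labels. This is a minimal illustration, assuming an architecture is an ordered list of one-module labels such as P (protease) and K (kringle) from the figure. The function names are hypothetical, and the sequence of steps in the usage example is illustrative only; it is not claimed to be the lineage of any particular protein.

```python
def add_module(architecture, module, position=0):
    """Return a new architecture with `module` inserted at `position`."""
    return architecture[:position] + [module] + architecture[position:]


def duplicate_module(architecture, module):
    """Return a new architecture with the first `module` doubled."""
    i = architecture.index(module)
    return architecture[: i + 1] + [module] + architecture[i + 1:]


def delete_module(architecture, module):
    """Return a new architecture with the first `module` removed."""
    i = architecture.index(module)
    return architecture[:i] + architecture[i + 1:]


if __name__ == "__main__":
    ancestor = ["P"]                      # trypsin-like serine protease
    step1 = add_module(ancestor, "K")     # kringle addition
    step2 = duplicate_module(step1, "K")  # kringle duplication
    step3 = delete_module(step2, "K")     # kringle deletion
    print(step1, step2, step3)
```

Each function returns a new list rather than modifying its input, so an ancestral architecture can serve as the starting point for several descendant branches.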

Summary

There is no simple relation of 1 gene to 1 protein

One gene can produce several proteins (various conformations, various domain recombinations)

Duplication, divergence and recombination of domains are crucial for protein creation (there is no need for new genes to form new proteins)

An example is the relationship of the proteins participating in haemostasis

Proteomics

What is proteomics?

The large-scale study of proteins

Proteome

All proteins produced by an organism

The human body contains millions of proteins

One organism has different protein expression in different parts of its body, at different stages of its life cycle and under different environmental conditions

Genome

All genes in the DNA of an organism

The human genome contains 20,000-25,000 genes

The genome is a constant entity

From genome to proteome: expression + posttranslational modification + alternative splicing + alternative folding

Proteomics is to the proteome what genomics is to the genome; the word proteome combines PROTEin and genOME

Increase in protein diversity

Primary transcript (mRNA before posttranscriptional modification)

Posttranslational modification

Alternative splicing

Alternative folding

Posttranslational modification

The chemical modification of a protein after its translation

1. Addition of functional groups (acetate, phosphate, lipids, carbohydrates)

2. Modification of amino acids

3. Structural changes (the formation of disulfide bridges, proteolytic cleavage)

Alternative splicing

Alternative splicing of a pre-mRNA transcribed from one gene can lead to different mature mRNA molecules and therefore to different proteins
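How one gene can yield several mature mRNAs can be sketched by enumerating exon combinations. This is a simplified illustration that assumes only one mode of alternative splicing (optional exons that are either included or skipped). The gene, the exon names and the function name are hypothetical.

```python
from itertools import product


def splice_isoforms(exons):
    """Enumerate mature mRNAs from a list of (name, is_constitutive) exons.

    Constitutive exons are always kept; each optional exon is either
    included or skipped. Exon order is preserved.
    """
    optional = [name for name, constitutive in exons if not constitutive]
    isoforms = []
    for choices in product([True, False], repeat=len(optional)):
        included = dict(zip(optional, choices))
        isoforms.append(
            [name for name, constitutive in exons
             if constitutive or included[name]]
        )
    return isoforms


if __name__ == "__main__":
    # Hypothetical gene: exons E1 and E4 constitutive, E2 and E3 optional.
    gene = [("E1", True), ("E2", False), ("E3", False), ("E4", True)]
    for isoform in splice_isoforms(gene):
        print("-".join(isoform))
```

With two optional exons the single hypothetical gene gives four distinct mature mRNAs, which is the sense in which splicing multiplies protein diversity beyond the gene count.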

Alternative folding

Protein folding proceeds from a disordered state to progressively more ordered conformations corresponding to lower energy levels

Global minimum (native state)

Local minimum (alternative conformation)
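The difference between a global and a local energy minimum can be illustrated with a toy one-dimensional landscape. This is only a sketch: the energy function, the single coordinate x and the step size are arbitrary assumptions chosen to give two minima, and they do not model any real protein.

```python
def energy(x):
    """Toy energy landscape with two minima (arbitrary units)."""
    return x**4 - 3 * x**2 + x


def gradient(x):
    """Derivative of energy(x)."""
    return 4 * x**3 - 6 * x + 1


def fold(x, step=0.01, iterations=5000):
    """Follow the energy downhill from the starting conformation x."""
    for _ in range(iterations):
        x -= step * gradient(x)
    return x


if __name__ == "__main__":
    native = fold(-2.0)      # settles in the deeper (global) minimum
    alternative = fold(2.0)  # settles in the shallower (local) minimum
    print(native, energy(native))
    print(alternative, energy(alternative))
```

Both runs only ever move downhill, yet they end in different states depending on the starting point; the run that starts on the right-hand side becomes trapped in the local minimum, the toy counterpart of an alternative conformation.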

Basic proteomic analysis scheme

1. Separation (2D-PAGE): protein mixture to individual proteins

2. Spot cutting and trypsin digestion: individual proteins to peptides

3. Mass analysis (mass spectrometry): peptides to peptide masses

4. Sequence analysis (peptide fragmentation): sequence information

5. Database search: protein identification
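The digestion, mass analysis and database search steps of the scheme can be sketched in silico. This is a minimal illustration under stated assumptions: trypsin is modelled with the common rule of cleavage after K or R unless the next residue is P, peptides are neutral and unmodified, and the two-entry database and its sequences are hypothetical. Real search engines use far more elaborate scoring.

```python
import re

# Monoisotopic residue masses in daltons (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def trypsin_digest(sequence):
    """Cleave after K or R, except when the next residue is P.

    Splitting on a zero-width pattern requires Python 3.7 or later.
    """
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]


def peptide_mass(peptide):
    """Monoisotopic mass of a neutral, unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def identify(observed_masses, database, tolerance=0.01):
    """Return the best-matching protein name and all match counts."""
    scores = {}
    for name, sequence in database.items():
        theoretical = [peptide_mass(p) for p in trypsin_digest(sequence)]
        scores[name] = sum(
            any(abs(m - t) <= tolerance for t in theoretical)
            for m in observed_masses
        )
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    # Hypothetical database and "measured" masses for demonstration only.
    database = {"protein_A": "AKRPGKMR", "protein_B": "GGKAAR"}
    observed = [peptide_mass(p) for p in trypsin_digest("AKRPGKMR")]
    print(trypsin_digest("AKRPGKMR"))
    print(identify(observed, database))
```

Matching the set of measured peptide masses against the theoretical digest of every database entry is the idea behind identifying a protein from a gel spot.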

2D gel electrophoresis

The simultaneous analysis of hundreds or even thousands of proteins

Proteins are spread out across the surface of the gel

Application of proteomics in medicine (disease proteomics)

Design of new drugs

Biomarkers of diseases

Protein expression in diseases

The role of proteins in the pathogenesis of diseases

Using specific protein biomarkers to diagnose disease

Alzheimer disease (amyloid β)

Heart disease (interleukin-6 and 8, serum amyloid A, fibrinogen, troponins)

Renal cell carcinoma (carbonic anhydrase IX)

Information about proteins causing diseases is used for the identification of potential new drugs

Proteome-based plasma biomarkers for AD

Diagnosis of AD

On clinical grounds + post mortem (histology)

There is no reliable diagnostic test

Plasma may offer a rich source of disease biomarkers

Identification of diagnostic biomarkers in the blood by proteomics

Plasma samples of patients and controls were analysed by 2D gel electrophoresis

Spots that differed significantly between the case and control groups were excised and analysed by mass spectrometry
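Deciding whether a spot differs significantly between cases and controls requires a statistical test. The slides do not say which test the study used, so the following is a swapped-in illustration: an exact two-sided permutation test on the difference of group means. The spot intensities are hypothetical values invented for the example.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(cases, controls):
    """Exact two-sided permutation test on the difference of group means."""
    pooled = list(cases) + list(controls)
    n_cases = len(cases)
    observed = abs(mean(cases) - mean(controls))
    extreme = total = 0
    # Try every way of relabelling the pooled samples as cases or controls.
    for idx in combinations(range(len(pooled)), n_cases):
        chosen = set(idx)
        group_a = [pooled[i] for i in idx]
        group_b = [v for i, v in enumerate(pooled) if i not in chosen]
        total += 1
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


if __name__ == "__main__":
    # Hypothetical intensities of one spot in four patients and four controls.
    patients = [5.1, 5.4, 5.8, 6.0]
    controls = [3.9, 4.2, 4.0, 4.4]
    print(permutation_p_value(patients, controls))
```

In a real gel comparison the test would be repeated for every spot, which calls for a multiple-testing correction that this sketch leaves out.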

Biomarkers of diseases

Results

15 spots were significantly different between patients and controls

MS analysis: α2-macroglobulin, complement factor H, …

Virtual ligand screening

The identification of new drugs that target and inactivate the HIV-1 protease

(the protease cleaves a very large HIV protein into smaller, functional proteins; the virus cannot survive without this enzyme, which makes it one of the most effective protein targets for killing HIV)

Summary

Proteomics studies proteins, particularly their structure, function and interaction

The genome has already been analysed; scientists are now interested in the human proteome (millions of proteins)

Key technologies used in proteomics are 2D gel electrophoresis and mass spectrometry

Proteins play a central role in the life of an organism, and their malfunction gives rise to diseases; proteomics is instrumental in the discovery of the pathogenesis of disease, biomarkers and potential therapeutic agents

Questions

1. Definition of a domain (3 aspects), mechanisms of the new protein creation (in general), syntactical and semantic shift (the principle)

2. Increase in protein diversity compared to the genome

3. Identification of the renal carcinoma biomarkers in the plasma

4. Use of computer software for the development of new drugs