Basic structures of proteins. Structural Hierarchy of Protein Primary structure Fold (Scaffold)...

Post on 17-Jan-2016



Basic structures of proteins

Structural Hierarchy of Protein

[Figure: structural hierarchy of a protein. Labels: Primary structure; Secondary structure; Motif; Fold (Scaffold); Functional elements A, B, and C]

• Functional elements: α-helix, strands, β-sheet, loops

- Structure, affinity, activity, specificity, stability, etc.

Structural motifs: super secondary structure

• β hairpin:

- Extremely common.

- Two antiparallel β strands connected by a tight turn of a few amino acids.

• Greek key:

- In ornamental art, a decorative border constructed from a continuous line shaped into a repeated motif.

- In proteins, 4 β strands folded over into a sandwich shape.

• Commonly recurring substructures

• Connectivity between secondary structural elements.

• An individual motif usually consists of only a few elements.

• Motifs alone do not allow us to predict biological function: the same motifs are found in proteins and enzymes with dissimilar functions

- Fold: the arrangement of secondary structure elements in the structure (a total of 1282 folds)

- Loops: irregularly folded segments of polypeptide chain that connect the helices and sheets

- Loops are usually short and exposed to solvent

• Omega loop: a loop in which the residues at the beginning and end of the loop are very close together

• Helix-loop-helix: two α helices connected by a looping stretch of amino acids. This motif is seen in transcription factors.

• Zinc finger: two β strands and an α helix folded over to bind a zinc ion. Important in DNA-binding proteins.

• Helix-turn-helix: two α helices joined by a short strand of amino acids; found in many proteins that regulate gene expression.

• Nest: extremely common. Just three consecutive amino acid residues form an anion-binding concavity.

• Niche: extremely common. Just three consecutive amino acid residues form a cation-binding feature.

- Greek key motif: named after a pattern common in Greek ornamental artwork

Domains

- A part of a protein that can exist independently of the rest of the protein chain

- Functional aspect

Assembly of proteins from building blocks

• Classification by the packing

- α/α (all α)

- β/β (all β)

- α/β: α and β elements are in a mixed order in the sequence

- α + β: α and β elements are segregated in the sequence

Layered sandwich structures, with each layer consisting of either α helices or β sheets

Packing of secondary structure

• Hydrophobic effect: Major driving force for the folding of proteins

- Burying and clustering of hydrophobic side chains to minimize their contacts with water

• Basic requirements for folding:

1. Compact structure and minimization of hydrophobic surface area

2. Buried hydrogen-bonding groups are all paired

- Formation of α helices and β sheets maximizes the pairing of the hydrogen-bonding groups of the backbone

- Packing of α helices and β sheets by stacking their amino acid side chains

• Packing density:

- Protein: ~0.75

- Crystals: 0.70 ~ 0.78

- Close-packed spheres: 0.74

- Infinite cylinders: 0.91
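The two geometric reference values in the packing density list can be checked with a short calculation. This is a minimal sketch, assuming the standard closed-form densities for close-packed equal spheres, π/(3√2), and for hexagonally packed parallel cylinders, π/(2√3); the function names are illustrative and not part of the original slides.

```python
import math


def sphere_close_packing_density() -> float:
    """Fraction of space filled by close-packed equal spheres: pi / (3 * sqrt(2))."""
    return math.pi / (3 * math.sqrt(2))


def cylinder_close_packing_density() -> float:
    """Fraction of space filled by hexagonally packed infinite cylinders: pi / (2 * sqrt(3))."""
    return math.pi / (2 * math.sqrt(3))


if __name__ == "__main__":
    print(f"close-packed spheres: {sphere_close_packing_density():.2f}")
    print(f"infinite cylinders:   {cylinder_close_packing_density():.2f}")
```

Rounded to two decimals these give 0.74 and 0.91, so a protein interior at ~0.75 is packed about as densely as close-packed spheres.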

• Quaternary structure: overall organization of subunits

- Contact interfaces between the subunits are as closely packed as the protein interiors

- Charged and hydrogen-bonding groups on the surface are paired with complementary partners

Definition of the terms

• Homologue: proteins evolved from a common ancestral protein. Their evolutionary relationship is evident from similarities in sequence, structure, and/or function

• Analogue: proteins that are similar in some way, yet show no evidence of common ancestry. Structural analogues share the same fold, and functional analogues perform the same function

• Paralogue: genes that evolved by gene duplication within a genome and have distinct, but usually related, functions

• Orthologue: equivalent genes in different species that evolved from a common ancestor by speciation

Protein evolution and diversity

[Figure: duplication and modification of a progenitor gene. An undifferentiated scaffold (primitive proteins/enzymes) is diversified and fine-tuned into a superfamily of proteins/enzymes, E1 to E5]

• Diversification: drastic gene rearrangement (insertion, deletion, and substitution of gene segments) and point mutations

• Fine-tuning: accumulation of point mutations

- ~50,000 proteins

- Structural analysis: ~1,000 scaffolds

- Use of a limited repertoire of scaffolds for a large diversity of functions

Sequence space and evolution of proteins

[Figure: fitness plotted over sequence space. Regions are labeled same family, different family, same superfamily, different superfamily, and same fold; numbered arrows ① to ⑥ mark the processes listed below]

① Incremental improvement of protein property: specificity, activity, stability, expression

② Divergent evolution within family: substrate/cofactor specificity, enantio-selectivity

③ Divergent evolution within superfamily: α/β hydrolase, enolase, and crotonase superfamilies

④ Divergent evolution within fold: alteration of substrate-binding/catalytic machineries

⑤ Convergent evolution between folds: grafting substrate-binding/catalytic machineries into a different fold

⑥ Directed evolution: find optimum fitness

Divergent and convergent evolution of proteins: serine proteases

[Figure: structures of chymotrypsin, elastase, trypsin, and subtilisin]

• Mammalian serine proteases: common tertiary structure and function

- Superimposable polypeptide backbones

• About 60 % of the amino acids in the interior, but only 10 % of the surface residues, are conserved

• Catalytic triad of residues: Asp-His-Ser

• Different substrate specificity

Catalytic property

• Nucleophile: hydroxyl oxygen of Ser

• Formation of an acyl-enzyme through esterification of the hydroxyl of the reactive serine by the carboxyl portion of the substrate

• Major differences in substrate specificity arise from changes in three loops forming the lining of the binding pocket

- Chymotrypsin: small residues in the binding pocket make it suitable for large hydrophobic side chains of Phe, Tyr, and Trp

- Trypsin: negatively charged aspartate at the bottom of the pocket forms a salt linkage with the positively charged ammonium or guanidinium group of Lys or Arg

- Elastase: bulky Val and Thr at the entry of the pocket prevent the entry of large side chains, so it is suitable for small hydrophobic residues like Ala

• Catalytic triad: Asp-His-Ser
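The binding-pocket preferences listed above can be turned into a toy cleavage-site finder. This is only a sketch under strong assumptions: it looks at a single residue (the one whose side chain sits in the binding pocket), uses just the residues named in the notes, and ignores every other influence on real protease specificity. The names `P1_PREFERENCES` and `cleavage_sites` are hypothetical.

```python
from typing import Dict, List, Set

# Residues (one-letter codes) accepted by each binding pocket, as named in the notes.
P1_PREFERENCES: Dict[str, Set[str]] = {
    "chymotrypsin": set("FYW"),  # large hydrophobic side chains: Phe, Tyr, Trp
    "trypsin": set("KR"),        # positively charged side chains: Lys, Arg
    "elastase": set("A"),        # small hydrophobic side chains such as Ala
}


def cleavage_sites(sequence: str, protease: str) -> List[int]:
    """Return 1-based positions of residues after which the protease may cut.

    Simplified rule: cut after any residue that fits the binding pocket.
    """
    preferred = P1_PREFERENCES[protease]
    # The last residue has no peptide bond after it, so it is skipped.
    return [i + 1 for i, aa in enumerate(sequence[:-1]) if aa in preferred]


if __name__ == "__main__":
    peptide = "GAKFLRWA"  # made-up sequence for illustration
    for name in P1_PREFERENCES:
        print(name, cleavage_sites(peptide, name))
```

For the made-up peptide `GAKFLRWA`, the three enzymes pick different positions even though they share the same catalytic machinery, which is the point of the comparison.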

Catalytic mechanism of serine protease

• Binding site of the enzymes: approximately complementary to the structures of the substrates

• Interactions: non-polar parts of the substrate match up with non-polar side chains of the amino acids

• Hydrogen-bonding sites on the substrates bind to the backbone NH and CO groups of the protein

• The reactive part of the substrate is firmly held by this binding next to acidic, basic, or nucleophilic groups on the enzyme

Structural and mechanistic information provides a strategy and insight for the engineering and design of enzymes

Convergent or divergent evolution ?

• Criteria for evolution from a common ancestor, in descending order of strength:

- DNA sequences coding for the enzymes are similar?

- Amino acid sequences are similar?

- Three-dimensional structures are similar?

- Enzyme-substrate interactions are similar?

- Catalytic mechanisms are similar?

- Segments of polypeptide chain essential for catalysis are in the same sequence?

• Mammalian serine proteases: divergent evolution

• Catalytic mechanism shared with subtilisin: convergent evolution

- Three-dimensional structure is more conserved than primary structure, but function has changed

α/β barrel protein (or TIM barrel): Convergent evolution

- Eight parallel β strands connected by eight helices

- Strands form the staves of the barrel, while the helices are on the outside and also parallel

- Hydrophobic core composed of the side chains of the strands: Val, Ile, Leu

- Catalyze a variety of reactions and have diverse subunit compositions

- No homology with the enzymes that catalyze the same reaction in different organisms

- Active site: formed by the eight loops that follow the carboxyl end of each strand

- Little sequence identity, and active sites use different regions of the loops

Relationship between three-dimensional structure and similarity of sequence

• Dependent on the length of the protein

- The longer the protein, the lower the percent identity that implies similar structure

- The higher the sequence identity, the lower the RMSD between proteins

- For a protein of 85 residues, 25 ~ 30 % sequence identity implies an identical fold
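The two quantities compared above, percent sequence identity and RMSD, can be computed as follows. This is a minimal sketch with assumptions stated loudly: the sequences are already aligned (gaps written as `-`), identity is counted over columns where neither sequence has a gap (other conventions for the denominator exist), and the coordinates are already superimposed, so no optimal superposition is performed. The function names are illustrative.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of gap-free alignment columns in which both residues are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def rmsd(coords_a: Sequence[Point], coords_b: Sequence[Point]) -> float:
    """Root-mean-square deviation between equivalent atoms (assumed pre-superimposed)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    print(percent_identity("ACDE-F", "ACDQGF"))                      # made-up alignment
    print(rmsd([(0, 0, 0), (1, 0, 0)], [(0, 0, 0), (1, 0, 2)]))      # made-up coordinates
```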

Multi-enzyme complexes

• Encoded in a single polypeptide chain or assembled from several subunits

• Involved in sequential steps in a biosynthetic pathway or complex biochemical process

- Tryptophan synthase: tetramer (α2β2)

- Polyketide synthase

- Fatty acid synthase

• Advantages of multi-enzyme complexes

- Enhanced catalytic efficiency: reduced diffusion time of an intermediate

- Substrate channeling:

ex) Tryptophan synthase: the reaction intermediate (indole) is not released, but shuttled directly between the subunits through a 20 ~ 30 Å long channel and directed to the next reaction

- Sequestration of reactive intermediates: protection of chemically unstable intermediates from water

- Easy coordination for regulating the reaction

- Easy coordination of expression during biosynthesis

Polyketide synthases (PKSs)

• Polyketide: a large class of diverse compounds that are characterized by more than two carbonyl groups connected by single intervening carbon atoms

• PKSs: a family of multi-domain enzymes or enzyme complexes that produce polyketides, a large class of secondary metabolites

ex) Antibiotics (tetracycline; macrolides such as erythromycin), anticholesterol drug (lovastatin), immunosuppressant (sirolimus), anticancer drug (epothilone B)

• Share striking similarities with fatty acid biosynthesis

• The PKS genes are usually organized in one operon in bacteria and in gene clusters in eukaryotes

The order of modules and domains of a complete polyketide synthase

• Starting or loading module: AT-ACP

- Starter group, usually acetyl-CoA or malonyl-CoA, is loaded onto the ACP domain of the starter module, catalyzed by the starter module's AT domain

• Elongation or extending modules: KS-AT-[DH-ER-KR]-ACP

- Polyketide chain is handed over from the ACP domain of the previous module to the KS domain of the current module, catalyzed by the KS domain

• Termination or releasing module: TE

- TE (thioesterase) domain hydrolyzes the completed polyketide chain from the ACP domain of the previous module
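The module order described above can be written out as a small data model. This is an illustrative sketch only: it treats the bracketed DH, ER, and KR domains as optional per module, which is how the bracket notation is read here, and the names `elongation_module` and `assemble_pks` are hypothetical helpers, not an established tool.

```python
from typing import List, Sequence

# Domain abbreviations: AT acyltransferase, ACP acyl carrier protein,
# KS ketosynthase, DH dehydratase, ER enoylreductase, KR ketoreductase,
# TE thioesterase.
LOADING_MODULE = ["AT", "ACP"]
OPTIONAL_DOMAINS = ("DH", "ER", "KR")  # bracketed in the notes, assumed optional
RELEASING_MODULE = ["TE"]


def elongation_module(optional: Sequence[str] = ()) -> List[str]:
    """Build one elongation module: KS-AT-[DH-ER-KR]-ACP."""
    unknown = set(optional) - set(OPTIONAL_DOMAINS)
    if unknown:
        raise ValueError(f"unknown optional domains: {sorted(unknown)}")
    # Keep the optional domains in the order given in the notes.
    middle = [d for d in OPTIONAL_DOMAINS if d in optional]
    return ["KS", "AT", *middle, "ACP"]


def assemble_pks(modules: Sequence[Sequence[str]]) -> str:
    """Join loading module, elongation modules, and releasing module in order."""
    parts = ["-".join(LOADING_MODULE)]
    parts += ["-".join(elongation_module(opt)) for opt in modules]
    parts.append("-".join(RELEASING_MODULE))
    return " | ".join(parts)


if __name__ == "__main__":
    # Three made-up elongation modules with different optional domains.
    print(assemble_pks([("KR",), (), ("DH", "ER", "KR")]))
```

Each elongation module in the list corresponds to one extension step of the growing chain, so the length of the input list mirrors the number of extension cycles.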

Flexibility and conformational mobility of proteins

• Proteins are flexible even though globular proteins are closely packed

• Undergo conformational changes on binding ligands or substrates

• Conformational changes play an important role in a certain class of enzymes (allosteric) for modulating activity

- Allosteric effectors alter the shape of the protein: hemoglobin

• Equilibrium among two or more conformations of the protein in solution

Modes of motion and flexibility of proteins

• Molecular tumbling

- Globular proteins rotate in solution at frequencies close to those calculated for a rigid sphere

- Rotational correlation time (ϕ): time taken to rotate through a defined angle

- Reciprocal of the rate constant for the randomization of the orientation of the molecule by Brownian motion

- For a rigid sphere, ϕ = Vη/kT

V: molecular volume, η: viscosity of the medium, k: Boltzmann's constant, T: absolute temperature

- Approximately, ϕ = Mr/2000 ns, where Mr is the molecular mass of a globular protein

ex) Chymotrypsin (Mr = 25,000): ϕ ≈ 12 ns
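Both expressions for the rotational correlation time are easy to evaluate numerically. The sketch below implements the rigid-sphere formula in SI units and the Mr/2000 rule of thumb; the function names are illustrative, and the volume, viscosity, and temperature in the usage example are made-up inputs, not values from the slides.

```python
BOLTZMANN = 1.380649e-23  # Boltzmann's constant k, in J/K


def correlation_time_rigid_sphere(volume_m3: float, viscosity_pa_s: float,
                                  temperature_k: float) -> float:
    """Rigid-sphere rotational correlation time, phi = V * eta / (k * T), in seconds."""
    return volume_m3 * viscosity_pa_s / (BOLTZMANN * temperature_k)


def correlation_time_rule_of_thumb_ns(mr: float) -> float:
    """Approximate correlation time for a globular protein, phi = Mr / 2000, in ns."""
    return mr / 2000.0


if __name__ == "__main__":
    # Chymotrypsin example from the notes (Mr = 25,000).
    print(correlation_time_rule_of_thumb_ns(25_000), "ns")
    # Illustrative inputs only: a 4.0e-26 m^3 sphere in a water-like medium at 300 K.
    print(correlation_time_rigid_sphere(4.0e-26, 1.0e-3, 300.0), "s")
```

For chymotrypsin the rule of thumb gives 12.5 ns, which the notes quote as about 12 ns.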

Allostery: a phenomenon in which binding of a substrate, product, or other effector to a subunit of a multi-subunit enzyme at a site (the allosteric site) other than the functional site alters its conformation and functional properties.

Rotation of side chains

• NMR: the most powerful technique for studying the mobility of individual amino acids

• Measurement of the rotational freedom of the aromatic side chains of Tyr and Phe about the Cβ-Cγ bond: 1H NMR

- Detects whether or not the aromatic ring is constrained in an anisotropic environment

- Slow rotation: 1 ~ 10 /s

- Fast rotation: 10^4 ~ 10^5 /s

• Surface amino acids are more mobile than interior ones, showing no unique conformation

Domain movement: hinge motion and segmental flexibility

• Larger-scale movement in proteins with low energy barriers

• Hinge motion:

- Two elements of structure move between open and closed conformations as if connected by a hinge

- MBP, Abl protein kinase (N-lobe and C-lobe)

- Detection: time-resolved fluorescence polarization spectroscopy, NMR

Protein mobility in solution

• Incorporation of 15N into the protein, followed by analysis of the relaxation of the 15N-NMR signals

• The term 'relaxation' describes how signals change with time.

- Signals deteriorate with time, becoming weaker and broader

- The deterioration reflects the fact that the NMR signal arises from the over-population of an excited state; relaxation rates report on fluctuations in backbone structure